INSURANCE · PRICING & RATING

The factor predicts risk. It may also be a proxy for something you cannot price on.

A pricing model finds the factors that predict loss. The hazard is that the most predictive factors are often proxies for characteristics an insurer is not permitted to discriminate on — and a model optimised purely for accuracy will reach for them unless something stops it and records that it did.

Pricing is where the actuarial function meets the model most directly, and where the gain from machine learning is largest. A modern rating model can read hundreds of variables and find combinations a traditional generalised linear model would miss, pricing risk more finely and defending the loss ratio. A European supervisory survey found half of non-life carriers already running such models in production. The accuracy is real, and in a competitive market the insurer that prices risk better wins the good risks and sheds the bad ones. The problem is what the model learns to use to do it.

The most predictive variables are frequently proxies for protected characteristics. Postcode stands in for ethnicity and for deprivation; occupation and shopping patterns stand in for income and sometimes for gender; a dense feature set can reconstruct a protected attribute the insurer deliberately excluded, and price on it indirectly. A model optimised purely for predictive accuracy has every incentive to use the proxy, because the proxy improves the fit. Unfair discrimination by proxy is the central pricing risk of machine learning in insurance, and it is invisible unless someone is testing for it.
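One rough way to see how strongly a single permitted variable stands in for a protected one is to ask how much better the protected attribute can be guessed once the variable is known. The sketch below is illustrative only: the function name, the toy postcodes and the group labels are all invented for the example. It is scored in-sample, so it overstates proxy strength on small data, and a dense feature set would need a held-out classifier instead.

```python
from collections import Counter, defaultdict


def proxy_strength(feature_values, protected_values):
    """Return (baseline_accuracy, proxy_accuracy) for guessing the protected
    attribute without and with knowledge of the feature."""
    if len(feature_values) != len(protected_values) or not feature_values:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(protected_values)

    # Baseline: always guess the most common protected group overall.
    baseline = Counter(protected_values).most_common(1)[0][1] / n

    # Proxy: guess the most common protected group within each feature value.
    by_value = defaultdict(Counter)
    for feature, group in zip(feature_values, protected_values):
        by_value[feature][group] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return baseline, correct / n


# Toy, made-up data: postcode "A" is all group x, "B" is all group y.
postcodes = ["A", "A", "A", "B", "B", "B", "C", "C"]
groups = ["x", "x", "x", "y", "y", "y", "x", "y"]

baseline, with_postcode = proxy_strength(postcodes, groups)
print(f"baseline={baseline:.3f} with_postcode={with_postcode:.3f}")
```

A large gap between the two numbers suggests the variable carries much of the protected attribute's information. It does not on its own show that the rate is unfair, only that the variable deserves testing.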

The regulators have named this directly. South Africa's Treating Customers Fairly framework requires fair outcomes in product design and pricing. The European Union's artificial-intelligence regime classes life and health pricing as high-risk and requires bias testing against protected classes, technical documentation, and a per-decision record, backed by penalties reaching tens of millions of euros — and although it stops short of pricing in property and casualty, the principle is unambiguous. The standard the actuary now has to meet is not that the model is accurate; it is that the model is accurate and demonstrably not discriminatory, with the testing on the record.

The method that answers this is itself a data problem. To prove a model is not discriminating by proxy, the actuary has to test its outputs against the very protected attributes — gender, ethnicity, disability — that the model is forbidden to price on. The attributes have to be excluded from training but retained for audit, which is a precise and awkward requirement: the insurer must hold the protected data to prove it is not using it, while ensuring it never enters the rate. Get that architecture wrong and the insurer either cannot prove fairness or creates a data-protection exposure in the act of trying.
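The separation described above can be sketched in a few lines, under loud assumptions: the permitted factor names, the toy rating formula and the record layout are hypothetical, not any real rating plan or product API. The rating path sees only permitted factors, while protected attributes go to a separate audit record that is joined back to premiums only for testing.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical allowlist of factors the rate may be built on.
PERMITTED_FACTORS = {"vehicle_age", "annual_km", "claims_5y"}


@dataclass(frozen=True)
class AuditRecord:
    policy_id: str
    protected: dict  # e.g. {"gender": "f"}; held for audit only


def rating_view(raw: dict) -> dict:
    """Return only the permitted factors; everything else is dropped."""
    missing = PERMITTED_FACTORS - set(raw)
    if missing:
        raise KeyError(f"missing permitted factors: {sorted(missing)}")
    return {name: raw[name] for name in PERMITTED_FACTORS}


def toy_rate(features: dict) -> float:
    """Made-up rating formula, standing in for the real model."""
    return (300 + 10 * features["vehicle_age"]
            + features["annual_km"] / 100
            + 150 * features["claims_5y"])


def premium_by_group(premiums: dict, audit: list, attribute: str) -> dict:
    """Audit-side test: mean premium per protected group."""
    grouped = {}
    for record in audit:
        grouped.setdefault(record.protected[attribute], []).append(
            premiums[record.policy_id])
    return {group: mean(values) for group, values in grouped.items()}


applications = [
    {"policy_id": "p1", "vehicle_age": 2, "annual_km": 10000, "claims_5y": 0, "gender": "f"},
    {"policy_id": "p2", "vehicle_age": 8, "annual_km": 20000, "claims_5y": 1, "gender": "m"},
    {"policy_id": "p3", "vehicle_age": 2, "annual_km": 10000, "claims_5y": 0, "gender": "m"},
]

# Rating path: protected attributes never reach toy_rate.
premiums = {a["policy_id"]: toy_rate(rating_view(a)) for a in applications}
# Audit path: protected attributes are stored apart from the rating features.
audit = [AuditRecord(a["policy_id"], {"gender": a["gender"]}) for a in applications]

by_gender = premium_by_group(premiums, audit, "gender")
print(by_gender)
```

A gap in mean premium between groups is a prompt for investigation rather than proof of discrimination; a fuller test would compare groups at similar levels of underlying risk.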

The operational reality is that most pricing models cannot show their working at this level. They produce a rate, and the actuary can describe the model's structure, but the per-decision basis — which factors drove this premium, and whether the combination amounts to a proxy for a protected characteristic — is not preserved in a form a regulator could audit. When a pattern of pricing is challenged as discriminatory, the insurer is left arguing about the model in the abstract rather than showing, decision by decision, that it priced on permitted factors and tested for the rest.
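What a per-decision basis might look like can be shown with a minimal record, assuming the model's output can be split into additive factor contributions (not every model allows this without an attribution method). The class, the field names and the bias-test identifier are hypothetical; the digest is one simple way to make later alteration of a stored record detectable.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PricingDecision:
    policy_id: str
    model_version: str
    base_rate: float
    contributions: dict  # factor name -> amount added to the premium
    bias_test_id: str    # hypothetical reference to the bias-test run

    @property
    def premium(self) -> float:
        """Premium implied by the recorded base rate and contributions."""
        return self.base_rate + sum(self.contributions.values())

    def digest(self) -> str:
        """SHA-256 over a canonical JSON form of the record."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


decision = PricingDecision(
    policy_id="p2",
    model_version="toy-model-1",
    base_rate=300.0,
    contributions={"vehicle_age": 80.0, "annual_km": 200.0, "claims_5y": 150.0},
    bias_test_id="bias-run-001",
)

print(decision.premium)
print(decision.digest())
```

Storing the digest alongside the record lets an auditor recompute it and confirm that the factors shown for a premium are the ones recorded when the rate was set.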

The African context gives the proxy problem unusually sharp edges. Where income, ethnicity, and location are strongly correlated, as they are across much of the continent's urban geography, a postcode or estate variable is a far more powerful proxy for a protected characteristic than it would be in a more mixed market, and a usage-based or telematics model can reconstruct sensitive attributes from behaviour the applicant never disclosed. The competitive dynamic then bites: the insurer that refuses the proxy prices less finely than one that quietly uses it, so a fair pricing floor is something the regulator has to hold for the whole market. When the regulator does hold it, the insurer that can already prove its own fairness is the one positioned to gain.

To prove the model is not pricing on a protected characteristic, you have to hold that characteristic — and never let it touch the rate.

HOW THE THREE PRODUCTS HANDLE THIS

Where each sits.

AKKI

Akki governs which factors enter the pricing model and logs them, so the actuary can state exactly what the rate was built on and reproduce it. The separation between the permitted factors that set the price and the protected attributes held only for audit is enforced and recorded by the platform rather than promised in a model-governance policy.

SOLVA

Solva structures the pricing reasoning and refuses to rest a rate on a factor it cannot justify, surfacing where a predictive variable is acting as a proxy for a protected characteristic rather than letting accuracy override fairness silently. Underneath each rate sits the basis and the bias testing — which factors drove the premium, and the evidence that the combination was tested against protected classes. This is the per-decision record the high-risk regimes demand and the fair-treatment regimes imply.

SYNISENSE

This is SyniSense's strongest home in insurance. It keeps protected attributes inside the perimeter, available for bias testing and audit, while ensuring they never enter the model that sets the rate, which resolves the precise architectural bind the fairness requirement creates. The model reasons on permitted factors; the protected data is held separately, inside the perimeter, solely to prove that the model did so. The insurer can demonstrate non-discrimination without ever pricing on, or exposing, the attributes it is testing against.

WHAT CHANGES

For the pricing actuary, fairness becomes demonstrable rather than asserted. The model prices on permitted factors, the protected attributes are held separately for testing, and the bias testing is on the record per decision — so a discrimination challenge is met with evidence rather than an argument about the model in the abstract.

For the data protection officer, the awkward requirement to hold protected data in order to prove it is not used has a clean architecture. The attributes are kept inside the perimeter for audit only and never enter the rate, which is both the fairness control and the data-protection control at once.

For the policyholder, the price rests on factors that genuinely relate to risk rather than on a proxy for who they are. The fair-outcome standard the conduct regimes demand is built into how the rate is set, not bolted on after a complaint.

For the board and the regulator, pricing stops being an unexamined discrimination risk. The insurer can show, across the book, that its rates are accurate and tested for unfair discrimination — the posture the high-risk regimes require and the direction every conduct regulator is travelling. And as those regimes arrive, the carrier that already prices on permitted factors and can prove it adapts without re-engineering its book, while competitors that leaned on proxies must unwind them under supervision and in public.
