How the EU-hosted models compare.
An indicative leaderboard of every model in the Pryvan catalogue, with a composite score across seven capability dimensions. Every model is hosted inside the EU and ranked so you can choose with eyes open.
Composite · 0–100
General knowledge
MMLU: Broad factual and academic knowledge across domains.
Reasoning
GPQA: Multi-step logical and graduate-level problem solving.
Coding
HumanEval / LiveCodeBench: Code generation, review and agentic software tasks.
Math
MATH: Arithmetic, algebra and competition-level mathematics.
Multilingual
EU-language eval: Quality across European and other languages.
Long context
Long-context recall: Recall and reasoning over very large inputs.
Efficiency
Cost / performance: Quality per unit of compute, cost-effectiveness.
Scores run from 0 to 100 and are composited from public benchmarks (MMLU, GPQA, HumanEval, MATH and others) plus a qualitative read on EU-language and long-context behaviour. They are indicative, not a guarantee of performance on your data, and they shift as new model versions ship. Use them to shortlist, then test on your own workload.
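To make the idea of a composite concrete, here is a minimal sketch of one way per-dimension scores could be combined and used to shortlist. It is an illustration only: the exact weighting behind the leaderboard is not published here, so the equal-weight default, the function names `composite` and `shortlist`, the dimension keys, and the two placeholder models with their numbers are all assumptions, not Pryvan data.

```python
# Hypothetical sketch: weighted-mean composite over the seven dimensions.
# Dimension keys, weights and all scores below are illustrative assumptions.

DIMENSIONS = (
    "general_knowledge", "reasoning", "coding", "math",
    "multilingual", "long_context", "efficiency",
)


def composite(scores, weights=None):
    """Return the weighted mean of per-dimension scores, clamped to 0-100."""
    if weights is None:
        weights = {d: 1.0 for d in DIMENSIONS}  # assumed default: equal weights
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimensions: {missing}")
    total_weight = sum(weights[d] for d in DIMENSIONS)
    value = sum(scores[d] * weights[d] for d in DIMENSIONS) / total_weight
    return round(max(0.0, min(100.0, value)), 1)


def shortlist(catalogue, top_n=3, weights=None):
    """Return the names of the top_n models, ranked by composite score."""
    ranked = sorted(
        catalogue.items(),
        key=lambda item: composite(item[1], weights),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_n]]


# Placeholder models with made-up scores, for illustration only.
catalogue = {
    "model-a": {d: 80.0 for d in DIMENSIONS},
    "model-b": {**{d: 70.0 for d in DIMENSIONS}, "coding": 90.0},
}

# A coding-heavy workload might weight the coding dimension more strongly.
coding_weights = {d: 1.0 for d in DIMENSIONS}
coding_weights["coding"] = 10.0

print(shortlist(catalogue, top_n=1))                          # equal weights
print(shortlist(catalogue, top_n=1, weights=coding_weights))  # coding-heavy
```

Changing the weights can change the ranking, which is why a shortlist from any leaderboard should be followed by a test on your own workload.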
Bring AI into your business, without giving up your data.
Join the waitlist. We're onboarding GDPR-sensitive SMEs across Europe.