Question 1

What does eval-driven AI actually mean?

Accepted Answer

Before AssurePath writes a prompt, we write the test suite: fifty to five hundred real cases drawn from your data, scored by your experts, and weighted by what actually matters. The suite runs on every commit during the build, and again on every new frontier model release after go-live. We don't deploy a version that regresses against it - the model that's good enough in week three still has to be good enough next year, when a frontier model upgrade lands.
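As a rough illustration of the idea, here is a minimal sketch of a weighted eval suite with a regression gate. It is not AssurePath's actual harness: the names `EvalCase`, `weighted_score` and `passes_gate` are hypothetical, and exact string matching stands in for expert scoring purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class EvalCase:
    """One real-world case: an input, the expert-approved answer, and a weight."""
    prompt: str
    expected: str
    weight: float = 1.0  # higher weight = matters more to the business


def weighted_score(cases: Sequence[EvalCase], predict: Callable[[str], str]) -> float:
    """Return the weighted fraction of cases the system gets right (0.0 to 1.0).

    Assumption: a case passes on a case-insensitive exact match. A real suite
    would use expert rubrics or graded scoring instead.
    """
    total = sum(case.weight for case in cases)
    if total <= 0:
        raise ValueError("eval suite needs at least one case with positive weight")
    passed = sum(
        case.weight
        for case in cases
        if predict(case.prompt).strip().lower() == case.expected.strip().lower()
    )
    return passed / total


def passes_gate(candidate: float, baseline: float, tolerance: float = 0.0) -> bool:
    """Block deployment when the candidate scores below the current baseline."""
    return candidate >= baseline - tolerance


# Hypothetical cases; the heavier one dominates the score.
cases = [
    EvalCase("Is clause 4 a termination clause?", "yes", weight=3.0),
    EvalCase("Is clause 9 an indemnity clause?", "no", weight=1.0),
]


def always_yes(prompt: str) -> str:
    """Stand-in for the system under test."""
    return "yes"


score = weighted_score(cases, always_yes)
```

In a CI job, the same `passes_gate` check could run on every commit and again whenever the underlying model version changes, failing the pipeline on a regression.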

Question 2

Will our data go to OpenAI, Anthropic or the wider internet?

Accepted Answer

Only if you say so, and only to the frontier endpoints you choose. AssurePath defaults to dedicated zero-retention endpoints from Anthropic, OpenAI or Google for regulated work, a private VPC deployment for high-stakes engagements, and on-prem open-weights models such as Llama 4 or Mistral for true data-residency constraints. Training data never leaves your perimeter. We sign DPAs and set data retention to zero.

Question 3

How much does an AssurePath AI build cost?

Accepted Answer

It depends on the shape, the data, and the integration surface - every build is bespoke, scoped per business, and quoted as a fixed fee after a free 30-minute scoping call. The productised 6-week AI Pilot is a separate, fixed-price offer on its own page. If the maths doesn't work for what you have in mind, we'll tell you in the scoping call.

Question 4

How do you stop the AI hallucinating?

Accepted Answer

Three things. First, retrieval grounding: the AI cites the paragraph it pulled its answer from. Second, a measurable quality bar that specifically tests for refusing when uncertain and citing when confident. Third, human-in-the-loop checkpoints on any action that changes data. Hallucination is a system-design problem, not a model problem, and the quality bar is what catches it before users do.
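To make "refusing when uncertain, citing when confident" concrete, here is a minimal sketch of how a single eval check for it might look. The `Answer` and `GroundingCase` types and the `check_grounding` function are hypothetical names for illustration, not part of any real library or of AssurePath's tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Answer:
    """What the system returned for one question."""
    text: str
    citation: Optional[str]  # id of the paragraph the answer was drawn from
    refused: bool            # True when the system declined to answer


@dataclass(frozen=True)
class GroundingCase:
    """One eval case: is the question answerable from the documents, and where?"""
    question: str
    answerable: bool
    source_paragraph: Optional[str]  # expected citation when answerable


def check_grounding(case: GroundingCase, answer: Answer) -> bool:
    """Pass only if the system refuses unanswerable questions and
    cites the expected paragraph for answerable ones."""
    if not case.answerable:
        return answer.refused
    return (not answer.refused) and answer.citation == case.source_paragraph


# Hypothetical cases and answers.
answerable = GroundingCase("What is the notice period?", True, "para-12")
unanswerable = GroundingCase("Who signed the 1987 lease?", False, None)

good_cited = Answer("30 days", citation="para-12", refused=False)
uncited = Answer("30 days", citation=None, refused=False)
refusal = Answer("I can't find that in the documents.", citation=None, refused=True)
made_up = Answer("J. Smith", citation=None, refused=False)
```

A confident answer without a citation fails just like an invented one, which is the point: the bar penalises ungrounded output regardless of whether it happens to be right.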

Question 5

Will we be tied to Claude or GPT-5.5 after an AssurePath AI build?

Accepted Answer

No. Every build is model-agnostic by default. Prompts, retrieval and tool calls live behind a thin abstraction. Swapping Claude Opus 4.7 for GPT-5.5, Gemini 3.1 Pro or an on-prem Llama 4 later is a config change, not a rewrite. We re-run the quality bar, you see the trade-offs, you pick.
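One way to picture the "thin abstraction" is a small interface plus a registry keyed by a config value. The sketch below is an assumption about the general pattern, not AssurePath's code: `ChatModel`, `EchoModel`, `REGISTRY` and the config key `"model"` are all hypothetical, and the stub adapter stands in for real vendor SDK calls.

```python
from typing import Callable, Dict, Protocol


class ChatModel(Protocol):
    """The only surface the rest of the application is allowed to depend on."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stub adapter. A real adapter would wrap one vendor SDK or a local model."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


# Hypothetical registry: one factory per backend, selected by a config string.
REGISTRY: Dict[str, Callable[[], ChatModel]] = {
    "vendor-a": lambda: EchoModel("vendor-a"),
    "local-open-weights": lambda: EchoModel("local-open-weights"),
}


def load_model(config: Dict[str, str]) -> ChatModel:
    """Build the model named in the config, or fail with a clear error."""
    key = config["model"]
    if key not in REGISTRY:
        raise ValueError(f"unknown model {key!r}; known: {sorted(REGISTRY)}")
    return REGISTRY[key]()


def answer(config: Dict[str, str], prompt: str) -> str:
    """Application code: never imports a vendor SDK directly."""
    return load_model(config).complete(prompt)


# Swapping backends changes only the config value, not the calling code.
first = answer({"model": "vendor-a"}, "Summarise clause 4.")
second = answer({"model": "local-open-weights"}, "Summarise clause 4.")
```

Because every backend satisfies the same interface, the quality bar can be re-run against each candidate before anyone commits to a switch.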

Question 6

Do you work with FCA-regulated firms and law firms?

Accepted Answer

Yes. For FCA work AssurePath defaults to dedicated zero-retention frontier endpoints or a private VPC deployment, with full audit logging and a signed DPA. For law firms with strict data-residency or privilege concerns, we deploy on-prem open-weights models. We hold Cyber Essentials Plus, run ISO 27001-aligned controls, and keep an evidence trail per case.

Question 7

Can you train a custom model on our data?

Accepted Answer

Yes - and we usually push back on it. Fine-tuning is the right answer in about one engagement in twenty. The other nineteen are better solved with prompt engineering, retrieval and structured outputs. AssurePath will tell you which yours is in the scoping call. Fine-tuning, when needed, runs through frontier-vendor APIs or on open-weights models hosted in your environment.

Question 8

What if we already tried AI and it didn't work?

Accepted Answer

Common. Usually the issue is one of three things: no measurable quality bar, so 'good' is a vibe; AI bolted onto a process that should have been redesigned first; or hosted-vendor lock-in that meant nobody could iterate. AssurePath is happy to do a short paid review of an existing AI build and either fix it or recommend starting over.

Eval-driven AI.
Not vibes-driven.

A quality bar,
before a single prompt is shipped.

We don't sell AI dreams.

Three AI builds,
live in production.

Contract triage agent

AML doc-AI pipeline

CV-to-JD ranking + outreach

Four shapes of practical AI.

Fixed scope. Fixed fee.
Per workflow.

Eval-first. Then build.

What's in the box.

Less talking. More shipping.

From four FTEs reviewing AML files by hand - to one reviewer, 12 minutes per file.

Sectors with the playbooks.

Where AI lives.

Questions we answer in week one.