Case study: agentic AI for BSA/AML and SR 26-2 readiness at a top-20 U.S. bank

Virtova case study · U.S. banking · AI strategy, MRM under SR 26-2, and financial-crime platform deployment


A top-20 U.S. national bank converted a year of AI strategy, AI governance under SR 26-2, and an agentic regulatory-documentation factory into the production deployment of AutoCEN — the Frontier Foundry AI-native financial-crime platform — alongside a portfolio of targeted process-automation pilots. The work moved BSA/AML, OFAC sanctions, and fraud compliance from a cost center stuck in queue management to a deterministic, examiner-readable, AI-native operating discipline.

A top-20 U.S. national bank entered the engagement with a familiar problem in an unfamiliar shape. Hundreds of thousands of customer and transaction records were flagged across BSA/AML monitoring, OFAC sanctions screening, and fraud queues. Most of those flags were not real risk findings. They were artifacts of dirty data inherited from a series of acquisitions, brittle data-science platform decisions, and core-system migrations that had never reconciled cleanly. Investigator queues were drowning. Per-alert remediation cost was running well into the double digits. SAR drafting took analysts hours per filing. False-positive rates on sanctions screening sat above 5%, which is industry-typical and also the wrong number to be near. New supervisory expectations under SR 26-2, the FinCEN CDD Rule, NY DFS Part 500, the EU AI Act, the Colorado AI Act, ISO/IEC 42001, and NIST AI RMF were tightening the perimeter every quarter.

Manual effort

−60%+

Compliance workload across BSA/AML, OFAC, and fraud functions.

False positives

−72%

Across the alert queues that previously consumed investigator capacity.

SAR drafting

−97%

Drafting time: hours per filing collapsed to under sixty seconds.

Alert review SLA

98.2%

Attainment, examination-grade across BSA/AML, OFAC, SOX, and FFIEC.

Production outcomes measured inside the first fiscal year of AutoCEN deployment. AutoCEN reference performance per Frontier Foundry's published platform metrics.

The engagement opened as a strategic AI assessment for the Office of the CEO and Enterprise Architecture and converged, twelve months later, on a production deployment of AutoCEN, the Frontier Foundry AI-native financial-crime intelligence platform, plus a parallel slate of targeted AI process-automation pilots that returned measurable, board-reportable value inside a single fiscal year. Virtova led the strategy, governance, and implementation arc; Frontier Foundry’s AutoCEN platform was the financial-crime spine. Both entities share ownership; the relationship is disclosed in writing on every contract.

This case study documents how the bank got from a queue management problem to a deterministic, examiner-readable AI operating discipline.

The reframing that unlocked the program

Most BSA/AML and fraud transformation programs at large U.S. banks follow a familiar script. Hire more analysts. Buy a Tier-1 case-management platform. Re-tune the rules engine. Repeat in eighteen months. The unit economics never change, the queue never empties, and the next examination cycle finds the same gaps the last one did.

The engagement reframed the problem. At a top-20 national bank with multiple core systems stitched together by acquisition, the dominant cost driver is not analyst speed. It is that a very large fraction of every alert queue is downstream of bad data and manual process drag. Records are flagged because a name field is malformed, an address is stale, a beneficial-ownership chain is incomplete, or the same customer exists three times across three legacy cores. SARs take hours to draft because the underlying narrative-generation process is artisanal. Sanctions hits run 95% false positive because the matching logic was tuned for a 1990s threat surface and never seriously rebuilt.

The first KPI in BSA/AML transformation is not alerts cleared per analyst per day. It is alerts that should never have been generated in the first place, and process steps that should never have required a human at all.

That single reframe — stop generating fake work, then automate the rest — is what made the rest of the program tractable. It is also the part most consultancies will not say out loud to a chief compliance officer, because the implication is that the existing operating model is the problem and not the staffing math.

12-month engagement arc

Phase 0

AI strategy + the two frameworks

Office-of-the-CEO assessment. AI Rubric and Hybridization Strategy fall out as the program's standing references.

Phase 1

Governance + MRM factory

Vendor MRM under SR 26-2. Agentic documentation orchestrator and QC validator make the bank's MRM packets reproducible.

Phase 2

AutoCEN deployment

Production deployment of the Frontier Foundry AI-native financial-crime platform across BSA/AML, OFAC, and fraud.

Phase 2b

Targeted pilot portfolio

Five narrow-scope AI pilots scored against the AI Rubric, governed under the Phase 1 framework.

Phase 0: AI strategy and the two frameworks the program rests on

The engagement opened with a multi-month AI strategy and assessment package for executive stakeholders. The deliverables included an Executive Strategic Context paper, stakeholder dossiers, an executive-interview questionnaire customized to the bank’s lines of business, an Implementation Recommendations playbook, an AI Strategy and Data Strategy deck for Enterprise Architecture and the SVP-Data function, and an AI Working Group operating cadence with an executive-edited AI Guidelines deck.

Two pieces of original IP came out of that work and the rest of the program rests on them.

The AI Rubric. A scoring framework for evaluating candidate AI use cases inside a regulated bank, across regulatory exposure, data readiness, model-class fit, vendor concentration, and value-at-stake. The Rubric is what the bank used to triage a sprawling backlog of AI ideas down to a defensible portfolio of fundable pilots. Every pilot in Phase 2b was scored against it before scope, budget, or vendor selection.

The Hybridization Strategy. A methodology for combining traditional ML, deep learning, generative AI, and agentic AI inside a single regulated environment with deterministic auditability. The principle is plain-spoken and intentional. No black-box models. Every output examinable. Every decision traceable to a regulator-readable artifact. That principle is what makes the platform passable in front of OCC and FRB exams, and it is what most generative-AI vendors quietly cannot offer when the conversation moves from a sales deck to a model-risk committee.

The AI Rubric and the Hybridization Strategy became the bank’s standing reference for any AI investment decision. Three quarters in, both frameworks were embedded in the AI Working Group’s intake process. The bank now applies them to vendor pitches without external help.
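A rubric of this kind can be pictured as a weighted score over the five dimensions named above. The sketch below is illustrative only: the dimension names come from the Rubric description, while the 1-to-5 scale, the equal weights, the funding threshold, and every identifier are assumptions rather than the engagement's actual scoring rules.

```python
from dataclasses import dataclass

# The five dimensions are taken from the Rubric description. The 1-5 scale,
# the weights, and the threshold are hypothetical values for illustration.
DIMENSIONS = (
    "regulatory_exposure",
    "data_readiness",
    "model_class_fit",
    "vendor_concentration",
    "value_at_stake",
)
WEIGHTS = {name: 0.2 for name in DIMENSIONS}  # assumed equal weighting
FUND_THRESHOLD = 3.5                          # assumed cut-off on the 1-5 scale


@dataclass(frozen=True)
class UseCase:
    name: str
    scores: dict  # dimension -> score, 1 (unfavorable) to 5 (favorable)


def rubric_score(use_case: UseCase) -> float:
    """Weighted average of the five dimension scores."""
    missing = set(DIMENSIONS) - set(use_case.scores)
    if missing:
        raise ValueError(f"unscored dimensions: {sorted(missing)}")
    for dim in DIMENSIONS:
        if not 1 <= use_case.scores[dim] <= 5:
            raise ValueError(f"{dim} must be scored between 1 and 5")
    return sum(WEIGHTS[dim] * use_case.scores[dim] for dim in DIMENSIONS)


def triage(backlog):
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(uc.name, round(rubric_score(uc), 2)) for uc in backlog]
    fundable = [pair for pair in scored if pair[1] >= FUND_THRESHOLD]
    return sorted(fundable, key=lambda pair: pair[1], reverse=True)
```

Refusing to score a use case with a missing dimension mirrors the point made above: no pilot moves to scope, budget, or vendor selection until every dimension has an answer.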

Phase 1: The MRM documentation factory

Before any pilot could ship, the bank needed a governance spine that would survive examiner scrutiny under the SR 11-7 / SR 21-8 / SR 26-2 stack, the FFIEC IT Handbook, and the OCC’s third-party risk guidance. The April 17, 2026 SR 26-2 update supersedes SR 11-7 and SR 21-8 for in-scope models and explicitly leaves generative and agentic AI outside formal scope (footnote 3), while telling banks to apply the SR 26-2 principles to those systems anyway. The bank’s existing MRM program had been built against SR 11-7 alone, with no parallel discipline for the generative and agentic systems that had quietly entered production through vendor channels.

The Phase 1 deliverable was a working operating discipline plus the documentation infrastructure to support it. The artifact set included an AI Model Risk Management Vendor Requirements document mapping every vendor obligation against SR 11-7, SR 21-8, SR 26-2, BSA/AML, USA PATRIOT Act, GLBA, NY DFS Part 500, the Colorado AI Act, the EU AI Act, ECOA/Reg B, FCRA, DORA, OFAC, the OWASP LLM Top 10, NIST AI RMF, ISO/IEC 42001, ISO 27001, SOC 2 Type II, and NIST 800-88. It also included a Standard Operating Procedure for AI Vendor MRM, AI/ML and data-vendor due-diligence questionnaires, a high-priority vendor doc-request workbook, a Generative AI Contract Analysis workbook for legal and procurement, an AI Use Case Omnibus Tracker covering every candidate use case across business units, and an AI 2.0 Engineering Incubation Lab reference architecture spanning the bank’s cloud and legacy on-prem datacenter.

The piece most banking consultancies cannot deliver, and the piece that pays for the rest, is what came next: a working agentic regulatory-documentation factory built on top of all of the above.

A typical large bank carries 200 to 500 models in inventory. Each one needs an MRM response packet that satisfies multiple regulators, multiple internal committees, and multiple lines of defense. Producing those packets by hand is the reason banks staff entire MRM departments full of people who would rather be doing actual quantitative work. The factory replaced the artisanal version with three coordinated artifacts.

MRM Vendor Response Template. A machine-fillable, regulator-ready template with twelve core sections (Executive Summary, Governance, Model Development, Validation and Testing, Explainability, Bias, Data Governance, Monitoring and Drift, Cybersecurity, Third-Party Risk, Regulatory Compliance, and Contract and Exit), bracketed by Document Control and Sign-off, plus appendices.

MRM Agentic Orchestrator. An agentic CLI orchestration prompt that walks a YAML-encoded model inventory, merges per-model definitions with shared org-wide YAMLs (governance, security, data privacy, subcontractors, certifications, policies, organizational structure), applies conditional logic (generative versus traditional ML; consumer-facing versus internal; high-risk versus low-risk), and emits one fully populated regulator-grade response document per model.

MRM QC Validator. A second agentic pass that scores each generated response across structural validation, regulatory sufficiency mapped regulator-by-regulator, content quality, plain-language Board-summary readability, internal consistency, and formatting. Findings classify as Critical, High, Medium, or Low. Each batch produces a submission-readiness report before a human ever opens a document.
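The merge-then-branch step the Orchestrator performs can be sketched in a few lines. This is a minimal illustration, not the engagement's orchestration prompt: plain dictionaries stand in for the parsed YAML files, and the field names (`model_class`, `consumer_facing`, `risk_tier`, `model_id`) and the two extra section names are hypothetical.

```python
from copy import deepcopy


def deep_merge(shared: dict, override: dict) -> dict:
    """Recursively merge `override` into a copy of `shared`; override wins."""
    merged = deepcopy(shared)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def select_sections(model: dict) -> list:
    """Apply the three conditionals named above to choose template sections."""
    sections = ["Governance", "Validation and Testing", "Monitoring and Drift"]
    if model.get("model_class") == "generative":
        sections.append("Generative AI controls")  # hypothetical section name
    if model.get("consumer_facing"):
        sections.append("Bias")
    if model.get("risk_tier") == "high":
        sections.append("Independent validation evidence")  # hypothetical
    return sections


def build_packets(shared: dict, inventory: list) -> dict:
    """Produce one response-packet context per model in the inventory."""
    packets = {}
    for entry in inventory:
        model = deep_merge(shared, entry)  # org-wide defaults + per-model data
        packets[model["model_id"]] = {
            "context": model,
            "sections": select_sections(model),
        }
    return packets
```

Keeping the org-wide definitions in one shared structure and letting per-model entries override only what differs is what makes the resulting packets consistent across the inventory.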

The factory drops the marginal cost of producing an MRM-grade artifact for a new model by roughly an order of magnitude. Every artifact is QC’d by a deterministic second agent before reaching a reviewer. The regulator on the other side of the table sees a consistent narrative across the model inventory rather than a stack of one-off documents written by different analysts in different quarters under different assumptions. This is operational AI governance at a layer almost no traditional consultancy delivers, because it lives at the agentic-orchestration layer rather than the slide-deck layer.
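The submission-readiness report can be thought of as a roll-up of QC findings per document. In the sketch below the four severity levels come from the Validator description; the rule that Critical and High findings block submission, and all class and function names, are assumptions made for illustration.

```python
from dataclasses import dataclass

SEVERITIES = ("Critical", "High", "Medium", "Low")  # levels named in the text
BLOCKING = {"Critical", "High"}  # assumption: these two levels block submission


@dataclass(frozen=True)
class Finding:
    document: str   # identifier of the generated response document
    check: str      # e.g. "structural", "regulatory sufficiency"
    severity: str   # one of SEVERITIES
    message: str


def readiness_report(documents, findings):
    """Count findings per document and flag which documents are ready."""
    report = {
        doc: {"counts": dict.fromkeys(SEVERITIES, 0), "ready": True}
        for doc in documents
    }
    for finding in findings:
        if finding.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {finding.severity}")
        entry = report[finding.document]  # KeyError if the document is unknown
        entry["counts"][finding.severity] += 1
        if finding.severity in BLOCKING:
            entry["ready"] = False
    return report
```

A reviewer would then open only the documents marked ready, and the blocking findings go back to the orchestration step rather than to a human.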

Phase 2: AutoCEN — the financial-crime platform deployment

The strategy and governance work converged on a production deployment of AutoCEN, positioned by Frontier Foundry as the first purpose-built, AI-native platform for financial-crime detection and regulatory compliance. AutoCEN exists because retrofitting AI onto a twenty-year-old AML stack cannot produce the deterministic auditability the Hybridization Strategy requires. The platform was engineered AI-native from day one for the operating conditions a top-20 national bank’s BSA/AML, OFAC, and fraud functions actually run under, and for the rapid-deployment conditions a de novo charter needs on day one.

AutoCEN is a Frontier Foundry product. Virtova led the bank-side engagement and the implementation arc; Frontier Foundry produced the platform. Both entities share ownership; the relationship is disclosed in writing.

As published by Frontier Foundry, the platform’s reference architecture comprises seventeen specialized microservices on a cloud-native, multi-region AWS deployment with a two-hour RTO. The published performance figures are sustained throughput above 2,000 transactions per second, 99.87% platform uptime, and sub-50ms screening latency on real-time transaction monitoring. The fraud detection layer runs a multi-model ensemble (Transformer, LSTM, Autoencoder) with full explainability and a reference AUC above 93%.

Sanctions screening false-positive rate

~166× lower than industry baseline.

Industry average ~5%+

AutoCEN reference 0.03%

Industry baseline ~5%+ per AutoCEN reference benchmarking. AutoCEN reference rate from Frontier Foundry's published platform metrics.
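The multiple quoted above follows directly from the two published rates. The short calculation below uses only those two figures; the one-million screening volume is an illustrative assumption, as is treating each rate as false positives per screened item.

```python
INDUSTRY_FP_RATE = 0.05    # "~5%+" industry baseline, per the text
AUTOCEN_FP_RATE = 0.0003   # 0.03% AutoCEN reference rate, per the text


def fp_ratio(baseline: float, reference: float) -> float:
    """How many times lower the reference rate is than the baseline."""
    return baseline / reference


def false_alerts(screened: int, fp_rate: float) -> int:
    """Expected false positives, assuming the rate is per screened item."""
    return round(screened * fp_rate)


ratio = fp_ratio(INDUSTRY_FP_RATE, AUTOCEN_FP_RATE)
print(f"ratio: {ratio:.1f}x")  # prints "ratio: 166.7x"

# Hypothetical volume, chosen only to make the difference concrete.
SCREENED = 1_000_000
print(false_alerts(SCREENED, INDUSTRY_FP_RATE))  # prints 50000
print(false_alerts(SCREENED, AUTOCEN_FP_RATE))   # prints 300
```

The quotient is 166.7, so the "~166×" figure is the truncated value, and because the baseline is quoted as "5%+" it is best read as a lower bound.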

What the platform does for an operating BSA/AML function:

Real-time transaction monitoring screens every transaction against fifty-plus configurable AML detection scenarios at sub-50ms latency.

AI-powered fraud detection runs the multi-model ensemble with full explainability and regulatory audit trails. The bank moved from reactive fraud response toward anticipatory disruption inside the first two quarters of production, supported by 30-, 60-, and 90-day risk-trajectory forecasting under the predictive risk layer.

Sanctions and watchlist screening runs OFAC, UN, and EU lists with intelligent fuzzy matching at a 0.03% false-positive rate against an industry average above 5%. That alone recovered a substantial fraction of the bank’s investigator capacity.

Automated SAR filing generates AI-drafted narratives with FinCEN direct submission. Examiner quality score on the AutoCEN narrative output is 94.1%. SAR drafting collapsed from hours per filing to under sixty seconds. This is direct FinCEN integration, not screen-scraping.

Before

~3 hrs

Per SAR filing. Manual narrative drafting. Each analyst working a queue.

After

<60 sec

Per SAR filing. AI-drafted narrative. FinCEN direct submission. 94.1% examiner quality score.

SAR drafting time, before and after AutoCEN deployment. AutoCEN reference performance per Frontier Foundry's published platform metrics.

KYC / KYB verification runs end-to-end customer and business onboarding with document proofing, ultimate-beneficial-owner mapping, and risk tiering, on the same governance spine the rest of the platform sits under.

Cryptocurrency compliance covers Bitcoin, Ethereum, and Solana natively, with Travel Rule enforcement, multi-hop transaction tracing, and sanctioned-wallet detection. Legacy AML platforms structurally cannot see this surface; the bank’s exposure to a new product line was now defensible from the day the line opened.

Graph-based network analysis uncovers fraud rings and money-laundering networks through relationship mapping and pattern detection, surfacing what alert-by-alert review structurally misses.

Predictive risk intelligence runs a four-layer weighted risk model with adjustable per-institution weights, plus 30-, 60-, and 90-day risk-trajectory forecasting that identifies emerging threats before they become regulatory findings.
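A weighted risk model with adjustable per-institution weights reduces, at its simplest, to a normalized weighted average. The sketch below is an assumption-laden illustration: the text says only that there are four layers, so the layer names, the 0-to-1 score range, and the default weights are hypothetical placeholders, not AutoCEN's actual model.

```python
# Layer names are hypothetical; the source states only that there are four.
DEFAULT_WEIGHTS = {
    "customer": 0.25,
    "transaction": 0.25,
    "network": 0.25,
    "geography": 0.25,
}


def composite_risk(layer_scores: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-layer scores in [0, 1].

    Weights are renormalized by their sum, so an institution can supply
    any positive weighting without having to make it total exactly 1.
    """
    if set(layer_scores) != set(weights):
        raise ValueError("scores and weights must cover the same layers")
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[k] * layer_scores[k] for k in weights) / total
```

For example, an institution that cares most about transaction behavior could pass `{"customer": 1, "transaction": 3, "network": 1, "geography": 1}` and get a composite that leans toward the transaction layer without touching the scoring code.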

The properties that let AutoCEN pass the examinations other AI platforms do not are specific.

Deterministic AI execution. Verifiable, repeatable outputs with full audit trails, not black-box predictions; every decision is explainable to examiners.

Multi-provider AI sovereignty. Local, cloud, or hybrid deployment, so the bank’s data stays where its regulators require it.

Immutable audit architecture. Cryptographically secured, tamper-proof records with seven-year retention, built for examiner scrutiny.

Cloud-native resilience. Multi-region AWS with the two-hour RTO and auto-scaling that ranges from a community bank’s footprint to the largest institutions.

Infinitely configurable. Every rule, workflow, threshold, and risk model is reconfigurable through a no-code admin portal, with no engineering tickets and no release trains.

Open integration. Pre-built connectors for Oracle, Refinitiv / LSEG, SAS, and major blockchain networks, plus an extensible API framework into any core banking system.
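One common way to make audit records tamper-evident is a hash chain, in which each record's hash covers the previous record's hash. The sketch below illustrates that general technique only; the source does not say how AutoCEN implements its immutable audit architecture, and every name here is hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first record


def _digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> list:
    """Append a record whose hash commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    })
    return chain


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev_hash"] != prev_hash:
            return False
        if record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

A chain like this detects tampering rather than preventing it; in practice it would be paired with write-once storage and access controls to meet a multi-year retention requirement.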

Frontier Foundry rates AutoCEN’s regulatory coverage at five stars across BSA/AML, OFAC, SOX, GLBA, FFIEC CAT, USA PATRIOT Act, OCC, GDPR, PCI DSS, ISO 27001, NIST CSF, and Travel Rule, out of the box.

Phase 2b: The targeted-pilot portfolio

In parallel with the AutoCEN deployment, the bank ran a slate of narrow-scope AI pilots designed to attack high-friction internal processes where the ROI math was unambiguous. Each pilot was scored against the AI Rubric, governed under the MRM framework from Phase 1, and held to a measurable value-capture target before funding.

Generative AI contract analysis. An LLM-powered triage layer over the bank’s third-party contract corpus, producing structured extracts of obligations, renewal triggers, change-of-control clauses, AI and data clauses, and termination rights. Replaced a manual review process measured in analyst-weeks per quarter with one measured in hours, with full reviewer-in-the-loop QC.

Marketing AI use-case portfolio. A scored set of marketing-side AI deployments (segmentation, generative content with guardrails, personalization with fair-lending controls), each scoped against the AI Rubric so that every pilot had a defensible answer to ECOA / Reg B and UDAAP exposure before a single model went into production.

Investment management AI pilot. A targeted AI workflow inside the investment management business line, scoped to research-summary generation, client-document drafting, and portfolio-commentary acceleration, with deterministic outputs, full PII redaction, and an MRM-grade audit trail.

Cryptocurrency compliance pilot. A focused deployment of multi-blockchain monitoring (Bitcoin, Ethereum, Solana) with Travel Rule enforcement and sanctioned-wallet detection, establishing a defensible compliance posture for a new product line that legacy AML platforms could not cover.

Vendor-risk and contract-intake automation. An agentic intake pipeline that takes a new AI/ML or data vendor’s contract, due-diligence packet, and MRM submission and emits a populated, regulator-readable governance package, closing the loop with the Phase 1 documentation factory.

The pilot layer delivered roughly a 70% reduction in cycle time across the targeted internal processes, sub-quarter payback on each pilot, and a reusable scoping pattern the bank now applies to additional pilots without re-litigating governance every cycle. Three quarters into the engagement, the bank ran the pipeline without external scoping support.

What this engagement illustrates

There are dozens of consultancies that will sell a top-20 bank an AI strategy. There are a handful that can build a compliance platform. There are essentially none that combine an AI-native production financial-crime platform, an agentic regulatory-documentation factory, and a governed pilot pipeline inside a single twelve-month engagement, with the whole stack regulator-readable by design.

The reasons that combination is rare are practical. An AI-native production platform with measured outcomes in a top-20 bank is not a slide deck; it requires a product entity with a real platform behind it, which is what the Frontier Foundry side of the Virtova–Frontier Foundry pairing supplies. An agentic CLI orchestration practice that runs in the room with the bank’s MRM team is not a deliverable a typical consultancy can produce; the work is usually outsourced to a systems integrator who outsources to an offshore development shop, and the layer where the leverage actually lives is lost in translation. Regulator-readable structure across every artifact (model card, vendor response, sprint deliverable, supervisory recommendation, SAR narrative, mapped against SR 26-2, NY DFS Part 500, the EU AI Act, NIST AI RMF, and the FATF Travel Rule) is the only AI architecture that holds up under a serious OCC or FRB examination in 2026, and it is a discipline that has to be designed in from the first artifact.

The reframe (“stop generating fake work, then automate the rest”) is what unlocks the rest of the program, and it is not a position most consultancies have either the latitude or the conviction to argue to a CCO. The Hybridization Strategy is what holds it together: traditional ML where it fits, deep learning where it fits, generative AI where it fits, agentic AI where it fits, all four under a single auditable governance spine. Most vendors are selling a hammer. The toolbox is the actual product.

The full intellectual-property stack travels. The AI Rubric, the Hybridization Strategy, the MRM Vendor Response Template, the MRM Orchestrator and QC Validator pattern, the AutoCEN platform, and the targeted-pilot scoping methodology are reusable across other top-50 U.S. banks, foreign banking organizations, broker-dealers, large credit unions, and de novo charters scoped against the same regulatory perimeter.

Continuing analysis

For continuing analysis on AI, audit, and U.S. banking regulation, the Sultan Meghji Substack publishes weekly to over 13,000 subscribers. The regulatory thesis the engagement was scoped against is laid out in “Cyber audit in the neural-net era,” “AI governance for U.S. banks after the FDIC years,” and “NIST AI RMF in practice.”

If the engagement profile is relevant to work your firm is currently scoping, the discovery call is the right entry point. Confidential by default; NDAs available on request.

Sultan Meghji founded Virtova in 2009 and is the Co-Founder and CEO of Frontier Foundry Corporation. He served as the inaugural Chief Innovation Officer of the U.S. FDIC. AutoCEN is a Frontier Foundry product; Virtova is the consulting channel for engagements involving it. Both entities share ownership; the relationship is disclosed on every contract and at the start of every conversation.