
What audit-readiness means for AI agents, what evidence one sampled action requires, and why as-of authority is the hardest question: a GRC and audit Q&A on agentic approvals.
Definition: Audit-ready agentic approvals are AI-driven decisions structured so that, for any sampled action, an organization can produce a complete evidence bundle: the authority record that authorized the agent, the rule that applied, the approval (if required), the execution log, and the exception handling. Reconstruction after the fact is not the same as audit-ready.
An auditor sampling an agent's action does not ask the agent how it reasoned. The auditor asks the same five questions used in human delegation-of-authority audits: who had authority, what were the limits, what approvals were required, what action was executed, and how were exceptions handled. The audit-readiness problem for agentic workflows is therefore not a model-explainability problem. It is an evidence and authority problem, and most of the answer already lives in the discipline of delegation of authority.
The stakes are confirmed by recent data. SailPoint's 2025 AI Agents survey found that 80% of organizations using AI agents have experienced unintended actions attributable to an agent. Gartner projects that more than 40% of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear value, or inadequate risk controls. And the ACFE's 2024 Report to the Nations, based on 1,921 occupational fraud cases across 138 countries, found that the typical occupational fraud lasts 12 months before detection and produces a median loss of $145,000 per case. The longer the gap between action and audit, the more expensive evidence reconstruction becomes. The pattern carries over to agentic actions: evidence captured at the time of an action is decisively easier to defend than evidence assembled after the fact.
This Q&A answers the operative questions a GRC leader, internal auditor, or external auditor will ask about an organization's agentic approval program: what audit-ready actually means, what evidence is required for one sampled action, why as-of authority is the hardest question, how sampling works, how long evidence has to be retained, and how to design approvals that scale without abandoning control. For the governance frame underneath this article, see Agentic Authority Management. For the accountability frame, see Human Accountability in Agentic Workflows.

Audit-ready means that for any sampled agent action, you can produce a complete evidence bundle within audit timelines: the authority record, the rule applied, the approval (if required), the execution log, and exception handling. Audit-readiness is binary at the action level.
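For one sampled action, an auditor expects to receive a coherent evidence package within 48 to 72 hours. The package answers five questions, each of which is answerable independently in a mature program: who had authority, what the limits were, what approvals were required, what action was executed, and how exceptions were handled.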
Audit-readiness is binary at the action level. You can either answer all five questions for a sampled action or you cannot. There is no partial credit. Programs that score well on the first four questions and fail on the fifth produce findings; programs that fail on the first three produce material weaknesses.
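The binary test can be expressed as a simple completeness check. The sketch below is illustrative only: the `EvidenceBundle` class and its field names are hypothetical, chosen to mirror the five bundle elements named in this section, and do not describe any particular system.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class EvidenceBundle:
    """Evidence for one sampled action. None means the item could not be produced.

    Field names are hypothetical and mirror the five bundle elements.
    """
    authority_record: Optional[str] = None
    rule_applied: Optional[str] = None
    approval: Optional[str] = None            # if no approval was required, record that determination
    execution_log: Optional[str] = None
    exception_handling: Optional[str] = None  # e.g. "no exception raised" is still an answer


def missing_elements(bundle: EvidenceBundle) -> List[str]:
    """Return the names of the elements that cannot be produced."""
    return [f.name for f in fields(bundle) if getattr(bundle, f.name) is None]


def is_audit_ready(bundle: EvidenceBundle) -> bool:
    """Binary at the action level: every element present, or not ready."""
    return not missing_elements(bundle)


partial = EvidenceBundle("DEL-12 v3", "RULE-7 v3", "APR-55", "LOG-9")
print(is_audit_ready(partial), missing_elements(partial))
```

A bundle with four of the five elements is reported as not ready, together with the name of the missing element, which is the "no partial credit" rule in executable form.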
The cost of getting this wrong is well-documented. ACFE 2024 data shows that 51% of occupational fraud cases involved either the absence of internal controls (32%) or the override of existing controls (19%). Both failure modes create exactly the audit-readiness gap this section describes: missing or non-defensible evidence at the action level. For the broader connection between authority programs and SOX-relevant control evidence, see DOA and SOX/Internal Controls.
Auditing agents is not a model-explainability problem. It is an authority and evidence problem, identical in structure to the human delegation audits enterprises have run for fifty years. The right framework is already on the shelf.
Two framings compete for how organizations think about agentic audits. The first asks: can we explain what the model did? This framing is seductive because the technology is new, but it leads to dead ends. Modern agentic systems are non-deterministic, and reconstructing internal reasoning is neither feasible nor what auditors actually want.
The second framing asks: was the actor authorized, and did the controls operate? This is the framing used for human approvals for fifty years, codified in Sarbanes-Oxley Sections 302 and 404, embedded in PCAOB standards, and replicated in the audit programs of every major external auditor. It works for humans, and it works for agents, because it audits the system around the actor rather than the actor's cognition.
Definition: Authority evidence framework is the audit approach that examines whether an actor was authorized to take a specific action and whether the surrounding controls operated as designed. It evaluates the system around the decision rather than the decision-maker's reasoning.
The regulatory direction of travel confirms this framing. The EU AI Act Article 12 mandates record-keeping for high-risk AI systems, focused on operational logs rather than reasoning explanations. The OWASP Top 10 for LLM Applications 2025 elevated Excessive Agency to LLM06, framing the risk as ambiguous or excessive authority rather than opaque reasoning. And Deloitte's State of AI in the Enterprise 2026 reports that only 21% of organizations have mature agentic AI governance, with the maturity gap concentrated in authority and evidence practices, not model explanation.
For organizations that already run a mature DOA program, the work is extension rather than invention. The authority matrix extends to agent actors. The DOA policy extends to agent grants. The audit chain is the same chain, with one new actor type added.
The control stack has six layers: authority grant, preconditions, approval capture, execution evidence, monitoring and exception handling, and periodic recertification. Missing any layer forces reconstruction at audit time, and reconstruction is the most expensive evidence to produce.
A control stack is the layered set of preventive and detective controls that together produce audit-ready evidence for a class of actions. For agentic approvals, six layers are required, and each layer has a specific evidence output that auditors will look for. Skipping any layer does not eliminate the audit question. It only means that when the question is asked, the answer has to be reconstructed from primary records that were not designed to answer it.
The most underbuilt layer in current agentic deployments is the first. EY and the Society for Corporate Governance's 2025 survey of 222 companies found that almost 90% maintain a delegation of authority policy, yet only 14% embed that policy in a dedicated IT system for tracking and enforcement. For agent grants, that gap widens. Most organizations document agent authority informally (a wiki page, a slide in a deployment review, an email approval) and then expect that informal documentation to function as the authority record at audit time. It does not.
The sixth layer, periodic recertification, is the layer most often missed entirely. An agent grant that was appropriate at deployment may not be appropriate six months later, after model updates, scope drift, or organizational change. Recertification on a defined cadence is the control that catches authority drift before it becomes an audit finding. For the broader operating model that holds these layers together, see the Operating Model for Authority Management.
A complete evidence package for one agent action contains five elements: transaction details, the approval record, the authority rule that applied, as-of authority proof, and exception documentation. Missing any one of the five breaks the audit chain.
Consider a worked example. On April 14, 2026, a procurement agent renews a SaaS vendor contract for $48,000. On June 3, an external auditor samples this transaction in Q2 testing and asks for the supporting evidence. What can the organization produce within 48 hours?
A mature program produces a five-element package. An immature program produces what it has and reconstructs the rest. The cost difference is roughly two orders of magnitude. West Monroe's 2026 Speed Wins research, based on a survey of 1,000 managers and 214 C-suite executives, found that each additional analysis request adds an average of three weeks of delay. For audits, those three weeks are billable hours, deferred close, and accumulating risk on the work paper.

Each element must reference the others by stable identifier. The transaction record references the approval record by approval ID. The approval record references the authority rule by rule ID and version. The rule references the delegation record by delegation ID. The delegation record references the accountable human owner by employee ID. Without these stable identifiers, the chain has to be reassembled by hand for every audit, every quarter, for every sampled action.
Definition: Evidence chain is the linked set of records (transaction, approval, rule, delegation, accountable owner) that together prove an authorized action took place under operating controls. The chain is only as strong as its weakest link.
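One way to picture the stable-identifier requirement is a chain walk that follows each reference in turn and reports the first link that fails to resolve. This is a minimal sketch under stated assumptions: the in-memory `store`, the table names, the reference field names, and every ID value are hypothetical placeholders, not a real schema.

```python
# Each entry is (table, field holding the reference to the next table's key).
CHAIN = [
    ("transactions", "approval_id"),
    ("approvals", "rule_key"),            # (rule ID, rule version)
    ("rules", "delegation_id"),
    ("delegations", "owner_employee_id"),
    ("owners", None),                     # end of the chain
]


def walk_chain(store, txn_id):
    """Follow references from a transaction to its accountable owner.

    Returns (resolved, broken_at). broken_at is None when every link resolves,
    otherwise a string naming the first lookup that failed.
    """
    key = txn_id
    resolved = []
    for table, next_ref in CHAIN:
        record = store[table].get(key)
        if record is None:
            return resolved, f"{table}[{key!r}]"
        resolved.append((table, key))
        if next_ref is not None:
            key = record.get(next_ref)
    return resolved, None


# Hypothetical records for illustration only.
store = {
    "transactions": {"TXN-1001": {"amount_usd": 48000, "approval_id": "APR-55"}},
    "approvals": {"APR-55": {"rule_key": ("RULE-7", 3)}},
    "rules": {("RULE-7", 3): {"delegation_id": "DEL-12"}},
    "delegations": {"DEL-12": {"owner_employee_id": "EMP-301"}},
    "owners": {"EMP-301": {"role": "process owner"}},
}

resolved, broken_at = walk_chain(store, "TXN-1001")
print(len(resolved), broken_at)
```

When every record carries the identifier of the next, the whole chain resolves in five lookups. When one record is missing, the walk stops at that link and names it, which is the point where manual reassembly would otherwise begin.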
For the accountable human side of the chain, see Human Accountability in Agentic Workflows. For the signatory governance side, see Authorized Signatory Lists Explained.
As-of authority is hard because it requires reconstructing what the agent was authorized to do on a specific past date. Without versioned delegation records and explicit effective dates, that reconstruction is not possible at any cost.
Continue the worked example. The auditor samples the April 14 vendor renewal in June. The decisive question is not "Does the agent currently have $48K authority?" It is "Did the agent have $48K authority on April 14?" Most organizations cannot answer the second question, because the only authority record they keep is the current one. The April 14 state has been overwritten.
Definition: As-of authority reconstruction is the audit task of proving what authority an actor (human or agent) held on a specific historical date. It requires versioned authority records with explicit effective and expiration dates, retained for the full audit lookback period.
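The difference between a current-state lookup and an as-of lookup can be sketched with versioned records that carry explicit effective dates. Everything below is an assumption made for illustration: the `DelegationVersion` class, the agent name, and in particular the two limits and the May 1 change date are invented to show how a later change can hide the April 14 state.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DelegationVersion:
    """One immutable version of an agent's delegated authority (hypothetical shape)."""
    agent_id: str
    version: int
    limit_usd: int
    effective_from: date          # inclusive
    effective_to: Optional[date]  # exclusive; None means still in force


def authority_as_of(history: List[DelegationVersion], agent_id: str,
                    on: date) -> Optional[DelegationVersion]:
    """Return the version in force on the given date, or None if there was none."""
    matches = [
        v for v in history
        if v.agent_id == agent_id
        and v.effective_from <= on
        and (v.effective_to is None or on < v.effective_to)
    ]
    if len(matches) > 1:
        raise ValueError("overlapping delegation versions; the record is ambiguous")
    return matches[0] if matches else None


def was_authorized(history, agent_id, on, amount_usd) -> bool:
    """True only if a version was in force on that date and covered the amount."""
    version = authority_as_of(history, agent_id, on)
    return version is not None and amount_usd <= version.limit_usd


# Invented history: the limit was lowered on May 1, after the sampled action.
HISTORY = [
    DelegationVersion("procurement-agent", 1, 50_000, date(2026, 1, 1), date(2026, 5, 1)),
    DelegationVersion("procurement-agent", 2, 25_000, date(2026, 5, 1), None),
]

print(was_authorized(HISTORY, "procurement-agent", date(2026, 4, 14), 48_000))
print(was_authorized(HISTORY, "procurement-agent", date(2026, 6, 3), 48_000))
```

Under these invented records, the same $48,000 action is authorized as of April 14 and would not be authorized as of June 3. An organization that kept only version 2 would answer the auditor's question incorrectly, or not at all.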
Three failure modes account for almost every as-of authority audit finding. Each has a specific root cause and a specific fix.
The data underscores how widespread the gap is. The same EY/SCG survey found that almost 90% of companies maintain a delegation of authority policy, yet only 14% embed it in a dedicated IT system. That gap of roughly 76 points between policy and system embedment is precisely where as-of reconstruction fails. A policy document does not preserve historical state. A point-in-time recall capability does.

For agentic workflows, the as-of question is structurally identical to the human version, but it surfaces faster because agent volumes are higher. A human approver may sign 200 transactions a year. An agent may execute 200 a day. The same as-of authority question, asked across that volume, makes manual reconstruction unaffordable. For the related discipline of preventing the underlying records from drifting in the first place, see Avoiding Sync Drift Between Authority Systems.
Our recommendation: implement immutable, versioned delegation records for agents from day one, including pilot phases. The most expensive audit finding is not "the agent exceeded its authority." It is "we cannot prove what authority the agent had at the time of the action." Retroactive evidence construction costs orders of magnitude more than proactive record-keeping, and the cost only compounds as agent volume scales.
Sample agent actions the same way auditors sample human approvals: risk-stratified, threshold-stratified, and exception-flagged. The viability of any sampling method depends on the authority record being indexed by agent identity and decision type.
Auditors do not test every transaction. They sample. For agent actions, the sampling method is the same one used for human approvals, with adjustments for volume. A practical sampling program draws from four strata, each with its own selection logic and evidence requirement.
Sampling efficacy depends on a single underlying condition: the authority record has to be indexed by agent identity and by decision type. If the record exists only as a flat list of human delegations with agents appended in a notes column, sampling collapses into manual searching. If the record is properly indexed, sampling completes in a query.
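As a rough illustration of "sampling completes in a query", the sketch below indexes actions by agent identity and decision type, then draws a reproducible sample from one cell of that index. The record fields, the agent names, and the choice to always include exception-flagged actions alongside a seeded random draw are assumptions for illustration, not a prescribed audit methodology.

```python
import random
from collections import defaultdict


def build_index(actions):
    """Index action records by (agent_id, decision_type)."""
    index = defaultdict(list)
    for action in actions:
        index[(action["agent_id"], action["decision_type"])].append(action)
    return index


def draw_sample(index, agent_id, decision_type, k, seed):
    """Return every exception-flagged action plus k randomly chosen others.

    The seed makes the draw reproducible, so the selection can be re-performed
    and documented in the work papers.
    """
    population = index.get((agent_id, decision_type), [])
    flagged = [a for a in population if a["exception"]]
    rest = [a for a in population if not a["exception"]]
    rng = random.Random(seed)
    return flagged + rng.sample(rest, min(k, len(rest)))


# Hypothetical action log: ten renewals (two flagged) and one unrelated payment.
ACTIONS = [
    {"action_id": f"A-{i}", "agent_id": "procurement-agent",
     "decision_type": "renewal", "exception": i in (3, 7)}
    for i in range(10)
] + [
    {"action_id": "B-0", "agent_id": "treasury-agent",
     "decision_type": "payment", "exception": False},
]

index = build_index(ACTIONS)
sample = draw_sample(index, "procurement-agent", "renewal", k=3, seed=2026)
print([a["action_id"] for a in sample])
```

With the index in place, selecting the population for one agent and one decision type is a single dictionary lookup. Without it, the same selection requires reading every record.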
Gartner's projection that more than 40% of agentic AI projects will be canceled by the end of 2027 attributes the cancellations to escalating costs, unclear value, or inadequate risk controls. Sampling is one such risk control. A program that cannot demonstrate proactive sampling against a documented methodology cannot satisfy an internal audit committee, an external auditor, or a regulator who asks for the evidence behind the assertion that the agent operates within bounds.
For the metric set that supports ongoing sample-based monitoring, see Authority Monitoring and Reporting Metrics.
Apply the same retention rules used for human approval evidence: seven years under SOX for financial transactions, and longer for regulated industries. Storage cost is trivial compared to reconstruction cost, so default to retaining everything for the longest applicable period.
Retention requirements for agent evidence are not new. They are the same retention requirements that already apply to the human equivalents of those actions, and the same statutes and regulations govern both.
Definition: Evidence retention is the discipline of preserving authority records, approvals, execution logs, and exception documentation for the duration required by applicable law, regulation, or contract. The retention clock typically starts at the date of the action, not the date the record was created.
The most common retention floors:
For agent actions, retain the full evidence chain (delegation record version, rule version, approval record, execution log, exception documentation) for the longest applicable period among the action types the agent touches. If a procurement agent occasionally executes SOX-relevant transactions and occasionally executes routine OpEx, retain the full chain for seven years for the entire population. The cost difference between retaining for seven years versus three is negligible. The cost difference between having the evidence and not having it is total.
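The "longest applicable period" rule reduces to taking a maximum over the action types an agent touches. The sketch below assumes a hypothetical lookup table: the seven-year SOX figure comes from this article, while the three-year figure for routine OpEx is only the comparison value used above, not a statement of any legal requirement. It also assumes the retention clock starts at the date of the action.

```python
from datetime import date

# Illustrative values only; actual floors depend on applicable law and contract.
RETENTION_YEARS = {
    "sox_financial": 7,
    "routine_opex": 3,
}


def retention_years_for_agent(action_types):
    """Longest retention period among the action types the agent touches."""
    return max(RETENTION_YEARS[t] for t in action_types)


def retain_until(action_date: date, years: int) -> date:
    """Date through which the evidence chain is kept, counted from the action date."""
    try:
        return action_date.replace(year=action_date.year + years)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year; use February 28.
        return action_date.replace(year=action_date.year + years, day=28)


years = retention_years_for_agent({"sox_financial", "routine_opex"})
print(years, retain_until(date(2026, 4, 14), years))
```

Applied to the worked example, the April 14, 2026 renewal would be retained until April 14, 2033 under these assumptions, and the same period applies to the agent's routine OpEx actions.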
The point-in-time recall requirement also applies to retention. It is not enough to retain the records. The records have to be recallable in their as-of state. A delegation record retained as the current state in 2033, with no version history showing the 2026 state, satisfies the storage requirement and fails the audit requirement.
Move from approval-of-every-action to approval-of-authority-grants, approval-of-exceptions, and periodic recertification. This pattern preserves audit defensibility while letting the agent operate at speed.
If a human has to approve every agent action, the agent provides no leverage. Most organizations therefore move to a three-pattern model that retains audit defensibility without sitting in the approval path of every transaction.
Approval at the grant level. The decision an organization actually controls is who can operate the agent and within what bounds. That decision is reviewed and approved at the time of grant, with formal documentation, and re-approved on a defined cadence. Once the grant is in place, the agent operates within bounds without per-action human approval.
Approval at the exception level. When an agent encounters an action outside its bounded authority (a transaction over threshold, a counterparty on a watch list, a category not in scope), the action escalates to a named human approver. The exception path is the only place humans sit in the real-time approval flow. Designed correctly, exception volume is small enough to handle without becoming a bottleneck.
Periodic recertification. On a defined cadence (typically quarterly for high-risk agents, semi-annually for medium, annually for low), the accountable human owner reviews the agent's activity, confirms the delegated authority is still appropriate, and renews or revokes the grant. Recertification is the control that catches drift between deployment-time intent and current-state behavior.
Definition: Periodic recertification is the scheduled review and explicit renewal (or revocation) of an authority grant by the accountable owner. Recertification is the discipline that prevents authority records from quietly aging into invalidity.
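The three patterns can be combined in a single routing decision, sketched below. The `Grant` fields, the day counts used to approximate quarterly, semi-annual, and annual cadences, and the choice to block an agent whose recertification is overdue are all assumptions made for illustration. An organization could reasonably choose to escalate rather than block.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Approximate cadences in days by risk tier (illustrative).
RECERT_DAYS = {"high": 90, "medium": 182, "low": 365}


@dataclass(frozen=True)
class Grant:
    """A bounded authority grant for one agent (hypothetical shape)."""
    agent_id: str
    limit_usd: int
    categories: frozenset
    risk_tier: str
    last_certified: date
    exception_approver: str  # named in the grant itself, before deployment


def recert_due(grant: Grant) -> date:
    """Last date on which the current certification is still valid."""
    return grant.last_certified + timedelta(days=RECERT_DAYS[grant.risk_tier])


def route(grant: Grant, amount_usd: int, category: str, today: date):
    """Return (decision, detail) for one proposed action."""
    if today > recert_due(grant):
        return ("blocked", "recertification overdue")
    if amount_usd > grant.limit_usd or category not in grant.categories:
        return ("escalate", grant.exception_approver)
    return ("execute", grant.agent_id)


grant = Grant("procurement-agent", 50_000, frozenset({"saas_renewal"}),
              "high", date(2026, 3, 1), "head-of-procurement-ops")

print(route(grant, 48_000, "saas_renewal", date(2026, 4, 14)))
print(route(grant, 60_000, "saas_renewal", date(2026, 4, 14)))
```

Because the exception approver is a field on the grant, an out-of-bounds action always has a named destination. No human sits in the path of the in-bounds action.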
The business case for getting this design right is large. West Monroe's 2026 research found that 73% of leaders estimate their organizations lose up to 5% of annual revenue to slow decision-making and delayed execution: revenue lost to missed opportunities, stalled initiatives, and lost momentum. Well-designed agent approvals are one mechanism for closing that gap. APQC's 2024 research on DOA effectiveness, based on a survey of 311 finance professionals, found that organizations with effective delegation report 62% higher productivity, 53% higher organizational agility, and a 49% reduction in bottlenecks. The same pattern applies to well-bounded agent authority.
Our recommendation: design the exception path before deploying the agent. Most agent governance failures we see are not failures of the bounded operating mode. They are failures of the exception path: an exception arrives, no human is named to handle it, the exception gets routed to whoever is nearest, and the resulting approval has no rule reference, no audit trail, and no defensible evidence position. Naming the exception approver in the delegation record itself prevents this failure mode entirely. For the change-management practice that supports this discipline, see the Authority Change Management Playbook.
Aptly is the authority system of record for both human and agent actors. It maintains versioned delegation records, point-in-time recall, the evidence chain, and the runtime decisions that audit-ready agentic approvals require.
Aptly's Authority Hub maps to each layer of the control stack described above:
Point-in-time recall is the underlying capability that makes as-of authority answerable. Any delegation record can be queried for its state on any past date within the retention window. The five-element evidence package becomes a single query rather than a multi-day reconstruction project. For the broader Aptly platform context, see the Authority Hub and the single source of truth pattern that underpins it.
You do not audit the reasoning. You audit the authority and evidence chain: was the agent authorized, were preconditions met, was the action within delegated limits, and was the outcome recorded? This is the same framework auditors use for human decisions.
For human approvers, auditors do not reconstruct the approver's thought process. They verify that the approver had authority, the approval was within scope, and the evidence trail is complete. The same logic applies to agents. Non-determinism in the reasoning is irrelevant to the audit if the authority and evidence chain is intact. The doctrine underneath this approach is articulated in the Restatement (Third) of Agency, which has been the operative framework for principal-agent audits for two decades.
The EU AI Act (Articles 12 and 14), SOX Sections 302 and 404, MiFID II, NIST AI RMF, and ISO/IEC 42001:2023 all bear on audit-evidentiary practices for AI-driven decisions, some as binding law and some as voluntary frameworks. They differ in scope but converge on the same evidence requirements.
The EU AI Act Article 14 requires effective human oversight for high-risk AI, and Article 12 mandates record-keeping. Sarbanes-Oxley Sections 302 and 404 require effective internal controls over financial reporting, which extends to any agent executing financial transactions. NIST AI Risk Management Framework 1.0 embeds accountability and oversight as governance requirements. ISO/IEC 42001:2023 provides a certifiable AI management system standard. MiFID II and APRA CPS 510 add jurisdiction-specific requirements for financial services.
The exception itself is not the audit failure. The audit failure is the absence of documented detection, escalation, and resolution. A logged exception with a clear resolution trail is a control operating as designed. An undocumented exception is a control gap.
Auditors expect exceptions. They do not expect undocumented exceptions. A program that catches an out-of-bounds agent action, escalates it to a named approver, documents the resolution, and updates the authority record (or the agent's scope) is a program where the controls operated. A program that has no record of the exception, or that resolved it informally, has a material control gap regardless of whether the underlying business outcome was acceptable.
Combine three sources: the third-party system's API audit log, the third-party SOC 2 or equivalent control attestation, and the internal authority record. The internal record provides the authorization; the third-party log and attestation provide the execution evidence.
For SaaS-executed actions (a CLM platform signing a contract, a treasury platform initiating a payment, a procurement platform issuing a purchase order), the execution log lives in the vendor's system. Auditors accept third-party logs paired with the vendor's SOC 2 Type II report (or equivalent) as evidence of the control environment around the log. The internal authority record completes the chain by establishing that the agent was authorized to invoke the third-party action in the first place.
The accountable human owner named in the delegation record. SOX certifying officers (CEO and CFO) certify the control environment that bounded the agent, but the named accountable owner certifies the specific authority and activity record on a defined cadence.
The two-tier model parallels the human approver case. CEOs and CFOs certify the control environment under SOX Section 302, including the control environment around agent actions. The accountable human owner named in the delegation record (typically a process owner: head of procurement operations, treasurer, head of customer operations) certifies the agent-specific authority and activity records. Both certifications are required for a complete control framework. For the broader treatment of accountable ownership, see Human Accountability in Agentic Workflows.
Yes, provided the control framework meets the same evidence standard as human approvals: documented authority grants, versioned delegation records with effective dates, captured approvals with identity and timestamps, and exception handling with formal resolution.
The key is demonstrating that the agent operated within a controlled framework, not that a human reviewed every individual transaction. PCAOB AS 2201 (the standard governing auditor evaluation of internal controls over financial reporting) does not require human-in-the-loop on every transaction. It requires that the control environment, including the controls around any actor (human or agent) executing inside it, operates effectively. A well-designed agent control stack satisfies this requirement.
Six controls: a versioned delegation record naming an accountable human owner, scoped authority with effective dates, automated precondition enforcement, approval capture with rule reference, execution log linked to the authority record, and exception handling with formal closure. Without all six, the agent should not be in the SOX scope.
The six controls map directly to the six layers of the control stack in this article. Organizations sometimes attempt to deploy an agent into a SOX-relevant process with only the first three controls in place, planning to add the rest later. The result is a SOX-relevant action with non-defensible evidence, which is a control deficiency on the day the agent first acts. Deploy with all six or do not deploy into SOX scope.