When AI agents interact with each other: are traditional IT General Controls still in control

AI agent interactions and the future of IT General Controls

  • 25 Feb 2026

By: Mona de Boer (Responsible AI, Advisory)

As organisations deploy autonomously collaborating AI agents, the risk profile of IT systems shifts. Are traditional IT General Controls still sufficient when digital agents influence each other, delegate tasks and make decisions within critical business processes?

When AI becomes an actor, not just a tool

Artificial intelligence is no longer confined to isolated models answering questions or classifying data. Increasingly, organisations are deploying agentic AI systems—autonomous software agents that can plan tasks, make decisions, use digital tools, and interact with other systems or agents in order to achieve defined objectives.

Unlike AI models that simply respond to prompts, AI agents can act. For example, an AI agent may retrieve financial data, draft a report, request additional analysis from another agent, validate outputs against policies, and trigger downstream workflows. When multiple agents are connected in a coordinated chain, they form what can be described as a ‘digital workforce’.

At the same time, most organisations already operate within structured control environments built around IT General Controls (ITGCs). ITGCs are the foundational controls that ensure the reliability, security, and integrity of IT systems and data. They typically cover three core areas: access management (ensuring only authorised users or systems can access data and functionality), change management (ensuring major and minor system changes are tested, approved, and documented), and IT operations controls (ensuring systems are stable, monitored, logged, and resilient).

These controls underpin, among other things, the reliability of management reporting, cybersecurity and regulatory compliance. However, as organisations move from traditional IT systems to interconnected AI agents that reason and act (semi-)autonomously, an important question arises: when AI agents start interacting with each other, are traditional IT General Controls still enough?

What changes in a multi-agent world?

Traditional IT systems are largely deterministic. Risks typically arise from unauthorised access, uncontrolled changes, or operational failure. Multi-agent AI systems introduce dynamic autonomy and emergent behaviour. For example, in an AI-driven procurement process, one agent analyses demand, another researches suppliers, and a third negotiates contract clauses. Individually, each agent may function correctly. However, interaction effects may cause compliance requirements or ESG clauses, for example, to be unintentionally deprioritised. The resulting contract may not reflect the organisation’s intended risk posture—even though no access violation occurred.

Where traditional ITGCs still hold strong

Access management remains foundational

Agents rely on Application Programming Interfaces (APIs), databases and enterprise systems. Traditional controls such as role-based access and least privilege (access permissions granted to the minimum required) remain critical. For example, if a reporting agent preparing quarterly management reports has unrestricted write access to financial ledgers, an erroneous interpretation could overwrite sensitive data. Strong ITGC environments restrict permissions and enforce separation between preparation and approval system roles.
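As an illustration, the separation between preparation and approval roles can be encoded as an explicit permission model. The sketch below is hypothetical (agent names, resources and grants are invented for illustration), but it shows the principle: access is denied unless a grant explicitly covers the requested action, and no single agent may hold both write and approve rights on the same resource.

```python
from enum import Flag, auto

class Permission(Flag):
    """Granular permissions an agent can hold on a resource."""
    NONE = 0
    READ = auto()
    WRITE = auto()
    APPROVE = auto()

# Hypothetical grants: the reporting agent may read the ledger but
# never write to it; approval sits with a separate agent role.
GRANTS = {
    ("reporting-agent", "financial-ledger"): Permission.READ,
    ("posting-agent", "financial-ledger"): Permission.READ | Permission.WRITE,
    ("approval-agent", "financial-ledger"): Permission.APPROVE,
}

def is_allowed(agent: str, resource: str, needed: Permission) -> bool:
    """Least privilege: deny unless the grant explicitly covers the action."""
    granted = GRANTS.get((agent, resource), Permission.NONE)
    return needed in granted

def violates_separation(agent: str, resource: str) -> bool:
    """Separation of duties: no agent may both write and approve."""
    granted = GRANTS.get((agent, resource), Permission.NONE)
    return (Permission.WRITE | Permission.APPROVE) in granted
```

A deny-by-default lookup like this also makes the earlier scenario testable: the reporting agent simply cannot overwrite the ledger, regardless of how it interprets its task.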

Change management is more important than ever

Prompts, orchestration logic (the system-level coordination of multiple autonomous AI agents to solve complex, multi-step tasks that single models cannot handle alone), and model updates directly influence behaviour. For example, imagine an AI agent assisting with regulatory reporting. If its system instructions are simplified from “include all potentially material disclosures” to “prioritise concise reporting,” the agent may begin omitting borderline disclosures that compliance teams would normally include. Prompts and AI policies must therefore be formally version-controlled and subject to structured approval processes.

IT operations controls still matter

Infrastructure resilience, monitoring and logging remain essential. However, while logs may show that an agent called a particular tool, they may not explain why a decision was made. This limitation becomes more pronounced as multiple agents interact.

Where traditional ITGCs should be augmented

Traditional ITGCs focus on who accessed a system and whether changes were authorised. They do not typically assess whether autonomous behaviour remains aligned with organisational intent. For example: In a customer complaint workflow, one agent categorises complaints, another drafts responses, and a third approves communications. A subtle misclassification at the first stage could cascade into inappropriate responses at scale. All controls may appear compliant—yet the outcome may still create reputational risk.

Additionally, system-level risks may emerge. For example, two financial modelling agents may iteratively refine forecasts, gradually amplifying optimistic assumptions. Individually logical decisions may collectively distort risk assessments.

Periodic reviews are often insufficient for systems that adapt in real time. A treasury agent adjusting strategies based on live market data requires runtime monitoring rather than quarterly review alone.

Augmenting ITGCs for the agentic era

In practice, organisations do not need to replace traditional ITGCs. They need to extend them in targeted and pragmatic ways. The objective is not to over-engineer controls, but to ensure that autonomous systems remain aligned with business intent, risk appetite and regulatory obligations.

Below are four concrete areas for action:

Maintain an agent inventory with unique identities

Multi-agent environments should not evolve organically without visibility. Just as organisations maintain user directories and application inventories, they should maintain an agent inventory.

In practice, this means:

  • Assigning each agent a unique digital identity (not shared service accounts).
  • Registering each agent in a central catalogue with, among other things: 
    • Defined purpose
    • Responsible business owner
    • Approved tool access
    • Risk classification (low, medium, high impact)
  • Defining explicit capability boundaries, e.g.:
    • “May retrieve data but not execute transactions”
    • “May draft content but not publish externally”
  • Enforcing least privilege at API and database level.

The above extends identity and access management (IAM) to agents and ensures that data access remains traceable and aligned with policies.
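The catalogue described above can be sketched as a small registry. The field names and risk tiers below are illustrative assumptions, not a standard schema; the point is that each agent gets a unique identity, a named owner, and an explicit capability boundary enforced at lookup time.

```python
from dataclasses import dataclass
from enum import Enum

class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

@dataclass(frozen=True)
class AgentRecord:
    """One entry in the central agent catalogue (illustrative fields)."""
    agent_id: str                  # unique digital identity, not a shared account
    purpose: str                   # defined purpose
    owner: str                     # responsible business owner
    risk_tier: RiskTier            # risk classification
    allowed_tools: frozenset[str]  # explicit capability boundary

class AgentInventory:
    def __init__(self) -> None:
        self._records: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        if record.agent_id in self._records:
            raise ValueError(f"duplicate agent identity: {record.agent_id}")
        self._records[record.agent_id] = record

    def may_use(self, agent_id: str, tool: str) -> bool:
        """Least privilege at the tool level: unknown agents get nothing."""
        record = self._records.get(agent_id)
        return record is not None and tool in record.allowed_tools
```

An agent registered with `{"retrieve_data", "draft_report"}` may retrieve data but any attempt to call, say, a transaction tool fails the `may_use` check, mirroring the "may retrieve data but not execute transactions" boundary.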

Monitor agent behaviour, not just system health

Traditional monitoring focuses on system health and security events. Multi-agent systems require monitoring of behavioural patterns. For example: a pricing agent consistently approves discounts just below approval thresholds. No single transaction breaches policy, but the pattern suggests control circumvention.

Concretely, organisations can implement:

  • Automated detection of:
    • Policy violations in outputs (e.g. prohibited disclosures)
    • Deviations from approved risk thresholds
    • Unusual delegation chains between agents
  • Drift monitoring:
    • Has output tone, risk tolerance or decision pattern changed over time?
  • Escalation triggers:
    • E.g. if an agent overrides a validation agent more than X times in 24 hours, trigger review.

This type of behavioural insight does not emerge from standard IT logs. It requires purpose-built monitoring aligned to business risk indicators. This means expanding monitoring dashboards beyond uptime and performance to include AI behaviour KPIs.
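One of the escalation triggers above (an agent overriding a validation agent more than X times in 24 hours) can be sketched as a sliding-window counter. The limit and window below are placeholders, not recommended values:

```python
from collections import deque

class OverrideMonitor:
    """Sliding-window escalation trigger: flag an agent that overrides
    a validation agent more than `limit` times within `window_s` seconds."""

    def __init__(self, limit: int = 3, window_s: float = 24 * 3600) -> None:
        self.limit = limit
        self.window_s = window_s
        self._events: dict[str, deque[float]] = {}

    def record_override(self, agent_id: str, timestamp: float) -> bool:
        """Record one override; return True when the pattern should
        escalate to human review."""
        events = self._events.setdefault(agent_id, deque())
        events.append(timestamp)
        # Drop events that have fallen out of the window.
        while events and timestamp - events[0] > self.window_s:
            events.popleft()
        return len(events) > self.limit
```

The same sliding-window pattern generalises to the other triggers, such as counting discount approvals that land just below an approval threshold.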

Test interaction dynamics, not only individual systems

Most organisations test individual systems before release. Multi-agent environments require testing of interaction dynamics. For example: in a finance setting, simulate conflicting signals (e.g. revenue growth vs. liquidity pressure) and observe how forecasting and strategy agents interact. Does the system amplify optimism or escalate to human oversight?

Concrete actions include:

  • Multi-agent scenario simulations:
    • What happens if upstream data is ambiguous?
    • What if two agents interpret a policy differently?
  • Red teaming:
    • Intentionally attempt to trigger policy violations across agent chains.
  • Stress testing recursive interactions (i.e. agents calling each other repeatedly in a loop):
    • Do agents enter unstable feedback loops?
    • Do small biases gradually escalate into material risk exposure?
  • Tool misuse simulation:
    • What happens if an agent calls an unintended but accessible tool?

This means AI testing must become part of standard release governance rather than a one-off review exercise.
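To make the recursive stress test concrete, the toy simulation below models two forecasting agents that repeatedly revise each other's output. The agents and multipliers are invented for illustration; the point is that a small per-round bias compounds into material drift over enough hand-offs, which is exactly what a feedback-loop stress test should surface.

```python
def optimistic_agent(forecast: float) -> float:
    """Toy agent that nudges a growth forecast up by 5% when revising."""
    return forecast * 1.05

def cautious_agent(forecast: float) -> float:
    """Toy agent that trims only 2%, not enough to offset the optimism."""
    return forecast * 0.98

def stress_test_loop(rounds: int, start: float = 1.0) -> float:
    """Run the two agents in a recursive hand-off loop."""
    forecast = start
    for _ in range(rounds):
        forecast = cautious_agent(optimistic_agent(forecast))
    return forecast

def amplifies(rounds: int, tolerance: float = 0.10) -> bool:
    """Flag the agent pair if repeated hand-offs drift >10% from the start."""
    return abs(stress_test_loop(rounds) - 1.0) > tolerance
```

A single hand-off stays within tolerance, but after ten rounds the compounded 2.9% per-round bias breaches the 10% drift threshold, so the test flags the pair for review.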

Apply oversight proportionate to impact

Not every AI action requires human approval. But oversight must be proportionate to impact.

A practical model to address this issue is risk-tiered supervision:

  • Human-in-the-loop (pre-approval required)
    High-impact decisions such as:
    • Regulatory filings
    • Financial postings
    • Customer-facing contractual commitments
  • Human-on-the-loop (real-time monitoring with escalation)
    Medium-risk activities such as:
    • Internal reporting
    • Pricing recommendations
    • Procurement assessments
  • Fully automated with post-hoc review
    Low-risk operational tasks.

In this model it is critical that escalation triggers are explicit:

  • If output confidence drops below threshold → escalate.
  • If financial exposure exceeds X → require approval.
  • If an agent deviates from historical behaviour patterns → flag for review.

The risk-tiered supervision model ensures that governance scales with impact, without paralysing automation.
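The risk-tiered model can be expressed as a simple routing function. The action names, exposure limit and confidence threshold below are illustrative assumptions; the key design choice is that escalation triggers override the static tier, so large exposure or low confidence always forces pre-approval.

```python
from enum import Enum

class Oversight(Enum):
    HUMAN_IN_THE_LOOP = "pre-approval required"
    HUMAN_ON_THE_LOOP = "real-time monitoring with escalation"
    POST_HOC_REVIEW = "fully automated with post-hoc review"

# Illustrative static tiering of action types, following the model above.
TIER = {
    "regulatory_filing": Oversight.HUMAN_IN_THE_LOOP,
    "financial_posting": Oversight.HUMAN_IN_THE_LOOP,
    "pricing_recommendation": Oversight.HUMAN_ON_THE_LOOP,
    "internal_report": Oversight.HUMAN_ON_THE_LOOP,
}

def required_oversight(action: str,
                       financial_exposure: float,
                       confidence: float,
                       exposure_limit: float = 50_000.0,
                       min_confidence: float = 0.8) -> Oversight:
    """Route an agent action to the right supervision mode.
    Escalation triggers take precedence over the static tier."""
    if financial_exposure > exposure_limit or confidence < min_confidence:
        return Oversight.HUMAN_IN_THE_LOOP
    # Unknown low-risk operational actions default to post-hoc review.
    return TIER.get(action, Oversight.POST_HOC_REVIEW)
```

A pricing recommendation with small exposure runs under human-on-the-loop monitoring, while the same action with six-figure exposure, or with confidence below the threshold, escalates to pre-approval.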

Practical first steps for organisations

Organisations beginning to deploy multi-agent AI do not need to redesign their entire control framework at once. Practical first steps can significantly increase governance maturity:

  • Map all AI agents currently active in production environments.
  • Assign a named business owner for each agent.
  • Review whether agent access rights reflect least privilege principles.
  • Select one high-impact use case and introduce behavioural monitoring.
  • Integrate multi-agent interaction testing into the next release cycle.

Governance maturity can evolve incrementally; it does not require a full redesign on day one. Understanding the current maturity stage helps prioritise control investments and avoid over- or under-engineering governance measures. The following maturity stages can be defined and applied in doing so:

  1. Ad-hoc agent deployment – limited visibility and local experimentation.
  2. Controlled agent expansion – formalised identity management and basic monitoring.
  3. Governed multi-agent ecosystem – system-level testing, behavioural KPIs and risk-tiered oversight embedded in standard governance processes.

In most organisations, this evolution does not require entirely new governance structures. Existing bodies—such as change advisory boards, risk committees, data governance councils, and internal audit—can extend their mandate to include oversight of agentic AI systems.

A practical governance test is simple: if an AI agent were to make a materially wrong decision tomorrow, is it immediately clear who in the organisation is accountable? If ownership is ambiguous, the control framework is not yet complete.

Robust governance should not be seen purely as a defensive measure. Clear guardrails, behavioural monitoring and defined oversight structures reduce incident risk, increase control and confidence, and accelerate regulatory acceptance. In practice, mature control environments often enable faster scaling of AI initiatives rather than slowing them down.

Conclusion

Multi-agent AI does not make traditional ITGCs obsolete. Access management, change management and operational controls remain essential foundations. However, when AI agents interact and influence each other within critical business processes, risk increasingly arises from dynamic behaviour and system-level effects rather than isolated control failures.

Organisations must therefore augment—not replace—their ITGC frameworks. By formalising agent governance, introducing behavioural monitoring, strengthening system-level testing and applying proportionate oversight, they can align autonomy with accountability. The challenge is not whether AI agents will collaborate, but whether governance will evolve fast enough to ensure that collaboration remains lawful, ethical, robust, and value-generating.

Contact us

Mona de Boer

Partner, PwC Netherlands

Tel: +31 (0)61 088 18 59