Technical Whitepaper · May 2026

Governed Agent Tool Audit

A StreamKernel evidence run for normalizing, enriching, governing, routing, and auditing agent tool-call events before they affect enterprise systems.
Run: streamkernel_agent_tool_audit_5m
Status: Completed
Runtime: Java 21 · CPU JAR
Evidence: 22 May 2026
Agentic AI Governance · In-Process ONNX · Policy + Risk Labels · Kafka Evidence Topic · Synthetic Safe Demo
Policy Decision Coverage
ALLOW: permitted operational path
AUDIT_ONLY: inspect and retain evidence
ALERT: after-hours sensitive access
RESTRICT: constrained write path
BLOCK: disallowed tool action
ESCALATE: human review route
25,488 governed records

Written to streamkernel-agent-tool-audit-governed with policy, risk, audit, route, model, and provenance metadata.

01

Executive Summary

Governed Agent Tool Audit demonstrates StreamKernel as an event-boundary control layer for agentic systems. The pipeline takes synthetic agent tool-call events, normalizes them into a stable event envelope, runs in-process ONNX embedding, applies deterministic OPA-style governance and risk rules, stamps provenance, and writes governed audit records to Kafka.

This is not an agent framework and does not replace LangChain, LangGraph, CrewAI, AutoGen, custom planners, memory stores, model gateways, identity systems, or human approval workflows. It sits beside them as a governed event boundary: every agent action becomes an inspectable record with policy decision, risk score, reason codes, routing label, audit record, model labels, and StreamKernel provenance.

Core thesis: enterprise AI adoption needs more than agent planning. It needs a durable governance boundary around what agents read, write, execute, escalate, and route.

Governed records written: 25,488
Actual run duration: 5.21 minutes
Policy decisions observed: 6
Kafka partitions: 12
Pipeline parallelism: 4
Pipeline batch size: 16
Dropped / DLQ / denied counters: 0
ONNX Runtime execution: CPU
02

The Business Problem

Agent frameworks typically focus on planning, memory, tool invocation, and orchestration. Enterprise teams also need to answer different questions: Was the tool call allowed? Was the target resource sensitive? Was the action high-risk? Was a human review required? Which policy version made the decision? Did the event carry provenance downstream?

Without a governed event boundary, each agent application tends to implement its own logging, policy, routing, and audit conventions. That creates inconsistent evidence and makes it harder for platform, security, risk, and compliance teams to reason about agent behavior across frameworks.

Agent teams

Keep their planner and tool architecture, while sending tool-call events into a normalized governance stream.

Security teams

See blocked, alerted, escalated, restricted, and audit-only actions in an inspectable stream.

Risk teams

Get risk labels, reason codes, resource sensitivity, trust-tier context, and review routes.

Compliance teams

Receive an audit record for every governed agent decision, with model and policy labels attached.

03

Runtime Architecture

The run exercises a single StreamKernel pipeline profile. It uses a synthetic agent tool-call source, a stable event envelope transformer, in-process ONNX embedding, an agent tool audit transformer, and Kafka delivery to a governed evidence topic.

Figure 1 — Governed agent tool-call pipeline
[Figure: pipeline stages in order: SYNTHETIC (AGENT_TOOL_CALLS) -> STRING_TO_WIREEVENT -> DJL_EMBEDDING (ONNX / CPU) -> AGENT_TOOL_AUDIT -> KAFKA (12 partitions). Evidence carried with the event: agent.policy.decision · agent.risk.level · agent.risk.score · reason_codes · audit_record · streamkernel.provenance.*]
Layer | Component | Run value
Source | SYNTHETIC | Generates agent tool events without customer systems or real logs.
Normalize | STRING_TO_WIREEVENT | Creates stable key, byte payload, and source metadata.
Model execution | DJL_EMBEDDING | Runs ONNX inference in-process before governance enrichment.
Governance | AGENT_TOOL_AUDIT | Adds policy decisions, risk labels, audit records, route labels, escalation details, and DLQ details.
Delivery | KAFKA | Writes inspectable records to streamkernel-agent-tool-audit-governed.
Evidence | streamkernel.provenance.enabled=true | Stamps config, transform, model, and run identifiers.
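
The stage order in the table above can be read as a plain function chain. The sketch below is a hypothetical Python illustration, not StreamKernel's implementation: the function names, the derived key, the toy "embedding", and the fixed AUDIT_ONLY decision are all assumptions, used only to show how each stage adds to the record before delivery.

```python
import hashlib
import json

def string_to_wire_event(raw: str, source: str) -> dict:
    """Normalize a raw string into an envelope: stable key, byte payload, source metadata."""
    payload = raw.encode("utf-8")
    key = hashlib.sha256(payload).hexdigest()[:16]  # hypothetical way to derive a stable key
    return {"key": key, "payload": payload, "headers": {"source": source}}

def embed(event: dict) -> dict:
    """Stand-in for the ONNX embedding stage: a toy 4-value vector, not a real model."""
    digest = hashlib.sha256(event["payload"]).digest()
    event["embedding"] = [b / 255 for b in digest[:4]]
    return event

def audit(event: dict) -> dict:
    """Stand-in for governance enrichment: always stamps AUDIT_ONLY in this sketch."""
    event["headers"]["agent.policy.decision"] = "AUDIT_ONLY"
    return event

def run_chain(raw: str, source: str = "SYNTHETIC") -> dict:
    """Apply normalize -> embed -> audit in the same order as the transform chain."""
    event = string_to_wire_event(raw, source)
    for stage in (embed, audit):
        event = stage(event)
    return event

record = run_chain(json.dumps({"tool": "retrieve_account_activity", "action": "read"}))
print(record["key"], record["headers"])
```

Each stage receives and returns the same envelope, so a sink only has to understand one record shape regardless of how many transforms ran before it.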
04

Run Configuration

The profile used parallelism 4, batch size 16, a fixed executor, disabled latency sampling, disabled cache, and Kafka sink delivery. It used MiniLM ONNX artifacts through DJL/ONNX Runtime on CPU, with a predictor pool of 4 and batch size 16.

pipeline.id=sk-agent-tool-audit
pipeline.parallelism=4
pipeline.batch.size=16
transform.chain=STRING_TO_WIREEVENT,DJL_EMBEDDING,AGENT_TOOL_AUDIT
sink.type=KAFKA
sink.kafka.topic=streamkernel-agent-tool-audit-governed
source.synthetic.text.profile=AGENT_TOOL_CALLS
ai.embedding.engine=OnnxRuntime
ai.embedding.pool.size=4
ai.embedding.batching.max.size=16
security.type=PERMIT_ALL

Security boundary: this is a reproducible local evidence profile. It uses PERMIT_ALL and PLAINTEXT Kafka in the run; production profiles should replace those with enterprise security controls.
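
A profile like this can be checked for local-only settings before it is reused elsewhere. The following is a minimal sketch, assuming the profile is available as flat key=value text; `parse_properties` and `local_only_warnings` are hypothetical helpers and not part of StreamKernel. The property keys are the ones shown in the profile above.

```python
PROFILE = """\
pipeline.id=sk-agent-tool-audit
pipeline.parallelism=4
pipeline.batch.size=16
sink.type=KAFKA
security.type=PERMIT_ALL
"""

def parse_properties(text: str) -> dict:
    """Parse flat key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

def local_only_warnings(props: dict) -> list:
    """Flag settings the whitepaper describes as acceptable only for local evidence runs."""
    warnings = []
    if props.get("security.type") == "PERMIT_ALL":
        warnings.append("security.type=PERMIT_ALL: replace for production profiles")
    return warnings

props = parse_properties(PROFILE)
for warning in local_only_warnings(props):
    print("WARN:", warning)
```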

05

Evidence Baseline

The pre-run script confirmed Kafka reachability, absence of the target topic before the run, presence of the pipeline profile, benchmark matrix row, ONNX model, tokenizer, and the expected governed-agent JSON/header contract. The post-run script confirmed the topic existed, counted records by partition, validated required headers and JSON fields, and sampled governed output records.
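
The per-partition count in the post-run step can be pictured with a few lines of Python. This is a sketch under stated assumptions, not the actual post-run script: it takes already-consumed records as hypothetical (partition, offset) pairs and uses a small synthetic sample rather than the real topic.

```python
from collections import Counter

def summarize_partitions(records, expected_partitions):
    """Count records per partition and list partitions that received nothing.

    records: iterable of (partition, offset) tuples, as a consumer might report them.
    """
    counts = Counter(partition for partition, _ in records)
    empty = [p for p in range(expected_partitions) if counts[p] == 0]
    return {
        "total": sum(counts.values()),
        "per_partition": dict(counts),
        "empty_partitions": empty,
    }

# Synthetic stand-in: 24 records spread round-robin over 12 partitions.
sample = [(i % 12, i // 12) for i in range(24)]
summary = summarize_partitions(sample, expected_partitions=12)
print(summary["total"], summary["empty_partitions"])
```

Comparing `total` against the pipeline's own "out" counter is one way to confirm that what the runtime reported is what the topic actually holds.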

Evidence item | Observed value
Test name | streamkernel_agent_tool_audit_5m
Run ID | run-agent-tool-audit-01
Status | COMPLETED
Actual duration | 5.21 minutes
Kafka topic | streamkernel-agent-tool-audit-governed
Total records written | 25,488
Kafka partitions | 12
Policy decision coverage | ALLOW, AUDIT_ONLY, ALERT, RESTRICT, BLOCK, ESCALATE
Metrics snapshot | Prometheus captured on port 8080
Runtime artifact | CPU all-in-one JAR
Figure 2 — Evidence trail: before, run, after
PRE-RUN STATE: Kafka reachable · assets found · contract explicit before run
PIPELINE RUN: ONNX embed · audit transform · 300s benchmark auto-stop
POST-RUN RESULTS: 25,488 records written · headers + JSON contract verified
06

Policy Decision Coverage

The post-run evidence confirmed all six policy outcomes. This is the key buyer-facing point: the pipeline does not merely log agent actions; it classifies and routes them according to policy and risk.

Decision | Representative meaning | Route / outcome
ALLOW | Approved operational action. | Operational route.
AUDIT_ONLY | Low-risk read-only action retained for audit and analytics. | analytics_sink and audit_sink.
ALERT | Sensitive after-hours access requiring alerting. | alert_topic plus audit route.
RESTRICT | Privileged write action permitted only with restrictions. | restricted_operational_sink plus audit route.
BLOCK | Disallowed or destructive tool action. | Blocked route or DLQ-style path.
ESCALATE | High-impact or low-confidence action requiring review. | review_queue plus audit route.

Engineering takeaway: the decision is carried both in the JSON payload and in Kafka headers, allowing downstream consumers, SIEM pipelines, review queues, and analytics sinks to route without reparsing the entire event body.
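
Header-only routing might look like the sketch below. It assumes headers arrive as a list of (name, bytes) pairs, which is how common Python Kafka clients expose them, and that values are UTF-8 text; both are assumptions about the consumer side, and `header_value` and `route_for` are hypothetical helpers. The header names are the ones listed in this whitepaper.

```python
def header_value(headers, name):
    """Return the decoded value of the first header with this name, or None."""
    for key, value in headers or []:
        if key == name and value is not None:
            return value.decode("utf-8")
    return None

def route_for(headers, default="audit_sink"):
    """Pick a route from the agent.route header without touching the record body."""
    return header_value(headers, "agent.route") or default

headers = [
    ("agent.policy.decision", b"ALERT"),
    ("agent.route", b"alert_topic"),
    ("agent.risk.score", b"0.63"),
]
print(header_value(headers, "agent.policy.decision"), route_for(headers))
```

Falling back to an audit route when the header is missing is a design choice made for this sketch; a stricter consumer could reject such records instead.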

07

Governed Output Contract

Each emitted record carries a normalized agent event, policy decision, risk metadata, audit record, route labels, and StreamKernel provenance. The post-run script verified required JSON fields and Kafka headers.

Payload contract (JSON)

event_id, agent_id, session_id, user_id, tool, action, resource_type, policy_decision, risk_level, risk_score, reason_codes, risk_reasons, audit_record, normalized_context, target_routes.

Kafka headers

agent.policy.decision, agent.route, agent.risk.level, agent.risk.score, agent.reason_codes, agent.policy.version, agent.model.version, agent.tool, agent.action, agent.audit.required, and streamkernel.provenance.*.

{
  "event_type": "agent_tool_call_audit",
  "tool": "retrieve_account_activity",
  "action": "read",
  "policy_decision": "ALERT",
  "route": "alert_topic",
  "risk_level": "medium",
  "risk_score": 0.63,
  "reason_codes": ["AFTER_HOURS_SENSITIVE_ACTION"],
  "audit_required": true,
  "safe_synthetic_demo": true
}
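
A contract check of the kind described above could be sketched as follows. The field and header names come from the contract in this section; the `contract_violations` helper, the placeholder values, and the specific provenance header name `streamkernel.provenance.run_id` are hypothetical and only illustrate the shape of the check.

```python
import json

REQUIRED_FIELDS = [
    "event_id", "agent_id", "session_id", "user_id", "tool", "action",
    "resource_type", "policy_decision", "risk_level", "risk_score",
    "reason_codes", "risk_reasons", "audit_record", "normalized_context",
    "target_routes",
]
REQUIRED_HEADERS = [
    "agent.policy.decision", "agent.route", "agent.risk.level",
    "agent.risk.score", "agent.reason_codes", "agent.policy.version",
    "agent.model.version", "agent.tool", "agent.action", "agent.audit.required",
]
PROVENANCE_PREFIX = "streamkernel.provenance."

def contract_violations(value_bytes, headers):
    """Return a list of missing payload fields and headers for one record."""
    payload = json.loads(value_bytes)
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in payload]
    names = {name for name, _ in headers}
    problems += [f"missing header: {h}" for h in REQUIRED_HEADERS if h not in names]
    if not any(n.startswith(PROVENANCE_PREFIX) for n in names):
        problems.append("missing header: streamkernel.provenance.*")
    return problems

# Placeholder record that satisfies the contract by construction.
value = json.dumps({field: None for field in REQUIRED_FIELDS}).encode("utf-8")
headers = [(h, b"x") for h in REQUIRED_HEADERS]
headers.append(("streamkernel.provenance.run_id", b"run-agent-tool-audit-01"))

print(contract_violations(value, headers))      # complete record
print(contract_violations(value, headers[1:]))  # first required header dropped
```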
08

Performance Envelope

The run processed and emitted 25,488 records with in-process ONNX embedding, deterministic agent-tool governance, Kafka delivery, and provenance stamping. The observed pipeline total maps to approximately 85 records/sec over the 300-second benchmark window, or approximately 81.5 records/sec when measured against the full 5.21-minute process window including lifecycle overhead.

Benchmark-window throughput: ~85/s
Embedding batches: 1,629
Final batch fill metric: 100%
Max observed GC pause: 36.3 ms
Metric | Observed value
Pipeline in / processed / out | 25,488 / 25,488 / 25,488
Dropped / DLQ / denied counters | 0 / 0 / 0
Kafka sent OK | 25,488
Embedding records | 25,488
Embedding batches | 1,629
Average speedometer EPS | 84.8 records/sec across 60 five-second windows
Observed speedometer range | 38.4 to 163.2 records/sec
Kafka request latency average | ~11.38 ms
Kafka queue time average | ~12.59 ms
G1 heap after GC | ~4.1% long-lived heap usage

Interpretation: this run is evidence of governed agent-record processing with local CPU inference. It should not be presented as a maximum throughput benchmark; the profile intentionally performs ONNX embedding and governance enrichment per event.
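
The throughput figures quoted in this section follow directly from the reported totals. The short calculation below reproduces them. The last two values (average records per embedding batch and the implied average fill against a batch size of 16) are derived here for illustration only; the run did not report them, and they are not the same thing as the final batch fill metric.

```python
records = 25_488
benchmark_window_s = 300        # benchmark auto-stop window
process_window_s = 5.21 * 60    # full process duration, about 312.6 s
embedding_batches = 1_629
max_batch_size = 16

benchmark_eps = records / benchmark_window_s
process_eps = records / process_window_s
avg_batch = records / embedding_batches
avg_fill_pct = 100 * avg_batch / max_batch_size

print(round(benchmark_eps, 1))  # 85.0 records/sec over the benchmark window
print(round(process_eps, 1))    # 81.5 records/sec over the full process window
print(round(avg_batch, 2))      # 15.65 records per batch (derived)
print(round(avg_fill_pct, 1))   # 97.8 percent average fill (derived)
```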

09

Security and Deployment Posture

The run is intentionally safe as a synthetic local demo, but not hardened as a production security profile. It uses PERMIT_ALL, PLAINTEXT Kafka, and Prometheus bound broadly without auth/TLS, which the runtime warns about. Those settings are acceptable for a local evidence run and should be replaced for enterprise demonstrations.

Use synthetic agent tool-call events; no customer logs or real agent data required.
Preserve policy, model, transform, config, source, sink, and run provenance.
Replace PERMIT_ALL with OPA or customer SecurityProvider for production profiles.
Use TLS/SASL/mTLS or secured Kafka profile instead of PLAINTEXT Kafka.
Protect Prometheus with loopback bind, auth, TLS, or network policy.
Route physical fan-out to audit, analytics, alert, review, operational, and DLQ sinks in the next increment.
10

Why It Resonates With Architects and Engineers

The architecture is not trying to replace the agent stack. It gives enterprise teams a consistent event boundary for governing tool calls across agent frameworks. That distinction matters because buyers can adopt it without rewriting their planners, chains, tools, or memory systems.

Audience | Value
AI platform teams | Add a reusable governance boundary without rewriting agent apps.
Security teams | Inspect blocked, alerted, restricted, and escalated tool calls in Kafka.
Compliance teams | Receive an audit record for every governed tool decision.
Enterprise architects | Externalize policy and provenance from individual agent frameworks.
Risk teams | Route high-impact or low-confidence actions to review before execution.
11

Current Limits

The current runtime writes a single governed evidence topic. Records carry route labels such as operational_sink, analytics_sink, alert_topic, review_queue, restricted_operational_sink, and dlq. Physical conditional fan-out to separate operational, audit, alert, review, analytics, and DLQ sinks is the next demo increment.
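
Until physical fan-out exists, a downstream consumer can approximate it from the route labels on each record. The sketch below groups event ids by label into in-memory lists. It assumes `target_routes` is a list of label strings and sends unrecognized labels to `dlq`; both are assumptions for illustration, and `fan_out` is a hypothetical helper rather than a StreamKernel API.

```python
from collections import defaultdict

KNOWN_ROUTES = {
    "operational_sink", "analytics_sink", "audit_sink", "alert_topic",
    "review_queue", "restricted_operational_sink", "dlq",
}

def fan_out(records):
    """Group event ids by route label; unknown labels are diverted to 'dlq' (assumption)."""
    sinks = defaultdict(list)
    for record in records:
        for route in record.get("target_routes", []):
            target = route if route in KNOWN_ROUTES else "dlq"
            sinks[target].append(record["event_id"])
    return dict(sinks)

batch = [
    {"event_id": "e1", "target_routes": ["analytics_sink", "audit_sink"]},
    {"event_id": "e2", "target_routes": ["alert_topic", "audit_sink"]},
    {"event_id": "e3", "target_routes": ["review_queue", "audit_sink"]},
]
for sink, event_ids in sorted(fan_out(batch).items()):
    print(sink, event_ids)
```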

The governance rules in this public run are deterministic OPA-style rules. That makes the demo reproducible and safe, but production deployments should connect live OPA/Rego policy, customer-owned risk models, enterprise identity, approval workflows, and SIEM routing as needed.
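
To make "deterministic rules" concrete, here is a minimal first-match rule sketch in Python. It is not the rule set used in the run: the input flags, the 0.5 confidence threshold, the rule order, and every reason code except AFTER_HOURS_SENSITIVE_ACTION are hypothetical. The six outcomes and their meanings follow the policy decision table in this whitepaper.

```python
def decide(event: dict):
    """Return (decision, reason_codes). Rules are checked in order; the first match wins."""
    if event.get("destructive"):
        return "BLOCK", ["DESTRUCTIVE_TOOL_ACTION"]
    if event.get("high_impact") or event.get("confidence", 1.0) < 0.5:
        return "ESCALATE", ["HIGH_IMPACT_OR_LOW_CONFIDENCE"]
    if event.get("sensitive") and event.get("after_hours"):
        return "ALERT", ["AFTER_HOURS_SENSITIVE_ACTION"]
    if event.get("action") == "write" and event.get("privileged"):
        return "RESTRICT", ["PRIVILEGED_WRITE"]
    if event.get("action") == "read":
        return "AUDIT_ONLY", ["LOW_RISK_READ"]
    return "ALLOW", []

examples = [
    {"action": "delete", "destructive": True},
    {"action": "write", "confidence": 0.3},
    {"action": "read", "sensitive": True, "after_hours": True},
    {"action": "write", "privileged": True},
    {"action": "read"},
    {"action": "write"},
]
for example in examples:
    print(decide(example))
```

Because the function depends only on its input, the same event always yields the same decision and reason codes, which is the property that makes a run like this reproducible.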

Careful claim boundary: this run proves a governed event contract and evidence topic. It does not prove production multi-tenant isolation, remote security hardening, physical multi-sink fan-out, or customer-specific policy correctness.

12

Conclusion

The governed agent tool-audit run shows StreamKernel operating as a policy-aware event boundary for agentic systems. The pipeline normalized synthetic agent tool calls, ran in-process ONNX embedding, enriched events with deterministic governance and risk metadata, stamped provenance, and wrote 25,488 governed records to Kafka.

For architects and engineers, the important result is not only the record count. The important result is the contract: every tool action becomes a durable event with a policy decision, risk label, reason codes, route, audit record, model version, policy version, and provenance headers. That is the missing operational layer many agent stacks still leave to each application team to invent.

Final takeaway: StreamKernel makes agent behavior governable at the event boundary, without forcing enterprises to standardize on one agent framework.