AI security | June 2026

OpenClaw AgentShield: Apache-2.0 AI Agent Security Audit Toolkit.

A defensive scanner and policy engine for teams that connect AI agents to files, tools, memory, documents, APIs, and the workflows built on them.


Release note

Why AgentShield was built.

AI agents are increasingly connected to operational files, memory stores, browser tools, SaaS accounts, GitHub repositories, email, databases, and internal APIs. That creates a new class of risk: untrusted content can attempt to steer an otherwise legitimate agent toward unsafe tool use, data exposure, poisoned memory, or unauthorized workflow changes.

OpenClaw AgentShield is Zeberike Group's defensive answer to that problem: a local-first audit toolkit for scanning prompts, documents, logs, memory writes, MCP/tool manifests, and agent context before those inputs are trusted.

Public release

Apache-2.0 repository for developer review.

The public repository includes the scanner, policy engine, examples, JSON and Markdown report outputs, buyer-facing documentation, security disclaimer, and product cover image. It is released under Apache-2.0 so developers can inspect, adapt, and integrate the core defensive toolkit.


What it reviews

AgentShield focuses on practical AI-agent risk surfaces.

The toolkit is defensive-only. It helps teams review untrusted content and policy decisions before those inputs reach privileged agent execution.

Prompt and document risk

Flags injection, override attempts, hidden instructions, malicious tool requests, and risky document-ingestion patterns.

Tool and MCP governance

Reviews high-impact capabilities such as shell, filesystem, browser, network, email, GitHub, database, and payment access.
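To make the idea of capability review concrete, here is a minimal sketch that lists the high-impact capabilities a tool manifest requests. The manifest shape (a "tools" list whose entries carry "name" and "capabilities") and the function name are assumptions made for illustration. They are not AgentShield's actual schema or API, and they are not the MCP wire format.

```python
# Capability categories taken from the article's list of high-impact access.
HIGH_IMPACT = {
    "shell", "filesystem", "browser", "network",
    "email", "github", "database", "payment",
}


def high_impact_findings(manifest: dict) -> list[tuple[str, str]]:
    """Return sorted (tool_name, capability) pairs that deserve review.

    The manifest layout used here is hypothetical, not a real schema.
    """
    findings = []
    for tool in manifest.get("tools", []):
        for capability in tool.get("capabilities", []):
            normalized = str(capability).lower()
            if normalized in HIGH_IMPACT:
                findings.append((tool.get("name", "<unnamed>"), normalized))
    return sorted(findings)
```

A reviewer could run a check like this before granting a connector its permissions and require sign-off whenever the result is non-empty.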

Memory and data-egress control

Identifies memory poisoning, sensitive-data movement, approval fatigue, autonomy risk, and weak containment boundaries.

Developer proof

Local CLI audit output for a blocked input.

AgentShield produces structured decisions such as allow, quarantine, approval_required, and block. Reports can be saved as JSON and Markdown for CI gates, release review, or internal audit evidence.

$ python3 -m agentshield.cli examples/risky-exfiltration.txt --pretty
{
  "decision": "block",
  "risk_score": 95,
  "summary": {
    "data_exfiltration": 1
  }
}
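A report like the one above could drive a CI gate. The sketch below assumes only the "decision" field shown in the sample output. The function name and the fail-closed exit-code mapping are illustrative choices, not part of the toolkit.

```python
import json


def ci_exit_code(report_text: str) -> int:
    """Map a JSON report to a CI exit code: 0 passes, 1 fails.

    Fails closed: unparseable reports, missing fields, and any decision
    other than "allow" all fail the build.
    """
    try:
        report = json.loads(report_text)
    except json.JSONDecodeError:
        return 1
    if not isinstance(report, dict):
        return 1
    return 0 if report.get("decision") == "allow" else 1
```

A wrapper script could pass the CLI's output to this function and call sys.exit with the result. Treating quarantine and approval_required as failures is the conservative default here; a team might instead route those decisions to a manual review step.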

Integration pattern

Designed to sit before privileged agent context.

Scan prompts, PDFs, webpages, emails, issue comments, logs, and retrieved documents before ingestion.
Review memory writes before they become persistent agent memory.
Check MCP/tool manifests before connectors receive sensitive permissions.
Save JSON and Markdown reports for compliance, incident review, and release gates.
Use human approval gates for high-impact actions instead of promising bulletproof protection, which no tool can deliver.
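The pattern above can be sketched as a small gate that sits between untrusted content and the agent's context. The scanner is injected as a plain callable so the example stays self-contained; in practice it would wrap whatever scanning entry point a team uses. The class name, the queue names, and the handling of each decision are assumptions for illustration, not AgentShield's behavior.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class IngestionGate:
    # Hypothetical scanner: takes text, returns a report dict with "decision".
    scan: Callable[[str], dict]
    context: list = field(default_factory=list)           # trusted agent context
    quarantined: list = field(default_factory=list)       # held for later analysis
    pending_approval: list = field(default_factory=list)  # waits for a human

    def ingest(self, text: str) -> str:
        """Route text by scan decision and return the decision applied."""
        decision = self.scan(text).get("decision", "block")
        if decision == "allow":
            self.context.append(text)
        elif decision == "quarantine":
            self.quarantined.append(text)
        elif decision == "approval_required":
            self.pending_approval.append(text)
        else:
            # Unknown or missing decisions are treated as block (fail closed).
            decision = "block"
        return decision
```

Only content with an allow decision reaches the context list. Everything else is either parked for review or dropped, so a scanner failure cannot silently widen what the agent sees.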

Security maturity

Policy validation, not false certainty.

AgentShield is not marketed as an unbreakable shield or a guarantee against every zero-day, prompt injection, model failure, or malicious actor. Its value is layered risk reduction: isolation, least privilege, scanning, policy enforcement, audit logs, and human review for actions that matter.

This is the posture Zeberike Group believes serious AI-agent deployments need: practical governance before autonomy, evidence before trust, and clear boundaries before tools.