Free benchmark tool for AI agents. Get a score from 1-100, detailed recommendations, and compare on the public leaderboard. Works with LangChain, AutoGPT, CrewAI, Claude Code, or any framework.
Your data never stored • No account needed • Results in 60 seconds
$ sudodog-benchmark
[SUDODOG ASCII art banner]
Scanning for AI agents...
Found: langchain (PID 12847)
Monitoring agent behavior (30s)...
[████████████████████████████] 100%
Analyzing with AI...
═══════════════════════════════════════
BENCHMARK RESULTS
═══════════════════════════════════════
Score: 78/100 (B+)
+ Good error handling
+ Efficient API usage
- High memory usage detected
- Consider adding rate limiting
Full Report: https://sudodog.com/report/abc123
Leaderboard: https://sudodog.com/leaderboard
Three simple steps. No account required. Free forever.
One command. Works on Linux, macOS, and Windows.
pip install sudodog
Start your agent, then run the benchmark. It auto-detects running agents.
sudodog-benchmark
AI-powered analysis gives you a score, grade, and actionable recommendations.
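Put together, a first run looks roughly like this (a minimal sketch; your_agent.py stands in for however you normally launch your agent):

# Terminal 1: start your agent as usual (any framework)
python your_agent.py

# Terminal 2: install the benchmark and score the running agent
pip install sudodog
sudodog-benchmark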
Get a score from 1-100 with a letter grade (A+ to F). Know exactly where your agent stands.
See what's working well and what needs improvement. Shareable link for your team.
AI-powered analysis provides specific, actionable fixes you can implement today.
Compare your agent against others. See top performers by framework. Compete for the top spot.
Embed a badge in your GitHub README showing your agent's score. Build trust with users.
Linux, macOS, Windows. LangChain, AutoGPT, CrewAI, Claude Code, or custom agents.
Benchmark for free. Upgrade when you need continuous monitoring.
Free Forever: Test and score your AI agents
Free Beta: Continuous monitoring for teams
Install in seconds. Run your first benchmark in under a minute.
Install via pip (all platforms):
# Install from PyPI (Linux, macOS, Windows)
pip install sudodog
# Start your AI agent first, then run:
sudodog-benchmark
# The tool will:
# 1. Detect running AI agents on your machine
# 2. Let you select which one to benchmark
# 3. Monitor it for 30 seconds
# 4. Give you a score and recommendations
# Scan for AI agents running on your machine
sudodog-scan
# Run an agent with continuous monitoring (connects to dashboard)
sudodog run python your_agent.py
# Integrate with Claude Code
sudodog integrate claude-code
Full Documentation • View on GitHub • MIT License
The benchmark gives you a snapshot. The dashboard gives you real-time visibility across all your agents, all your platforms, with security alerts and cost tracking.
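A rough sketch of the difference, using only the commands shown in the quick start above:

# Snapshot: score an agent that's already running
sudodog-benchmark

# Continuous monitoring: launch your agent under sudodog so it reports to the dashboard
sudodog run python your_agent.py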
Try Dashboard Free