How Ready Is Your
Website for AI Agents?
Submit your site and our experts will analyze your AI agent readiness, then walk you through personalized recommendations on a free call — no strings attached.
Get Your Free Analysis
Our experts will analyze your AI agent readiness and walk you through personalized recommendations — completely free.
$ npx webmcp-cli audit yoursite.com

Scoring Framework
Five Dimensions. One Score.
Your Agent Readiness Score is a weighted composite of five categories — each mapped directly to the WebMCP protocol specification. No black boxes. Every point measures something real.
Implementation
What We Check
- Presence of navigator.modelContext.registerTool() calls or declarative HTML form attributes (toolname, tooldescription)
- inputSchema completeness: every parameter has a type, description, and appropriate constraints
- Description field quality: specific action verbs, positive framing, clear scope
- Required vs. optional parameter classification
- provideContext() usage for dynamic tool availability in SPAs
- Feature detection guards: if ("modelContext" in window.navigator)
Business Impact
If this score is low, AI agents literally cannot see or use your tools. It's like having a store with no door.
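The Implementation checks above can be sketched as a single guarded registration. The registerTool() call, inputSchema fields, and feature-detection guard follow the checklist; the tool name, the pattern constraint, and the stub execute body are hypothetical examples, not spec requirements:

```javascript
// Sketch of an imperative WebMCP tool registration with a complete schema.
const trackOrderTool = {
  name: "track-order", // specific action verb, per the naming guidance
  description: "Looks up the current shipping status of an order by its order number.",
  inputSchema: {
    type: "object",
    properties: {
      orderNumber: {
        type: "string",
        description: "The order number from the confirmation email, e.g. 'A-10293'",
        pattern: "^[A-Z]-\\d+$", // constraint: assumed format, for illustration only
      },
    },
    required: ["orderNumber"],
  },
  async execute({ orderNumber }) {
    // Placeholder lookup; a real tool would call your order API here.
    return { content: [{ type: "text", text: `Order ${orderNumber}: in transit` }] };
  },
};

// Feature-detection guard: safe in browsers without WebMCP (and outside browsers).
if (typeof navigator !== "undefined" && "modelContext" in navigator) {
  navigator.modelContext.registerTool(trackOrderTool);
}
```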
Prompt Coverage
What We Check
- 50–200 natural language prompts generated per tool, covering formal, casual, multilingual, and ambiguous phrasings
- Each prompt tested against Gemini, GPT, and Claude simultaneously via function-calling
- Tool routing accuracy: did the model pick the correct tool?
- Parameter extraction accuracy: did it fill the right values?
- Hallucination rate: did the model invent parameters not in the prompt?
- Disambiguation: when tools have similar descriptions, did the model get confused?
Business Impact
This is the difference between "my tools exist" and "my tools actually get used." A site with perfect Implementation but 40% Prompt Coverage means agents find your tools but fail to use them correctly 60% of the time.
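The three accuracy metrics above can be tallied from per-prompt results. A minimal sketch, assuming a hypothetical result-record shape (the field names are illustrative, not the product's actual report format):

```javascript
// Compute routing accuracy, parameter-extraction accuracy, and hallucination
// rate from an array of per-prompt test results.
function coverageMetrics(results) {
  const total = results.length;
  // Routing: did the model pick the expected tool?
  const routed = results.filter((r) => r.selectedTool === r.expectedTool).length;
  // Extraction: right tool AND right parameter values.
  const filled = results.filter(
    (r) => r.selectedTool === r.expectedTool && r.paramsCorrect
  ).length;
  // Hallucination: the model invented parameters not present in the prompt.
  const hallucinated = results.filter((r) => r.inventedParams).length;
  return {
    routingAccuracy: routed / total,
    extractionAccuracy: filled / total,
    hallucinationRate: hallucinated / total,
  };
}
```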
Security
What We Check
- Tool poisoning / injection surface: scans descriptions for hidden instructions, Unicode tricks, and unusually long descriptions
- Over-parameterization: classifies parameters by sensitivity and flags unnecessary High/Critical parameters
- Misrepresentation of intent: detects gaps between tool description and execute function behavior
- Missing requestUserInteraction(): flags destructive or financial tools without human approval
- Output injection risk: checks if execute functions return unsanitized user-generated content
Business Impact
A single security vulnerability can expose customer data, enable unauthorized purchases, or allow prompt injection attacks. For regulated industries, this component determines whether compliance teams approve WebMCP adoption.
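The missing-approval check above (destructive tools without human confirmation) can be addressed with a wrapper. In this sketch, the spec's requestUserInteraction() is modeled by an injected requestApproval callback, since its exact signature isn't shown on this page; the { content, isError } result shape is likewise an assumption:

```javascript
// Gate a destructive tool's execute behind explicit human approval.
function withApproval(execute, requestApproval) {
  return async (args) => {
    const approved = await requestApproval(
      `Confirm this action for order ${args.orderId}?`
    );
    if (!approved) {
      // Structured refusal, not a thrown exception.
      return {
        content: [{ type: "text", text: "Action declined by the user." }],
        isError: true,
      };
    }
    return execute(args);
  };
}
```

The same wrapper can be reused for any financial or destructive tool, keeping the approval step out of each tool's business logic.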
Reliability
What We Check
- Error handling: execute functions return structured error content, not unhandled exceptions
- Execution timing: P50/P95/P99 response times. Tools >3 seconds degrade agent experience
- SubmitEvent.agentInvoked handling: forms detect agent-initiated submissions correctly
- CSS pseudo-class support: :tool-form-active and :tool-submit-active for visual feedback
- toolactivated / toolcancel event handling for lifecycle management
- Graceful degradation: tools function when WebMCP is unavailable
Business Impact
Reliability failures are invisible until they happen in production. A 5% failure rate across thousands of daily agent interactions means hundreds of broken experiences.
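The first reliability check, returning structured error content instead of throwing, can be sketched like this. The fetcher is injected to keep the example self-contained, and the { content, isError } result shape is an assumption based on the checklist above:

```javascript
// Build an execute function that never throws: every failure path returns
// structured error content an agent can read and recover from.
function makeSearchExecute(fetchResults) {
  return async ({ query }) => {
    try {
      const data = await fetchResults(query);
      return { content: [{ type: "text", text: JSON.stringify(data) }] };
    } catch (err) {
      // Structured error instead of an unhandled exception.
      return {
        content: [{ type: "text", text: `Search unavailable: ${err.message}` }],
        isError: true,
      };
    }
  };
}
```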
Best Practices
What We Check
- Naming conventions: specific action verbs (search-flights, book-hotel) not vague ones (handle-request)
- Execution vs. initiation clarity: description states whether tool acts immediately or starts a process
- Atomic tool design: each tool does one thing. Complex operations composed from multiple tools
- Schema design: tools accept raw user input (city names, not airport codes)
- Annotation completeness: readOnlyHint, destructiveHint, idempotentHint, openWorldHint
- Context management: provideContext() used on state transitions, stale tools cleaned up
Business Impact
Best practices don’t affect whether your tools work today — they affect whether they work well across different AI models, whether they’re maintainable as your site evolves, and whether agents can compose them into multi-step workflows.
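A tool definition that satisfies the checklist above might look like this sketch. The annotation names (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) come from the checklist; the surrounding object shape and field values are illustrative assumptions:

```javascript
// An atomic, well-annotated search tool: one job, raw user input, full hints.
const searchHotelsTool = {
  name: "search-hotels", // specific action verb, not "handle-request"
  description:
    "Searches available hotels in a city for the given dates. Returns results only; it does not book.",
  inputSchema: {
    type: "object",
    properties: {
      city: { type: "string", description: "City name as the user typed it, e.g. 'Paris'" },
      checkIn: { type: "string", description: "Check-in date, ISO 8601 (YYYY-MM-DD)" },
      checkOut: { type: "string", description: "Check-out date, ISO 8601 (YYYY-MM-DD)" },
    },
    required: ["city", "checkIn"],
  },
  annotations: {
    readOnlyHint: true, // searching never mutates state
    destructiveHint: false,
    idempotentHint: true, // same query, same results
    openWorldHint: true, // results come from an external inventory
  },
};
```

Booking would be a separate tool, so complex trips compose from atomic steps rather than one overloaded "handle-request" endpoint.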
Illustrative example — scores vary by site
Each component is scored independently on a 0–100 scale, then combined using the weights above to produce your composite Agent Readiness Score.
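As a sketch of that combination step: a weighted sum over the five 0-100 component scores. Only Implementation's 25% weight is published on this page; the other four weights here are illustrative placeholders, not the product's actual values:

```javascript
// Hypothetical weights. Implementation's 0.25 matches the methodology table;
// the remaining four are placeholder assumptions that sum to 1.0.
const WEIGHTS = {
  implementation: 0.25,
  promptCoverage: 0.25,
  security: 0.2,
  reliability: 0.15,
  bestPractices: 0.15,
};

// Composite Agent Readiness Score: weighted sum of the five components.
function compositeScore(components) {
  return Math.round(
    Object.entries(WEIGHTS).reduce((sum, [key, w]) => sum + w * components[key], 0)
  );
}
```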
Score Interpretation
What Your Score Means
Your score isn’t just a number — it predicts how AI agents will behave on your site. Lower scores mean agents struggle, fail, or choose your competitors instead.
What It Means
No meaningful WebMCP implementation. Either no registerTool() calls or declarative form attributes were detected, or critical schema errors prevent any tool from being usable.
What Agents Do
Agents bypass your site entirely. When a user asks their AI assistant to interact with your service, the agent either scrapes your UI (unreliable, slow) or routes to a competitor that has proper tools.
What You Should Do
Install the Web-MCP CLI (npx webmcp-cli audit) and follow the Quick Start guide. Most sites jump to 40+ within a day of adding basic tool registrations.
What It Means
Tools exist but have significant gaps. Common issues: missing parameter descriptions, vague tool names, incomplete schemas, no error handling in execute functions.
What Agents Do
Agents find your tools but frequently fail. They select the wrong tool, fill parameters incorrectly, or encounter errors mid-task. Users retry manually.
What You Should Do
Focus on the recommendations in your score report. Most issues are fixable in hours: add descriptions, specify parameter types, use explicit naming. Run npx webmcp-cli lint for instant wins.
What It Means
Solid foundation. Tools work for common use cases. Gaps appear in edge cases: multilingual prompts, ambiguous requests, adversarial inputs. Security may have unreviewed exposure.
What Agents Do
Agents succeed on straightforward tasks but fail on complex or unusual requests. You’re functional but not optimized — agents may prefer a competitor’s tools when both are available.
What You Should Do
Run multi-model prompt coverage testing to find specific phrasings that fail. A/B test your tool descriptions. Review security scan findings. You’re ahead of ~60% of sites.
What It Means
Well-implemented across all five dimensions. High prompt coverage across models, solid security posture, reliable execution. Top quartile.
What Agents Do
Agents reliably complete tasks on your site. In competitive scenarios, you win most of the time. Multi-step workflows succeed consistently.
What You Should Do
Use competitive benchmarking to track your position vs. specific competitors. Set up continuous monitoring to catch regressions. Display your Agent Readiness Badge.
What It Means
Top-tier implementation. Agents prefer your tools over alternatives. Comprehensive coverage across all models, languages, and edge cases. Robust security. Exemplary spec compliance.
What Agents Do
Agents actively prefer your site. When presented with similar tools from multiple sources, agents select yours due to superior descriptions, schema quality, and reliability history.
What You Should Do
You’re setting the standard for your industry. Publish your score. Get Gold Certified. Monitor for regressions with CI/CD integration.
Your score updates every time you scan. Consistent implementation work typically leads to meaningful improvement over time.
Two Tiers
Free Instant Audit. Or the Full Picture.
Quick Score tells you where you stand. Deep Score tells you exactly why — and precisely how to win.
Quick Score
~60 seconds
Static analysis only. Loads your page in a headless browser, detects the WebMCP API, parses tool registrations, validates schemas, runs security heuristics, and checks spec compliance. No LLM calls.
- Full Implementation analysis
- Estimated Prompt Coverage from heuristics
- Static injection surface scan
- Inferred Reliability from code patterns
- Full Best Practices check
- General improvement guidance
- Your score vs. industry average
- Single page scan
Deep Score
3–8 minutes
Everything in Quick Score plus: generates 50–200 natural language prompts per tool, tests each across Gemini, GPT, and Claude simultaneously, measures routing accuracy and hallucination rates, and executes tools in a sandboxed browser.
- Full Implementation analysis
- Actual LLM testing across 3 models
- Full security scan + dynamic analysis
- Actual tool execution in sandbox
- Full spec check + behavioral verification
- Code-level fixes with before/after examples
- Industry percentile + competitor comparison
- Multi-page crawl across your site
- Full history with trend analysis and alerts
87% of users who run a Quick Score come back for the Deep Score within a week.
Full Transparency
How We Calculate Your Score
No black box. Every check maps to the WebMCP specification. Here’s exactly what we measure, how we measure it, and why it matters.
Implementation
25% of total score

| Check | What We Check | Impact | Tier |
|---|---|---|---|
| IMPL-01 | WebMCP API presence | Critical — gating check | Free |
| IMPL-02 | Tool registration method | High — at least one method required | Free |
| IMPL-03 | inputSchema completeness | High — per-parameter scoring | Free |
| IMPL-04 | Description quality | Medium — impacts Prompt Coverage heavily | Free |
| IMPL-05 | Parameter descriptions | Medium | Free |
| IMPL-06 | provideContext() usage | Medium — critical for SPAs | Free |
| IMPL-07 | Feature detection | Low — critical for production | Free |
Spec references link to the WebMCP protocol documentation when publicly available.
Projected Benchmarks
Your Score in Context
Scores shown are projected benchmarks based on our scoring methodology applied to representative sites in each industry. Actual industry averages will update as more sites are scored.
Why Tool Quality Matters
When an AI agent has access to tools from multiple sites simultaneously — your flight search and a competitor’s — it doesn’t flip a coin. It evaluates tool quality: description clarity, schema completeness, parameter precision, and historical reliability.
Better-defined tools are more likely to be selected by AI agents.
In preliminary testing, LLMs consistently prefer tools with clearer descriptions and better parameter naming. Higher-scored implementations correlate with more reliable agent interactions.
Benchmarks are illustrative projections, not aggregate data from scored sites.
Continuous Monitoring
Track Every Point of Progress
Your score isn’t a one-time snapshot. It’s a living metric that updates as you implement changes, and alerts you the moment something regresses.
Score Trend (8 Weeks)
Milestones
Automatic Rescanning
Score recalculated weekly (Pro) or on-demand. Tracks every change.
Regression Detection
If your score drops even 1 point, you get an alert explaining what changed.
Milestone Tracking
Every score change logged with the cause and exact point impact.
Component Trends
Five independent trend lines show which areas are improving or stagnating.
Your score updates every time you scan. Track improvements as you implement changes and optimize tool definitions.
Trend data shown above is an illustrative example, not real aggregate data.
Actionable Fixes
Your Score Comes With a Roadmap
Every point deducted maps to a specific issue with a specific fix. We don’t just tell you what’s wrong — we tell you exactly how to make it right.
Public Proof
Certify Your Agent Readiness
Display your score on your website. Show visitors, partners, and AI agents that your site is built for the agentic web.
<!-- Web-MCP Certified Pro Badge -->
<a href="https://web-mcp.net/score/yoursite.com"
target="_blank" rel="noopener">
<img src="https://badge.web-mcp.net/yoursite.com"
alt="Agent Readiness Score: Certified Pro"
width="200" height="40" />
</a>

For Your Site
Signals to partners and customers that you're prepared for the AI agent era. Differentiates you from competitors.
For Agencies
Every client site with a badge is a portfolio piece. Certification becomes a tangible deliverable you can offer clients.
For the Ecosystem
Public scores create accountability and transparency. Certified sites can be listed in the Discovery Hub.
Developer-First
Your Score, Wherever You Work
Terminal. Browser. CI pipeline. REST API. The Agent Readiness Score integrates into every developer workflow.
$ npx webmcp-cli audit https://yoursite.com

Web-MCP.net v1.0.0 — Agent Readiness Score
Scanning https://yoursite.com...
✓ WebMCP API detected (navigator.modelContext)
✓ Found 7 registered tools (5 imperative, 2 declarative)
✓ Validating schemas against JSON Schema draft-07...
✓ Running security analysis...
✓ Checking best practices compliance...

┌───────────────────────────────────────────┐
│ AGENT READINESS SCORE: 73 / 100           │
│ ══════════════════════════════            │
│ Implementation:  85  ████████░░ (+12)     │
│ Prompt Coverage: 72* ███████░░░ (est)     │
│ Security:        68  ██████░░░░           │
│ Reliability:     71  ███████░░░           │
│ Best Practices:  61  ██████░░░░           │
│                                           │
│ Industry: E-Commerce • Top 22%            │
│ 3 Critical issues found                   │
└───────────────────────────────────────────┘
Install globally or use npx. Get your score from the terminal in seconds.
Capabilities
- npx webmcp-cli audit — Full Quick Score
- npx webmcp-cli lint — Schema linting with auto-fix
- npx webmcp-cli audit --deep — Deep Score with LLM testing
- npx webmcp-cli compare <url> — Side-by-side comparison
- npx webmcp-cli ci --min-score 75 — CI/CD mode
All interfaces share the same scoring engine. Your score is consistent whether you check from the terminal, browser, or API.
For You
Built for Every Team That Touches the Web
For Developers & Engineering Teams
Your Question
“What exactly does the score measure, and can I trust it?”
The Answer
Every check maps to the WebMCP specification. We validate your registerTool() calls, test your inputSchema against JSON Schema draft-07, scan for injection vulnerabilities documented in the WebMCP Security spec, and test prompt routing across Gemini, GPT, and Claude.
Your Workflow
1. npx webmcp-cli audit https://yoursite.com
2. npx webmcp-cli lint
3. npx webmcp-cli ci --min-score 75
Features That Matter
- Full methodology transparency
- CLI + API + CI/CD integration
- Code-level recommendations with before/after
- Schema linting with auto-fix suggestions
For CTOs, VPs & Executives
Your Question
“How do we compare to competitors, and what’s the business case?”
The Answer
Your Agent Readiness Score determines whether AI agents succeed on your site — or route to competitors. We benchmark you against your industry vertical and show your percentile ranking. Higher scores mean more agent traffic and more conversions.
Your Workflow
1. Run a Quick Score on your site and top 3 competitors
2. Review industry benchmark report
3. Present prioritized roadmap to engineering team
Features That Matter
- Industry benchmarks & competitive ranking
- Score trends with regression alerts
- Revenue attribution in Analytics module
- Exportable reports for board presentations
For Agencies & Consultants
Your Question
“Can I use this to win and deliver client engagements?”
The Answer
The Agent Readiness Score is your client deliverable. "Before Web-MCP: 18. After: 83." That screenshot goes on the invoice.
Your Workflow
1. Scan client site → Show them their score
2. Present recommendations → Scope the engagement
3. Implement fixes → Rescan → Show improvement
4. Install badge → Set up monitoring → Retainer
Features That Matter
- Score as sales tool (scan prospects for free)
- Exportable PDF reports (white-label ready)
- Badge on every client site (portfolio + backlinks)
- Bulk scanning across client portfolio
WebMCP Is Live in Chrome 146.
See Where Your Site Stands.
Get a free expert analysis of your site's agent readiness and actionable recommendations to improve.
Get Your Free Analysis
Submit your site and book a free walkthrough call with our team.
$ npx webmcp-cli audit yoursite.com

Install the browser extension. See scores as you browse.
Install Extension →