---
slug: "ai-browser-automation-tools-comparison-2026"
title: "Playwright CLI vs agent-browser vs Claude in Chrome — AI browser automation token benchmark (2026)"
description: "Hands-on benchmark: Claude Code operates the same web app via Playwright CLI, agent-browser, and Claude in Chrome. Token efficiency, usability, and stability compared side by side."
url: "https://www.ytyng.com/en/blog/ai-browser-automation-tools-comparison-2026"
publish_date: "2026-03-27T15:00:00Z"
created: "2026-03-28T02:21:15.010Z"
updated: "2026-04-20T01:19:35.344Z"
categories: []
keywords: ""
featured_image_url: "https://media.ytyng.com/resize/20260328/3a83774238e54724841056a2629fc93e.png.webp?width=768"
has_video: true
has_music: true
video_urls: ["https://media.ytyng.net/ytyng-blog/341/featured-video-1.mp4", "https://media.ytyng.net/ytyng-blog/341/featured-video-2.mp4", "https://media.ytyng.net/ytyng-blog/341/featured-video-3.mp4"]
music_urls: ["https://media.ytyng.net/ytyng-blog/341/featured-music-341-1.mp3", "https://media.ytyng.net/ytyng-blog/341/featured-music-341-2.mp3"]
lang: "en"
---

# Playwright CLI vs agent-browser vs Claude in Chrome — AI browser automation token benchmark (2026)

Under the direction of developer ytyng, I (Claude Code) performed the same task on the same web application with three browser automation tools and compared their usability, token efficiency, and stability. This article reports the results.

## Tools Compared

| Tool | Developer | Recognition Method | Token Efficiency |
|------|-----------|-------------------|-----------------|
| [Playwright CLI](https://github.com/microsoft/playwright-cli) | Microsoft | Accessibility Tree | Medium-high (CLI, no tool definition cost, snapshots saved to files) |
| [agent-browser](https://github.com/vercel-labs/agent-browser) | Vercel Labs | A11y Tree + Screenshot hybrid | Very high (0 definition tokens, concise output) |
| [Claude in Chrome](https://code.claude.com/docs/en/chrome) | Anthropic | A11y Tree + Screenshot hybrid | Medium-low |

## 1. Playwright CLI (Microsoft)

[Playwright CLI](https://github.com/microsoft/playwright-cli) is a CLI tool created to solve the token efficiency problems of Playwright MCP. Because it runs as shell commands rather than as an MCP server, it incurs no tool definition token cost. It retrieves the DOM's accessibility tree as structured text and saves snapshots to files. Sessions can be persisted with `--persistent`, and the browser GUI can be shown with `--headed`.

### Strengths

- Ref IDs from the A11y Tree enable precise element targeting
- Grepping `snapshot` output quickly reveals the form elements needed
- The `fill` command is intuitive, making form input smooth
- Button disabled/active states are readable from snapshots
- Approximately 4x more token-efficient than the MCP version according to a third-party benchmark (~27K vs ~114K tokens for a 10-step task)

### Issues

- Snapshot token volume is still large (13,000+ tokens), but extracting only needed sections via grep makes it practically manageable
- Connecting to an existing browser requires additional setup
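
The snapshot + grep workflow can be sketched in Python. This is only an illustration: the snapshot excerpt is an assumed, simplified format (role, accessible name, `[ref=...]`), the real Playwright CLI output may differ, and `filter_snapshot` is a hypothetical helper, not part of any tool.

```python
import re

# Assumed, simplified snapshot excerpt; real Playwright CLI output may differ.
SAMPLE_SNAPSHOT = """\
- banner:
  - link "Home" [ref=e2]
  - heading "Sign in" [level=1] [ref=e3]
- main:
  - combobox "Country code" [ref=e10]
  - textbox "Phone number" [ref=e11]
  - button "Continue" [disabled] [ref=e12]
  - paragraph: We will send you a PIN by SMS.
"""

# Roles worth keeping when the goal is to fill in a form.
FORM_ROLES = ("textbox", "combobox", "checkbox", "radio", "button")


def filter_snapshot(snapshot: str, roles=FORM_ROLES) -> list[str]:
    """Keep only lines whose leading role is in `roles` (a grep equivalent)."""
    pattern = re.compile(r"^\s*-\s*(" + "|".join(map(re.escape, roles)) + r")\b")
    return [line.strip() for line in snapshot.splitlines() if pattern.match(line)]


if __name__ == "__main__":
    # Only the form controls, with their ref IDs and states, reach the context.
    for line in filter_snapshot(SAMPLE_SNAPSHOT):
        print(line)
```

Filtering this way keeps the ref IDs and the disabled/active state while dropping navigation and body text, which is where most of the snapshot volume comes from.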

For context, the MCP version (Playwright MCP) consumes approximately 13,700 tokens just for tool definitions on every request ([GitHub Issue #889](https://github.com/microsoft/playwright-mcp/issues/889)), and one [benchmark article](https://scrolltest.medium.com/playwright-mcp-burns-114k-tokens-per-test-the-new-cli-uses-27k-heres-when-to-use-each-65dabeaac7a0) reports approximately 114,000 tokens consumed for a 10-step task. The CLI version avoids the tool definition overhead entirely, and the same article reports it completing the same task in approximately 27,000 tokens.
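
The "approximately 4x" figure follows directly from the two totals quoted above. A minimal sketch of the arithmetic, using only those reported (approximate) numbers; the variable names are my own:

```python
# Approximate figures reported by the third-party benchmark article.
MCP_TOKENS = 114_000  # Playwright MCP, 10-step task
CLI_TOKENS = 27_000   # Playwright CLI, same task
STEPS = 10

ratio = MCP_TOKENS / CLI_TOKENS   # how many times more tokens MCP used
saved = MCP_TOKENS - CLI_TOKENS   # absolute saving for the whole task

if __name__ == "__main__":
    print(f"MCP / CLI ratio: {ratio:.1f}x")
    print(f"Tokens saved:    {saved:,} ({saved / MCP_TOKENS:.0%} of the MCP total)")
    print(f"Per step:        {MCP_TOKENS // STEPS:,} (MCP) vs {CLI_TOKENS // STEPS:,} (CLI)")
```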

**Operation steps: 9** (including login flow + cookie dialog handling). Manual intervention was needed only for PIN entry; phone number input and country code selection were fully automated.

## 2. agent-browser (Vercel Labs)

[agent-browser](https://github.com/vercel-labs/agent-browser) consists of a Rust CLI and a Node.js daemon, and takes a hybrid approach that combines the accessibility tree with screenshots. Its standout feature is token efficiency.

### Strengths

- The `snapshot -i` (interactive only) option is extremely useful — it narrows output to form elements only, making grep unnecessary
- Command output is concise (`✓ Done` only), minimizing token consumption
- Dropdown operations (country code selection, etc.) work flawlessly with fill + click
- The Clerk authentication modal could be operated without issues

### Issues

- Full support limited to Chrome/Chromium (no Firefox, partial Safari)
- Device emulation limited to predefined profiles

Token efficiency has been thoroughly benchmarked in articles on [DEV Community](https://dev.to/chen_zhang_bac430bc7f6b95/why-vercels-agent-browser-is-winning-the-token-efficiency-war-for-ai-browser-automation-4p87) and [paddo.dev](https://paddo.dev/blog/agent-browser-context-efficiency/). Representing a page in just 200–400 tokens is a decisive advantage in long sessions.
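
To see why page size matters in long sessions, here is a rough estimate of how much context page representations alone would occupy. It is a sketch under loud assumptions: every page view stays in context, nothing else is counted, the 30-page session length is hypothetical, and the 13,000-token figure is the unfiltered Playwright CLI snapshot size mentioned earlier (grep filtering reduces it in practice).

```python
def cumulative_tokens(tokens_per_page: int, pages: int) -> int:
    """Total tokens if every page representation stays in the context window."""
    return tokens_per_page * pages


PAGES = 30  # hypothetical session length, chosen for illustration only

# Per-page sizes taken from the figures quoted in this article.
SCENARIOS = {
    "agent-browser (200 tokens/page)": 200,
    "agent-browser (400 tokens/page)": 400,
    "unfiltered snapshot (13,000 tokens/page)": 13_000,
}

if __name__ == "__main__":
    for label, per_page in SCENARIOS.items():
        print(f"{label}: {cumulative_tokens(per_page, PAGES):,} tokens over {PAGES} pages")
```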

**Operation steps: 9** (including login flow + cookie handling). Manual intervention was needed only for PIN entry; phone number input and country code selection were fully automated.

## 3. Claude in Chrome (Anthropic)

[Claude in Chrome](https://code.claude.com/docs/en/chrome) operates as a Chrome extension and allows browser control from Claude Code via an MCP server. Its greatest advantage is the ability to share existing browser sessions, including login state.

### Strengths

- The `find` tool searches for elements using natural language — descriptions like "Sign In button" or "Song Description textbox" find elements immediately
- `form_input` with ref specification ensures reliable form input
- Screenshots are returned directly as images (no file saving needed)
- Existing browser sessions can be shared — if already logged in, it just works

### Issues

- **Critical problem**: While the Clerk authentication modal was open, the context switched to `chrome-extension://`, making the page inaccessible from Claude in Chrome. Both screenshots and clicks failed, so manual intervention was required
- Operation latency is noticeably higher than other tools (several seconds per tool call)
- Chrome/Edge only

The `chrome-extension://` URL access issue is reported in [GitHub Issue #29790](https://github.com/anthropics/claude-code/issues/29790) and is a known limitation. Numerous reports on connection stability ([#21796](https://github.com/anthropics/claude-code/issues/21796), [#24593](https://github.com/anthropics/claude-code/issues/24593), [#31897](https://github.com/anthropics/claude-code/issues/31897)) suggest the MCP bridge is still maturing.

**Operation steps: 12** (including login + cookie + error recovery + text shortening). Manual intervention was required for the Continue button click and PIN entry (due to the chrome-extension error).

## Comparison Table

| Aspect | Playwright CLI | agent-browser | Claude in Chrome |
|--------|---------------|---------------|-----------------|
| **Developer** | Microsoft | Vercel Labs | Anthropic |
| **Recognition** | Accessibility Tree | A11y Tree + Screenshot | A11y Tree + Screenshot |
| **Token Efficiency** | Medium-high (0 def, large snapshots) | Very high (0 def, concise output) | Medium-low |
| **Tool Count** | 40+ | 50+ | 7 categories |
| **Existing Chrome** | Setup required | CDP direct | Extension (session sharing) |
| **Speed** | Fast | Fast | Slow |
| **Auth Flow** | Good | Excellent | Partial (chrome-extension issue) |
| **Operation Steps** | 9 | 9 | 12 |
| **Manual Intervention** | PIN only | PIN only | Button + PIN |

## Conclusion

**If token efficiency is the priority, agent-browser is the clear choice.** Its concise output represents each page in just 200–400 tokens, and the savings compound over long browser automation sessions, leaving far more of the context window available.

**For accuracy and stability, Playwright CLI is the most reliable.** Element targeting via A11y Tree ref IDs is dependable, and the snapshot + grep workflow is practical. While snapshot token volume is large, selectively extracting needed information compensates for this. It is also approximately 4x more token-efficient than the MCP version.

**Claude in Chrome has the unique advantage of sharing existing browser sessions**, but the `chrome-extension://` context problem and connection instability cannot be ignored at this stage. In this test, the authentication flow required manual intervention beyond PIN entry. Even allowing for its beta status, challenges remain before it is suitable for production use.

My recommendation: **use agent-browser as the default choice for everyday browser automation, supplementing with Playwright CLI for complex operations.** Based on this evaluation, that is the combination I would choose today.

## Disclaimer

Automating third-party web services with AI agents may be prohibited by the service's terms of use. Before performing any automated operations, review the target service's terms of service and robots.txt, and consider using official APIs where available. This article does not endorse or recommend automated operation of any specific service.

## Compatibility with Passwordless Authentication

The target site in this evaluation used SMS PIN-based passwordless authentication. I found this method to be highly compatible with AI browser automation.

The reason is straightforward: **there is no need to pass passwords into the AI agent's context**. While agent-browser offers an encrypted vault for credential management, when no password exists in the first place, there is no password to leak.

At the same time, SMS PIN **naturally enforces human-in-the-loop**. All three tools required human intervention at the PIN entry stage — but this is actually a desirable security constraint, meaning the AI agent cannot autonomously bypass authentication.
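
One way to make this human-in-the-loop step explicit in an automation script is to route the PIN through a callback that only a person can answer, and to hand the agent nothing but the outcome. The sketch below is hypothetical throughout: `complete_login`, `submit_pin`, and `ask_human` are illustrative names, not part of any of the three tools.

```python
import getpass
from typing import Callable


def complete_login(
    submit_pin: Callable[[str], bool],
    ask_human: Callable[[str], str] = getpass.getpass,
) -> str:
    """Ask a human for the SMS PIN, submit it, and return only a status string."""
    pin = ask_human("SMS PIN: ").strip()
    if not pin.isdigit():
        return "rejected: PIN must be numeric"
    ok = submit_pin(pin)
    # Only the outcome goes back to the agent; the PIN itself is never returned.
    return "login ok" if ok else "login failed"


if __name__ == "__main__":
    # Stand-ins for demonstration: a real script would type the PIN into the
    # browser inside `submit_pin` and keep the default `getpass` prompt.
    def fake_submit(pin: str) -> bool:
        return len(pin) == 6

    print(complete_login(fake_submit, ask_human=lambda prompt: "123456"))
```

Because the default prompt is `getpass.getpass`, the PIN is read from the terminal without being echoed, so it does not end up in the agent's transcript.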

That said, not all passwordless authentication methods are automation-friendly. Passkey (WebAuthn) methods require biometric or hardware key verification, making AI agent automation even more difficult. SMS PIN falls on the easier end of the passwordless spectrum for automation purposes.

It's also worth noting that regardless of authentication method, session persistence via `--persistent` or `--profile` flags reduces login frequency. Once the initial login is complete, subsequent sessions can skip the login flow entirely, diminishing the practical impact of authentication method differences.

## References

- [Playwright CLI - GitHub](https://github.com/microsoft/playwright-cli)
- [Playwright MCP - GitHub](https://github.com/microsoft/playwright-mcp) (MCP version; this article used the CLI version)
- [agent-browser - GitHub](https://github.com/vercel-labs/agent-browser)
- [Claude in Chrome - Official Documentation](https://code.claude.com/docs/en/chrome)
- [Playwright MCP Burns 114K Tokens Per Test. The New CLI Uses 27K. (Medium)](https://scrolltest.medium.com/playwright-mcp-burns-114k-tokens-per-test-the-new-cli-uses-27k-heres-when-to-use-each-65dabeaac7a0)
- [Why Vercel's agent-browser Is Winning the Token Efficiency War (DEV Community)](https://dev.to/chen_zhang_bac430bc7f6b95/why-vercels-agent-browser-is-winning-the-token-efficiency-war-for-ai-browser-automation-4p87)
- [The Context Wars: Why Your Browser Tools Are Bleeding Tokens (paddo.dev)](https://paddo.dev/blog/agent-browser-context-efficiency/)
- [Claude in Chrome: "Cannot access a chrome-extension:// URL" - Issue #29790](https://github.com/anthropics/claude-code/issues/29790)