Autonomous Test Intelligence

AutoPOM-Agent

AI spider for web discovery and multi-language Playwright Page Objects

AutoPOM-Agent self-navigates your application or attaches to your existing browser to map resilient selectors, infer semantic element names, and synthesize clean POM code (Java, JavaScript, or TypeScript) with built-in selector verification and session reuse.

End-to-End Automation Intelligence

From autonomous crawl to production-ready Playwright Page Objects

CRAWLER

Autonomous & Hybrid Discovery

Explores pages autonomously or attaches to existing browser sessions (CDP/Profiles) for manual/automated hybrid workflows.

SEMANTIC

DOM + Vision Mapping

Combines compact DOM context and visual hints for accurate semantic naming such as closeModalButton and signInButton.
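
As a rough illustration of the naming step, the sketch below derives a camelCase identifier from a visible label and an element role. The function name `semantic_name` and its inputs are hypothetical; AutoPOM-Agent's actual naming also draws on visual hints, which this sketch does not model.

```python
import re


def semantic_name(label: str, role: str) -> str:
    """Build a camelCase identifier from a visible label plus an element role.

    Hypothetical helper for illustration; not part of AutoPOM-Agent's API.
    """
    # Keep alphanumeric words only, then append the role as the final word.
    words = [w.lower() for w in re.findall(r"[A-Za-z0-9]+", label)] + [role.lower()]
    head, *tail = words
    return head + "".join(w.capitalize() for w in tail)


print(semantic_name("Sign In", "button"))
print(semantic_name("Close modal", "button"))
```

A label that starts with a digit would yield an invalid identifier here, so a real implementation would need an extra normalization rule.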

SELECTORS

Resilient Locator Strategy

Builds ranked fallback selectors and avoids unstable IDs/classes common in React and styled-component ecosystems.
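
One way to picture ranked fallbacks is the sketch below. The ranking order (test id, then role and accessible name, then a stable id, then text) and the `looks_generated` heuristic are assumptions made for illustration, not the tool's documented rules. The emitted strings use Playwright's selector syntax.

```python
import re


def looks_generated(token: str) -> bool:
    """Heuristic (assumed): flag ids/classes that look build-generated."""
    # Prefixes commonly emitted by styled-components ("sc-") and Emotion ("css-").
    if token.startswith(("sc-", "css-")):
        return True
    # Any 5+ character chunk mixing letters and digits, e.g. "a1b2c3".
    for chunk in re.split(r"[-_:]", token):
        if len(chunk) >= 5 and re.search(r"\d", chunk) and re.search(r"[A-Za-z]", chunk):
            return True
    return False


def rank_selectors(el: dict) -> list[str]:
    """Return candidate selectors, most stable first (order is an assumption)."""
    candidates = []
    if el.get("testid"):
        candidates.append(f'[data-testid="{el["testid"]}"]')
    if el.get("role") and el.get("name"):
        candidates.append(f'role={el["role"]}[name="{el["name"]}"]')
    if el.get("id") and not looks_generated(el["id"]):
        candidates.append(f'#{el["id"]}')
    if el.get("text"):
        candidates.append(f'text="{el["text"]}"')
    return candidates


element = {"id": "css-1a2b3c", "role": "button", "name": "Sign in", "text": "Sign in"}
print(rank_selectors(element))
```

In this example the generated-looking id is skipped, so the role-based selector ranks first and the text selector serves as the fallback.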

POM

POM Synthesis

Transforms structured page models into compile-ready Playwright Page Object classes in Java, JavaScript, or TypeScript.

AI

Agentic Loop

Observation -> Thought -> Action loop orchestrated by LangChain to make context-aware navigation and extraction decisions.
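
The loop can be sketched in plain Python, without LangChain, to show its shape. The three callables (`observe`, `think`, `act`), the `"stop"` sentinel, and the step limit are all hypothetical stand-ins; in the real tool the "think" step would be a model call.

```python
from typing import Callable


def run_agent(
    observe: Callable[[], str],
    think: Callable[[str, list], str],
    act: Callable[[str], None],
    max_steps: int = 10,
) -> list:
    """Minimal Observation -> Thought -> Action loop (illustrative only)."""
    history = []
    for _ in range(max_steps):
        observation = observe()               # Observation: read current state
        action = think(observation, history)  # Thought: decide the next action
        if action == "stop":
            break
        act(action)                           # Action: change the state
        history.append(action)
    return history


# Stubbed usage: navigate from "/" to "/done", then stop.
state = {"page": "/"}
actions = run_agent(
    observe=lambda: state["page"],
    think=lambda obs, hist: "stop" if obs == "/done" else "goto:/done",
    act=lambda a: state.update(page=a.split(":", 1)[1]),
)
print(actions)
```

The step limit is a simple guard so that a policy which never returns `"stop"` still terminates.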

HEALING

Self-Healing Verification

Tests generated selectors immediately, promotes reliable fallbacks, and re-scores confidence before persisting outputs.
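
A minimal sketch of the verify-and-promote step, assuming a selector counts as reliable when it matches exactly one element. The function name, the result shape, and the `1 / (1 + rank)` confidence formula are hypothetical; the source does not describe the actual scoring.

```python
from typing import Callable, Optional


def verify_and_promote(
    selectors: list[str],
    count_matches: Callable[[str], int],
) -> Optional[dict]:
    """Promote the first selector that matches exactly one element.

    `selectors` is ordered best-first; `count_matches` reports how many
    elements a selector currently matches on the page.
    """
    for rank, selector in enumerate(selectors):
        if count_matches(selector) == 1:
            return {
                "primary": selector,
                "fallbacks": [s for s in selectors if s != selector],
                # Assumed scoring: lower-ranked winners get lower confidence.
                "confidence": round(1.0 / (1 + rank), 2),
            }
    return None  # nothing verified; the caller should not persist this element


# Stubbed page: the preferred selector is broken, the fallback is unique.
fake_page = {'[data-testid="pay"]': 0, 'role=button[name="Pay"]': 1}
print(verify_and_promote(list(fake_page), lambda s: fake_page[s]))
```

Against a live page, `count_matches` could be `lambda s: page.locator(s).count()` using Playwright's Python sync API.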

NEXT GENERATION

Autonomous Test Asset Generation

AutoPOM-Agent turns application exploration into actionable test automation assets by combining browser actions, semantic reasoning, and immediate selector verification.

Semantic Understanding: Maps meaningful names from labels, context, and icon hints instead of raw DOM noise.
Deterministic Code Synthesis: Generates consistent POM classes in Java, JavaScript, or TypeScript from a stable JSON schema contract.
Verification Before Save: Selectors are tested and healed immediately to reduce flaky generated artifacts.

Layered Architecture

Clear separation between discovery, reasoning, verification, and code generation

Layer 4 - Multi-language POM Output
BasePage · Page Objects · Reports
Layer 3 - Synthesis & Healing
Schema Mapping · Template Rendering · Selector Verification
Layer 2 - Agentic Intelligence
Observe -> Think -> Act · State Graph · Semantic Extraction
Layer 1 - Browser Runtime
BrowserUse Context · Playwright Actions · DOM & Screenshot Capture

Built for Real-World Teams

Practical defaults and extension points for enterprise automation programs

🧭

State-Aware Crawling

Route + DOM fingerprint signatures prevent repeated traversal and infinite loops in dynamic SPAs.
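
A state signature of this kind might be computed as in the sketch below. The choice to ignore query strings and fragments, the list of interactive element descriptors as the DOM fingerprint, and the 16-character truncation are assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlsplit


def state_signature(url: str, interactive_elements: list[str]) -> str:
    """Hash the route plus an order-independent DOM fingerprint."""
    # Route: path only; query string and fragment are ignored (assumption).
    route = urlsplit(url).path.rstrip("/") or "/"
    # DOM fingerprint: sorted so that render order does not change the hash.
    dom_shape = "|".join(sorted(interactive_elements))
    return hashlib.sha256(f"{route}::{dom_shape}".encode()).hexdigest()[:16]


visited = set()


def should_visit(url: str, interactive_elements: list[str]) -> bool:
    """Return True only the first time a given state signature is seen."""
    signature = state_signature(url, interactive_elements)
    if signature in visited:
        return False
    visited.add(signature)
    return True


print(should_visit("https://example.com/cart/?utm=1", ["button#pay", "a#home"]))
print(should_visit("https://example.com/cart", ["a#home", "button#pay"]))
```

The second call is treated as a repeat of the first. Applications that route through the URL fragment would need the fragment included in the route.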

🧠

Token-Efficient AI

Only compact interactive context is sent to the model, reducing cost while preserving decision quality.
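
The idea can be sketched as a filter over captured elements. The set of interactive tags, the field names, and the 40-character text cap are assumptions; the source does not specify what the compact context contains.

```python
# Tags treated as interactive (assumed set, for illustration).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


def compact_context(elements: list[dict], max_text: int = 40) -> list[dict]:
    """Keep interactive elements only, with a few short descriptive fields."""
    compact = []
    for el in elements:
        if el.get("tag") not in INTERACTIVE_TAGS:
            continue  # drop layout and text containers
        entry = {"tag": el["tag"]}
        for key in ("role", "name", "text"):
            value = (el.get(key) or "").strip()
            if value:
                entry[key] = value[:max_text]  # cap long text to save tokens
        compact.append(entry)
    return compact


snapshot = [
    {"tag": "div", "text": "x" * 500, "class": "css-1a2b3c"},
    {"tag": "button", "text": "  Sign in  ", "class": "css-9z8y7x"},
]
print(compact_context(snapshot))
```

Style classes and non-interactive nodes never reach the model in this sketch, which is where most of the size reduction comes from.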

🧱

Schema-Driven Pipeline

A strict JSON contract decouples crawl logic from code generation for maintainable, testable architecture.
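
To make the decoupling concrete, the sketch below renders a TypeScript page object from a small page model. The model shape (`page`, `elements`, `name`, `selector`) and the function `render_typescript_pom` are hypothetical and stand in for the real schema and templates.

```python
import json


def render_typescript_pom(model: dict) -> str:
    """Render a Playwright page object class (TypeScript) from a page model."""
    lines = [
        "import { Page, Locator } from '@playwright/test';",
        "",
        f"export class {model['page']} {{",
    ]
    for el in model["elements"]:
        lines.append(f"  readonly {el['name']}: Locator;")
    lines.append("")
    lines.append("  constructor(readonly page: Page) {")
    for el in model["elements"]:
        # json.dumps produces an escaped, double-quoted literal valid in TypeScript.
        lines.append(f"    this.{el['name']} = page.locator({json.dumps(el['selector'])});")
    lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


page_model = {
    "page": "LoginPage",
    "elements": [
        {"name": "signInButton", "selector": 'role=button[name="Sign in"]'},
    ],
}
print(render_typescript_pom(page_model))
```

Because the renderer only reads the model, the crawler can change freely as long as it keeps emitting the same structure, and the renderer can be unit-tested with hand-written models.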

Language-Targeted Output

Generate Java, JavaScript, or TypeScript page objects with descriptive names, encapsulated locators, and intent-level methods.

🔐

Auth-Aware Discovery

Supports credentialed flows with environment-based secrets for deeper exploration of protected application areas.
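
A minimal sketch of reading credentials from the environment rather than from code or config files. The variable names `AUTOPOM_USERNAME` and `AUTOPOM_PASSWORD` are hypothetical; the names AutoPOM-Agent actually reads are not given here.

```python
import os
from typing import Mapping


def load_credentials(env: Mapping[str, str], prefix: str = "AUTOPOM_") -> dict:
    """Read login credentials from environment-style variables.

    The variable names are assumptions made for this example.
    """
    missing = [prefix + key for key in ("USERNAME", "PASSWORD") if not env.get(prefix + key)]
    if missing:
        # Fail early and name the variables, but never echo secret values.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"username": env[prefix + "USERNAME"], "password": env[prefix + "PASSWORD"]}


# Usage with an explicit mapping; pass os.environ in a real run.
demo_env = {"AUTOPOM_USERNAME": "qa-user", "AUTOPOM_PASSWORD": "not-a-real-secret"}
print(load_credentials(demo_env)["username"])
```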

📊

Actionable Reporting

Produces crawl summaries, selector confidence metrics, and generated artifacts for quick review.

Technology Stack

Composable tools optimized for quality, speed, and maintainability

Python: Orchestration Core
LangChain: Agent Loop
BrowserUse: Browser Agent
Playwright: Execution + Validation
OpenAI/Gemini: Vision + Reasoning
JSON Schema: Intermediate Contract
Generator Core: Language Rendering
JavaScript/TypeScript/Java: POM Output
Mermaid: Architecture Diagrams

Quick Start

Run your first crawl and generate language-specific page objects

# guided enterprise runner (recommended)
bash run.sh

# or run directly
PYTHONPATH=src python3 -m autopom.cli.main \
  --base-url "https://example.com" \
  --pom-language "typescript" \
  --browser-adapter "playwright"