2026-04-24 · 5 min read

Verification, not detection.

Inside the 4-tier trust ladder, the 3-axis classifier, and the policy engine that turns observation into action.

Most bot-detection tools try to answer one question: is this a bot? A yes/no. That question made sense when there was one kind of bot (scrapers) and one response (block). That era ended.

In 2026 your traffic includes:

Real humans in browsers, in AI-agent browsers, on mobile apps.
Search crawlers that index you — Googlebot, Bingbot, Applebot.
AI training crawlers that scrape you to feed their models — GPTBot, ClaudeBot, Bytespider, CCBot, Meta-ExternalAgent.
AI answer crawlers that cite you in real-time AI search — PerplexityBot, OAI-SearchBot.
AI user-agents that are humans in the loop, doing a task on your site — ChatGPT-User, Claude-User, Google-Agent.
Monitoring bots, social previews, feed readers, health checks.
Spoofers — anything that puts Googlebot/2.1 in its User-Agent and hopes.

The question isn't "bot or not." It's "who, why, and can we prove it?" Z-Gate treats every visitor as that question and answers it with three axes and four tiers of trust.

Three axes

Every AI actor gets classified on three independent dimensions:

Operator — OpenAI, Google, Anthropic, Meta, ByteDance, Amazon, Perplexity, Microsoft, Apple, CommonCrawl, DuckDuckGo, Yandex, Baidu, and the rest. 33 operators recognised today.
Purpose — train, index, answer, user-agent, monitor, preview.
Accountability — signed, ip-verified, ua-claimed, unverified.

Policy is per-cell: "allow Google index, block Meta train, challenge anonymous AI, allow Perplexity answer." The combinations you care about are yours to set.
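The per-cell idea can be sketched as a lookup keyed on operator and purpose. This is a minimal illustration, not Z-Gate's implementation: the type names, the `cellPolicy` map, and the `decide` function are assumptions, and the operator list is cut down for brevity.

```typescript
// Axis values come from the article; the operator list is shortened here.
type Operator = "openai" | "google" | "anthropic" | "meta" | "perplexity" | "unknown";
type Purpose = "train" | "index" | "answer" | "user-agent" | "monitor" | "preview";
type Accountability = "signed" | "ip-verified" | "ua-claimed" | "unverified";
type Action = "allow" | "challenge" | "block";

interface Classification {
  operator: Operator;
  purpose: Purpose;
  accountability: Accountability;
}

// Hypothetical policy table: one entry per operator/purpose cell you care about.
const cellPolicy = new Map<string, Action>([
  ["google:index", "allow"],
  ["meta:train", "block"],
  ["perplexity:answer", "allow"],
]);

// Look up the cell; use the caller's fallback when no cell has been set.
function decide(c: Classification, fallback: Action): Action {
  const cell = cellPolicy.get(`${c.operator}:${c.purpose}`);
  return cell !== undefined ? cell : fallback;
}
```

Passing `"challenge"` as the fallback reproduces the "challenge anonymous AI" part of the example policy for any cell that has no explicit entry.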

Four tiers of trust

Accountability collapses into a four-tier ladder, ordered by how much proof we have.

Tier 1 — Signed

The bot sends a cryptographic signature over the request. We fetch the operator's public-key directory at /.well-known/http-message-signatures-directory, verify with Ed25519, and attribute the request to that operator with confidence. This is HTTP Message Signatures (RFC 9421), used through the IETF Web Bot Auth profile that OpenAI, Cloudflare, and Perplexity already ship.

If the signature verifies, there's no ambiguity: this is really OpenAI.

Default policy: allow.
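The verification step might look roughly like this with Node's built-in crypto module. It is a simplified sketch, not Z-Gate's code: the signature base covers only `@authority`, the key pair is generated locally to stand in for an operator key that would really come from the key directory, and `signatureBase`, `verifyBotSignature`, and the demo parameter values are all assumptions.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Build a simplified RFC 9421 signature base: one line per covered component,
// then the "@signature-params" line. Real requests usually cover more components.
function signatureBase(authority: string, params: string): string {
  return `"@authority": ${authority}\n"@signature-params": ${params}`;
}

// Verify an Ed25519 signature over the base. Ed25519 takes no separate digest,
// so the algorithm argument is null.
function verifyBotSignature(base: string, signature: Buffer, publicKey: KeyObject): boolean {
  return verify(null, Buffer.from(base, "utf8"), publicKey, signature);
}

// Demo: a locally generated key pair plays the operator (hypothetical values).
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const params = '("@authority");created=1700000000;keyid="demo-key";tag="web-bot-auth"';
const signedBase = signatureBase("example.com", params);
const signature = sign(null, Buffer.from(signedBase, "utf8"), privateKey);

// The untouched request verifies; changing a covered component does not.
const genuine = verifyBotSignature(signedBase, signature, publicKey);
const tampered = verifyBotSignature(signatureBase("evil.example", params), signature, publicKey);
```

The point of the sketch is the failure mode: the signature binds to the covered request components, so replaying it against a different authority fails verification.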

Tier 2 — IP-verified

The bot didn't sign, but its User-Agent matches a known operator and its source IP is in that operator's published CIDR ranges. Googlebot from 66.249.64.10 clears Tier 2 because Google publishes its ranges. Bingbot from 40.77.x.x clears it the same way. The check is looser than a cryptographic one (Google could theoretically assign an IP outside its public list), but in practice it's robust, and it's how the old bot-verification world works.

Default policy: allow.
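The range check itself is plain CIDR matching. Here is a minimal IPv4-only sketch; the function names are assumptions, and the range used in the usage note is an illustrative stand-in rather than a copy of any operator's published list.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number; null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True when `ip` falls inside `cidr`, written as "network/prefix".
function inCidr(ip: string, cidr: string): boolean {
  const slash = cidr.indexOf("/");
  if (slash < 0) return false;
  const prefixText = cidr.slice(slash + 1);
  if (!/^\d{1,2}$/.test(prefixText)) return false;
  const prefix = Number(prefixText);
  if (prefix > 32) return false;
  const ipInt = ipv4ToInt(ip);
  const netInt = ipv4ToInt(cidr.slice(0, slash));
  if (ipInt === null || netInt === null) return false;
  // Drop the host bits from both addresses and compare what is left.
  const blockSize = 2 ** (32 - prefix);
  return Math.floor(ipInt / blockSize) === Math.floor(netInt / blockSize);
}

// True when `ip` is inside any of the operator's published ranges.
function matchesPublishedRanges(ip: string, ranges: string[]): boolean {
  return ranges.some((range) => inCidr(ip, range));
}
```

With an illustrative range of `66.249.64.0/19`, `matchesPublishedRanges("66.249.64.10", ["66.249.64.0/19"])` is true, while an address outside that block is not. A production check would also need IPv6 and a refresh schedule for the published lists.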

Tier 3 — UA-claimed

The bot's User-Agent matches a known operator, but that operator doesn't publish IP ranges we can check against. ClaudeBot is the classic example: Anthropic doesn't (yet) publish CIDRs, so all we have is the UA string. This isn't malicious — it's under-documented infrastructure. But we can't prove the claim.

Default policy: challenge.

Tier 4 — Unverified

The UA claims a known operator, but the source IP does not match that operator's published ranges. This is a spoof. Mozilla/5.0 (compatible; Googlebot/2.1) from a residential Argentinian ISP is not Googlebot. Someone is wearing a costume.

Default policy: block.
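Put together, the ladder reduces to a short decision function. This sketch assumes the User-Agent has already matched a known operator; the `Evidence` shape and `resolveTier` name are illustrative, not Z-Gate's API.

```typescript
type Tier = "signed" | "ip-verified" | "ua-claimed" | "unverified";

// Hypothetical evidence gathered for a request whose UA names a known operator.
interface Evidence {
  signatureValid: boolean;          // a request signature verified against the operator's key
  operatorPublishesRanges: boolean; // the claimed operator publishes CIDR ranges
  ipInPublishedRanges: boolean;     // the source IP fell inside those ranges
}

// Walk the ladder from strongest proof to weakest.
function resolveTier(e: Evidence): Tier {
  if (e.signatureValid) return "signed";                 // Tier 1
  if (!e.operatorPublishesRanges) return "ua-claimed";   // Tier 3: nothing to check against
  return e.ipInPublishedRanges ? "ip-verified" : "unverified"; // Tier 2 or Tier 4
}
```

Note the asymmetry between Tier 3 and Tier 4: a missing range list leaves the claim unproven, while a published list that the IP fails to match actively contradicts it.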

Why this matters

One number from the pipeline: in our own testing, 50% of the "GPTBot" traffic Z-Gate classified landed at Tier 4. The bot wrote "GPTBot" in the User-Agent. The source IP was not in OpenAI's published ranges. That half is not OpenAI — it's someone claiming to be.

If you take every GPTBot hit at face value and respect it the way you'd respect a real OpenAI crawler — honour its training opt-outs, apply the policies you set for OpenAI specifically — you're extending that courtesy to a population that's half impostors. The UA string is not an authentication system. It never was. The agentic web is the moment where pretending it is stops being tolerable.

From observation to action

Classification without enforcement is telemetry.

Every Z-Gate account gets four default rules on its first classified bot hit, seeded automatically:

Priority 900 · Tier 1 Signed       → allow
Priority 910 · Tier 2 IP-verified  → allow
Priority 920 · Tier 3 UA-claimed   → challenge
Priority 930 · Tier 4 Unverified   → block

All four ship in dry-run mode. The rule fires, the decision is logged, the dashboard shows "would have blocked N spoofed requests this week" — but the request passes through untouched. Zero risk of locking out real traffic you forgot to whitelist.
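Dry-run can be pictured as separating what a rule decides from what is actually enforced. The sketch below is an assumption about the shape of that logic, with hypothetical names (`DefaultRule`, `applyDefault`, `decisionLog`), not the product's internals.

```typescript
type Action = "allow" | "challenge" | "block";
type Tier = "signed" | "ip-verified" | "ua-claimed" | "unverified";

interface DefaultRule {
  priority: number;
  tier: Tier;
  action: Action;
  dryRun: boolean;
}

// The four seeded defaults from the article, all starting in dry-run mode.
const seededRules: DefaultRule[] = [
  { priority: 900, tier: "signed", action: "allow", dryRun: true },
  { priority: 910, tier: "ip-verified", action: "allow", dryRun: true },
  { priority: 920, tier: "ua-claimed", action: "challenge", dryRun: true },
  { priority: 930, tier: "unverified", action: "block", dryRun: true },
];

interface Outcome {
  decided: Action;  // what the rule says should happen
  enforced: Action; // what actually happens to the request
}

const decisionLog: string[] = [];

// Log every decision; only enforce it when the rule is live.
function applyDefault(tier: Tier, rules: DefaultRule[]): Outcome {
  const rule = rules.find((r) => r.tier === tier);
  if (!rule) return { decided: "allow", enforced: "allow" };
  decisionLog.push(`${rule.dryRun ? "dry-run" : "live"} tier=${tier} action=${rule.action}`);
  return { decided: rule.action, enforced: rule.dryRun ? "allow" : rule.action };
}
```

Counting the `dry-run ... action=block` entries in the log is what would feed a "would have blocked N requests" figure.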

When the dry-run numbers look right, flip rules to live one at a time. Dashboard → Policy tab → [Go live]. The block / challenge / allow responses start flowing from real classification instead of speculation.

Custom overrides plug in at lower priority numbers, so they match before the defaults:

operator=meta · purpose=train → block (priority 100) — no Meta training crawlers, period.
operator=perplexity · purpose=answer → allow — Perplexity can cite you freely.
path=/checkout · tier=unverified → challenge — harder check only on checkout.
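One way to read these overrides is as first-match-wins evaluation over rules sorted by priority number. That ordering is an assumption drawn from the example (priority 100 overriding the 900-series defaults), as are the `Rule` and `Hit` shapes and the priorities 110 and 120, which the article does not specify.

```typescript
type Action = "allow" | "challenge" | "block";

interface Hit {
  operator: string;
  purpose: string;
  tier: string;
  path: string;
}

// A rule matches when every field it sets agrees with the hit.
interface Rule {
  priority: number;
  action: Action;
  operator?: string;
  purpose?: string;
  tier?: string;
  pathPrefix?: string;
}

function matches(rule: Rule, hit: Hit): boolean {
  if (rule.operator !== undefined && rule.operator !== hit.operator) return false;
  if (rule.purpose !== undefined && rule.purpose !== hit.purpose) return false;
  if (rule.tier !== undefined && rule.tier !== hit.tier) return false;
  if (rule.pathPrefix !== undefined && !hit.path.startsWith(rule.pathPrefix)) return false;
  return true;
}

// Assumed semantics: lowest priority number is evaluated first, first match wins.
function evaluate(rules: Rule[], hit: Hit): Action | null {
  const ordered = rules.slice().sort((a, b) => a.priority - b.priority);
  for (const rule of ordered) {
    if (matches(rule, hit)) return rule.action;
  }
  return null;
}

const rules: Rule[] = [
  { priority: 100, operator: "meta", purpose: "train", action: "block" },
  { priority: 110, operator: "perplexity", purpose: "answer", action: "allow" }, // priority is hypothetical
  { priority: 120, pathPrefix: "/checkout", tier: "unverified", action: "challenge" }, // priority is hypothetical
  { priority: 900, tier: "signed", action: "allow" },
  { priority: 910, tier: "ip-verified", action: "allow" },
  { priority: 920, tier: "ua-claimed", action: "challenge" },
  { priority: 930, tier: "unverified", action: "block" },
];
```

Under these assumptions an IP-verified Meta training crawler is blocked even though the Tier 2 default would allow it, because the override at 100 matches first.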

Two enforcement surfaces

Z-Gate enforces policy in two places. You can install either or both.

The script tag

<script src="https://zgate.dev/static/zgate.js?k=pk_..."></script> is the one-line install. When a human, or a JS-executing AI agent, loads your page, the SDK scores behaviour and calls siteverify. Policy decides; your backend receives allow / challenge / block from the siteverify response and acts on it. This catches the JS-executing half of the agentic web: humans, browser-based AI agents, some crawlers.
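What "acts on it" means is up to your backend. As a rough sketch, assuming the siteverify response carries a `decision` field with one of the three values (the field name, the `SiteverifyResult` shape, and the status codes chosen here are all assumptions, not the documented API):

```typescript
type Decision = "allow" | "challenge" | "block";

// Hypothetical shape: the article only says the response carries allow / challenge / block.
interface SiteverifyResult {
  decision: Decision;
}

interface BackendResponse {
  httpStatus: number;
  body: string;
}

// Map the policy decision to whatever your backend should send back.
function actOnDecision(result: SiteverifyResult): BackendResponse {
  switch (result.decision) {
    case "allow":
      return { httpStatus: 200, body: "ok" };
    case "challenge":
      // Status choice is illustrative; a real app might render an interstitial instead.
      return { httpStatus: 403, body: "challenge required" };
    case "block":
      return { httpStatus: 403, body: "forbidden" };
  }
}
```

Keeping this mapping in one function makes it easy to start permissive and tighten individual branches as the dry-run numbers come in.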

The origin middleware

@zgate/next runs policy on every request at your Next.js origin, before JavaScript ever has a chance to run. This is the only way to act on raw-HTTP fetchers: curl, WebFetch, training crawlers that don't execute scripts. One middleware.ts, one env var, done.

Both surfaces call the same classify + policy engine. Both read the same per-site rules from your dashboard. Both use the same 4-tier vocabulary. One source of truth.

A free third layer of visibility

Even without middleware, the script tag URL itself carries a signal. Crawlers that parse HTML — even without executing JavaScript — will often fetch the linked zgate.js file. When they do, Z-Gate logs the fetch as a weak signal — classified by UA + IP, shown in your Agent Activity with a muted ◯ weak badge. Not as strong as a full session, not as weak as nothing. Covers a meaningful chunk of the training-crawler population with zero extra work.

What this unlocks

The agentic web needs a protocol, not a blocklist. Z-Gate puts three things in one place for the first time:

Verification: cryptographic when the operator signs, IP-range when they publish, UA when that's all we have, and honest about the limits throughout.
Classification: a 3-axis vocabulary for policy that maps to how the internet actually looks now.
Enforcement: rules that span from script tag to origin middleware, dry-run by default, real when you're ready.

Nobody else offers all three in an open, explainable, self-serve form. The top-shelf enterprise vendors won't. This category is wide open.

Try it at zgate.dev. 100K sessions/month free, no credit card. Script tag install is one line; middleware is three. Pick the one your infrastructure allows.

