A guide to track AI crawler bots on your website and get automated weekly insights via Slack.
n8n File Link
What This Does
This system:
1. Detects AI crawlers (GPTBot, ClaudeBot, PerplexityBot, etc.) visiting your site
2. Logs their activity to Supabase automatically
3. Analyzes the data weekly using AI and sends insights to Slack
Section 1: Setting Up the Bot Tracker
What You'll Need
- A website (Next.js, Astro, SvelteKit, or any Node.js framework)
- Supabase account (free tier works)
Step 1: Create Supabase Table
1. Go to your Supabase project → SQL Editor
2. Run this SQL:
CREATE TABLE ai_crawler_logs (
  id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
  bot_name VARCHAR(100) NOT NULL,
  user_agent TEXT NOT NULL,
  path TEXT NOT NULL,
  method VARCHAR(10) NOT NULL DEFAULT 'GET',
  referer TEXT,
  query_string TEXT,
  ip_address VARCHAR(45),
  host VARCHAR(255),
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW() NOT NULL
);

-- Indexes for fast queries
CREATE INDEX idx_crawler_logs_bot_name ON ai_crawler_logs(bot_name);
CREATE INDEX idx_crawler_logs_created_at ON ai_crawler_logs(created_at DESC);

-- Allow public inserts (for middleware logging)
ALTER TABLE ai_crawler_logs ENABLE ROW LEVEL SECURITY;
CREATE POLICY "Allow public inserts" ON ai_crawler_logs FOR INSERT TO public WITH CHECK (true);
CREATE POLICY "Allow public reads" ON ai_crawler_logs FOR SELECT TO public USING (true);
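For reference when you write the logging code later, the row shape above maps to a TypeScript type roughly like this (a sketch mirroring the SQL columns, not generated by Supabase tooling):

// Sketch of the ai_crawler_logs row shape (mirrors the columns above).
// id and created_at are filled in by Postgres defaults, so inserts can omit them.
export interface AiCrawlerLogRow {
  id?: string;                   // UUID from gen_random_uuid()
  bot_name: string;
  user_agent: string;
  path: string;
  method: string;                // defaults to 'GET'
  referer?: string | null;
  query_string?: string | null;
  ip_address?: string | null;
  host?: string | null;
  created_at?: string;           // timestamptz, defaults to NOW()
}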
Step 2: Install Dependencies
npm install @supabase/supabase-js
Step 3: Add Environment Variables
Add to your .env file:
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key-here
Where to find these:
- Supabase Dashboard → Settings → API
- Copy "Project URL" → SUPABASE_URL
- Copy "anon public" key → SUPABASE_ANON_KEY
Step 4: Create Detection File
Create lib/crawler-detection.ts:
export const AI_CRAWLER_PATTERNS = [
  { name: 'GPTBot', pattern: /GPTBot/i },
  { name: 'ChatGPT-User', pattern: /ChatGPT-User/i },
  { name: 'ClaudeBot', pattern: /ClaudeBot/i },
  { name: 'Claude-Web', pattern: /Claude-Web/i },
  { name: 'PerplexityBot', pattern: /PerplexityBot/i },
  { name: 'Google-Extended', pattern: /Google-Extended/i },
  { name: 'Bingbot', pattern: /Bingbot/i },
  { name: 'Meta-ExternalAgent', pattern: /Meta-ExternalAgent/i },
  { name: 'Bytespider', pattern: /Bytespider/i },
  { name: 'Applebot-Extended', pattern: /Applebot-Extended/i },
];

export function detectAICrawler(userAgent: string) {
  if (!userAgent) return { isBot: false, botName: null };
  for (const bot of AI_CRAWLER_PATTERNS) {
    if (bot.pattern.test(userAgent)) {
      return { isBot: true, botName: bot.name };
    }
  }
  return { isBot: false, botName: null };
}
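A quick way to sanity-check the patterns, assuming the file above (run it with ts-node/tsx or in a test):

import { detectAICrawler } from './lib/crawler-detection';

// Example user-agent strings; real crawler UAs include extra browser tokens.
console.log(detectAICrawler('Mozilla/5.0 (compatible; GPTBot/1.1; +https://openai.com/gptbot)'));
// => { isBot: true, botName: 'GPTBot' }
console.log(detectAICrawler('Mozilla/5.0 (Windows NT 10.0; Win64; x64)'));
// => { isBot: false, botName: null }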
Step 5: Create Logger
Create lib/crawler-logger.ts:
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL!,
  process.env.SUPABASE_ANON_KEY!
);

export async function logCrawlerActivity(data: {
  bot_name: string;
  user_agent: string;
  path: string;
  method: string;
  referer?: string | null;
  query_string?: string | null;
  ip_address?: string | null;
  host?: string | null;
}) {
  try {
    await supabase.from('ai_crawler_logs').insert(data);
  } catch (err) {
    console.error('[Crawler Logger] Error:', err);
  }
}
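You can smoke-test the logger on its own before wiring it into middleware; a minimal sketch with made-up values:

import { logCrawlerActivity } from './lib/crawler-logger';

// Insert one fake row, then look for it in the Supabase Table Editor.
logCrawlerActivity({
  bot_name: 'GPTBot',
  user_agent: 'GPTBot/1.1 (manual test)',
  path: '/test',
  method: 'GET',
}).then(() => console.log('Test row sent'));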
Step 6: Add Middleware
For Next.js, create middleware.ts in the project root:

import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { detectAICrawler } from './lib/crawler-detection';
import { logCrawlerActivity } from './lib/crawler-logger';

export function middleware(request: NextRequest) {
  const userAgent = request.headers.get('user-agent') || '';
  const detection = detectAICrawler(userAgent);
  if (detection.isBot && detection.botName) {
    const url = request.nextUrl;
    const ipAddress = request.headers.get('x-forwarded-for')?.split(',')[0].trim() || null;
    logCrawlerActivity({
      bot_name: detection.botName,
      user_agent: userAgent,
      path: url.pathname,
      method: request.method,
      referer: request.headers.get('referer'),
      query_string: url.search || null,
      ip_address: ipAddress,
      host: url.hostname,
    }).catch(() => {});
  }
  return NextResponse.next();
}
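By default this middleware runs on every request, including static assets. If you want to limit it, Next.js supports an exported matcher config in the same middleware.ts; a minimal sketch (adjust the pattern to your routes):

// Optional: skip Next.js internals and the favicon so only page requests
// are checked for crawler user-agents.
export const config = {
  matcher: ['/((?!_next/static|_next/image|favicon.ico).*)'],
};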
Step 7: Test It
1. Test locally:
curl -H "User-Agent: GPTBot/1.0" http://localhost:3000/
2. Check Supabase → Table Editor → ai_crawler_logs to see the log entry
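If you prefer checking from code rather than the dashboard, a quick query with supabase-js also works; a minimal sketch (run it with ts-node/tsx):

import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_ANON_KEY!);

// Print the ten most recent crawler hits to confirm logging works.
supabase
  .from('ai_crawler_logs')
  .select('bot_name, path, created_at')
  .order('created_at', { ascending: false })
  .limit(10)
  .then(({ data, error }) => console.log(error ?? data));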
The bot tracker is now set up! Crawler activity will be logged automatically.
Section 2: Setting Up the n8n Workflow
What You'll Need
- n8n instance (self-hosted or n8n.cloud)
- OpenAI API key (GPT-4 access)
- Google account (for Google Docs)
- Slack workspace
Step 1: Create Google Docs Knowledge Base
1. Create a new Google Doc
2. Copy the document ID from the URL
3. Add this content structure:
AEO Knowledge Base
Bot Types
- Training Bots: GPTBot, ClaudeBot, Meta-ExternalAgent
- User Bots: ChatGPT-User, PerplexityBot, OAI-SearchBot
Benchmarks
- Healthy 404 rate: <5%
- Critical 404 rate: >10%
- Homepage traffic ratio: <40% is healthy
- User bot ratio: >5% indicates citation potential
Warning Signs
- High 404 rates indicate broken links
- Homepage monopoly (>40%) suggests poor internal linking
- Training-only bot traffic means content isn't being cited
Success Signals
- New bot types appearing
- User bots increasing over time
- Path diversity growing
Step 2: Import n8n Workflow
1. In n8n, go to Workflows → Import from File
2. Upload your workflow JSON file (AI-bot_Crawler_Insights.json)
3. The workflow will appear in your workflow list
Step 3: Configure Credentials
Update each credential in n8n:
1. Supabase
   - Project URL: https://your-project.supabase.co
   - Service Role Key: Found in Supabase → Settings → API → Service Role Key
2. OpenAI
   - API Key: Your OpenAI API key
   - Model: gpt-4o or gpt-4.1
3. Google Docs
   - OAuth2 credentials from Google Cloud Console
   - Document ID: Your knowledge base document ID
4. Slack
   - OAuth2 app credentials
   - Channel: Select your target Slack channel
Step 4: Configure Workflow Nodes
Schedule Trigger:
- Set to run weekly (e.g., Monday 9:00 AM)
Supabase Node:
- Operation: Get All
- Table: ai_crawler_logs
- Return All: Enabled
Filter Node:
- Condition: created_at is after {{ $now.minus({ days: 7 }).toISO() }}
Google Docs Node:
- Operation: Get
- Document ID: Your knowledge base document ID
Slack Node:
- Channel: Select your target channel
- Text: {{ $json.message }}
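The analysis itself comes from the imported workflow, but if you want to see (or tweak) how the raw logs could be condensed before they reach the AI prompt, an n8n Code node along these lines is one option. This is a hypothetical sketch; the actual node names and fields in AI-bot_Crawler_Insights.json may differ:

// n8n Code node (Run Once for All Items): aggregate the week's logs into
// per-bot counts and top paths before handing them to the AI step.
const logs = $input.all().map((item) => item.json);

const byBot = {};
const byPath = {};
for (const row of logs) {
  byBot[row.bot_name] = (byBot[row.bot_name] || 0) + 1;
  byPath[row.path] = (byPath[row.path] || 0) + 1;
}

const topPaths = Object.entries(byPath)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 10)
  .map(([path, count]) => ({ path, count }));

return [{ json: { totalCrawls: logs.length, byBot, topPaths } }];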
Step 5: Test & Activate
1. Click Execute Workflow to test
2. Verify Slack messages appear correctly
3. Toggle workflow to Active
The n8n workflow is now set up! You'll receive weekly AI crawler insights in Slack.
What You'll Get
Weekly Slack reports with:
- Stats Overview: Total crawls, bot distribution, 404 rates
- Key Insights: Critical issues, important findings, opportunities
- Priority Actions: Ranked recommendations with timelines
- Top 10 Pages: Most crawled URLs and optimization opportunities
- Bot Profiles: Behavioral analysis per bot type
Troubleshooting
No logs appearing:
- Check Supabase credentials are correct
- Verify table has public insert policy enabled
- Check server console for errors
n8n workflow errors:
- Verify all credentials are configured
- Check Google Docs document ID is correct
- Ensure Slack bot is invited to channel
----------------
If your company or brand is interested in improving your AEO, Ghost Team can help you.
Please book a strategy call with our team here.