ScryWatch
Cloudflare-native observability platform. Ingest millions of logs, search them instantly, and keep them forever — without Datadog-level pricing.
The Problem
Small teams need production-grade log visibility, but Datadog and New Relic are priced for enterprise DevOps budgets. Teams either overpay, cobble together something brittle, or fly blind. What was missing was a production-grade observability platform that a small team could self-host and afford to run.
What Was Built
- → Multi-tenant SaaS platform with organization and user management
- → Log ingestion pipeline with burst buffering via Cloudflare Queues
- → Web dashboard with full-text search, faceted filtering, and live tail
- → Pattern Intelligence — automatic log clustering to surface error patterns
- → Deploy Diff — behavioral comparison across deployment windows
- → JavaScript/TypeScript and Flutter SDKs for drop-in log submission
- → AI-powered log summaries via Workers AI (Llama 3)
- → Long-term retention via R2 archival with hot/cold routing
- → Admin portal for platform-level org, user, and billing management
Technical Approach
Built entirely on Cloudflare's edge — Workers for compute, D1 for hot storage, R2 for cold archival, Durable Objects for real-time WebSocket streaming. The SDK was designed for drop-in integration: one import, logs flowing in under five minutes. No servers, no agents, no separate infrastructure to manage.
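As a rough illustration of the ingestion path (Worker receives a batch, validates it, and hands it to a queue), here is a minimal sketch. The `LogEvent` shape, the `QueueLike` interface, and the function names are assumptions made for this example, not ScryWatch's actual schema or code. `QueueLike` mirrors only the `sendBatch` method of a Cloudflare Queues producer binding, and the default batch size of 100 is an assumed per-call limit that should be checked against current Queues documentation.

```typescript
// Hypothetical event shape; the real ScryWatch schema is not given in the text.
interface LogEvent {
  timestamp: number; // epoch milliseconds
  level: "debug" | "info" | "warn" | "error";
  message: string;
}

// Minimal stand-in for a Queues producer binding (only the method used here).
interface QueueLike<T> {
  sendBatch(messages: { body: T }[]): Promise<void>;
}

const LEVELS = ["debug", "info", "warn", "error"];

// Type guard: accept only well-formed events.
function isLogEvent(value: unknown): value is LogEvent {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.timestamp === "number" &&
    isFinite(v.timestamp) &&
    typeof v.level === "string" &&
    LEVELS.indexOf(v.level) !== -1 &&
    typeof v.message === "string"
  );
}

// Drop malformed entries, then split the rest into queue-sized batches.
function toQueueBatches(payload: unknown, maxBatch = 100): { body: LogEvent }[][] {
  const events: LogEvent[] = Array.isArray(payload) ? payload.filter(isLogEvent) : [];
  const batches: { body: LogEvent }[][] = [];
  for (let i = 0; i < events.length; i += maxBatch) {
    batches.push(events.slice(i, i + maxBatch).map((body) => ({ body })));
  }
  return batches;
}

// Enqueue every valid event and report how many were accepted.
async function enqueue(payload: unknown, queue: QueueLike<LogEvent>): Promise<number> {
  let accepted = 0;
  for (const batch of toQueueBatches(payload)) {
    await queue.sendBatch(batch);
    accepted += batch.length;
  }
  return accepted;
}
```

In a Worker, `enqueue(await request.json(), env.LOG_QUEUE)` would be called from the `fetch` handler, where `LOG_QUEUE` is a hypothetical binding name. Returning quickly after enqueueing is what lets the queue absorb bursts while consumers write to storage at their own pace.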
Outcome
A fully operational, self-hosted observability SaaS running in production. Small teams get Datadog-level visibility — pattern clustering, distributed tracing, live tail, AI summaries — without enterprise contracts or a dedicated DevOps hire. The admin portal handles all multi-tenant operations without manual database access.
Overview
ScryWatch is a multi-tenant observability platform built entirely on Cloudflare’s edge infrastructure. It provides log management, distributed tracing, APM metrics, and alerting — all without agents, servers, or separate infrastructure.
Key Features
- Pattern Intelligence — Automatically clusters repetitive logs into behaviors, surfacing emerging error patterns without manual configuration
- Deploy Diff — Behavioral diff around deployments comparing 15-minute before/after windows to catch regressions immediately
- Live Tail — WebSocket-powered real-time log streaming with server-side filtering via Durable Objects
- Distributed Tracing — Custom JSON and OpenTelemetry (OTLP) ingestion with tail-based sampling
- AI Insights — On-demand summaries powered by Workers AI (Llama 3)
- Long-term Retention — Automatic archival to R2 object storage; hot/cold storage routing based on time range
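To make Pattern Intelligence and Deploy Diff more concrete, the sketch below shows one simple way such features can work: variable tokens in each message are masked so that similar lines collapse into one pattern, and pattern counts are then compared across the windows before and after a deployment. This is an illustrative approach only. The masking rules, type names, and function names are assumptions, and the source does not describe ScryWatch's actual clustering algorithm. Only the 15-minute window comes from the feature description.

```typescript
interface LogLine {
  timestamp: number; // epoch milliseconds
  message: string;
}

interface PatternDelta {
  pattern: string;
  before: number;
  after: number;
}

// Mask variable tokens so similar lines share one template.
function fingerprint(message: string): string {
  return message
    .replace(/\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b/gi, "<uuid>")
    .replace(/\b0x[0-9a-f]+\b/gi, "<hex>")
    .replace(/\d+(\.\d+)?/g, "<num>")
    .replace(/\s+/g, " ")
    .trim();
}

// Count occurrences of each pattern within [from, to).
function countPatterns(lines: LogLine[], from: number, to: number): Map<string, number> {
  const counts = new Map<string, number>();
  for (const line of lines) {
    if (line.timestamp < from || line.timestamp >= to) continue;
    const key = fingerprint(line.message);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Compare pattern counts in equal windows before and after a deployment.
function deployDiff(lines: LogLine[], deployAt: number, windowMs = 15 * 60 * 1000): PatternDelta[] {
  const before = countPatterns(lines, deployAt - windowMs, deployAt);
  const after = countPatterns(lines, deployAt, deployAt + windowMs);
  const patterns = Array.from(before.keys());
  after.forEach((_count, pattern) => {
    if (!before.has(pattern)) patterns.push(pattern);
  });
  const deltas: PatternDelta[] = [];
  for (const pattern of patterns) {
    const b = before.get(pattern) ?? 0;
    const a = after.get(pattern) ?? 0;
    if (a !== b) deltas.push({ pattern, before: b, after: a });
  }
  // Largest increases first, since those are the likeliest regressions.
  return deltas.sort((x, y) => (y.after - y.before) - (x.after - x.before));
}
```

A pattern that appears only in the `after` window, or whose count jumps sharply, is a candidate regression. A production system would likely normalize by traffic volume as well, which this sketch omits.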
Technical Highlights
- Compute: Cloudflare Workers (edge gateway, queue consumers, cron jobs)
- Hot Storage: D1 SQLite with a 24-hour retention window
- Cold Archive: R2 object storage for long-term log retention
- Real-time: Durable Objects for WebSocket broadcasting during ingestion
- Queue: Cloudflare Queues for burst buffering with at-least-once delivery
- AI: Workers AI with Llama 3 8B for pattern summaries
- Dashboard: Next.js with full-text search, faceted filtering, and Lenses (saved filter presets)
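The hot/cold split implied by the list above can be sketched as a small routing function: a query's time range is cut at the edge of the 24-hour hot window, with the recent part served from D1 and the older part from R2. The function and type names are hypothetical, and the sketch assumes that logs older than the window are already archived, which the source does not state.

```typescript
type Store = "d1" | "r2";

interface RoutedRange {
  store: Store;
  from: number; // inclusive, epoch milliseconds
  to: number;   // exclusive, epoch milliseconds
}

// 24-hour hot window, as stated for D1 in the list above.
const HOT_WINDOW_MS = 24 * 60 * 60 * 1000;

// Split the query range [from, to) at the hot/cold boundary.
function routeQuery(
  from: number,
  to: number,
  now: number,
  hotWindowMs: number = HOT_WINDOW_MS,
): RoutedRange[] {
  if (to <= from) return [];
  const boundary = now - hotWindowMs;
  const ranges: RoutedRange[] = [];
  if (from < boundary) {
    ranges.push({ store: "r2", from, to: Math.min(to, boundary) });
  }
  if (to > boundary) {
    ranges.push({ store: "d1", from: Math.max(from, boundary), to });
  }
  return ranges;
}
```

A query for the last hour touches only D1, a query for last week touches only R2, and a query spanning both produces two sub-ranges whose results would be merged by the caller.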
SDKs
- JavaScript/TypeScript — Zero-dependency; runs in Node.js, Deno, Bun, and browsers; retries failed submissions with exponential backoff
- Flutter (Dart) — Auto-detects device type (iOS, Android, macOS, Windows, Linux)
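For readers unfamiliar with exponential backoff, the retry behavior mentioned for the JavaScript/TypeScript SDK could look roughly like the sketch below. It is not the SDK's actual code: `RetryOptions`, `backoffDelayMs`, and `sendWithRetry` are hypothetical names, and the use of full jitter (a random delay between zero and the exponential ceiling) is an assumption chosen because it is a common way to avoid synchronized retries.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // ceiling for the first retry delay
  maxDelayMs: number;  // upper bound on any single delay
}

// Delay before the retry that follows failed attempt `attempt` (0-based):
// ceiling = min(maxDelayMs, baseDelayMs * 2^attempt), then full jitter.
function backoffDelayMs(
  attempt: number,
  opts: RetryOptions,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// Call `send` until it reports success or attempts run out.
async function sendWithRetry(
  send: () => Promise<boolean>,
  opts: RetryOptions,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (let attempt = 0; attempt < opts.maxAttempts; attempt++) {
    try {
      if (await send()) return true;
    } catch {
      // Treat a thrown error (for example a network failure) as a failed attempt.
    }
    if (attempt < opts.maxAttempts - 1) {
      await sleep(backoffDelayMs(attempt, opts));
    }
  }
  return false;
}
```

Here `send` would wrap the HTTP call that submits a log batch and resolve to `true` on a successful response. Passing `random` and `sleep` as parameters keeps the logic deterministic under test.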