Wealth Advisor — System Overview

A multi-agent equity research tool for wealth advisors. Combines official primary-source data (SEC + market quotes) with unverified social signal (Reddit, Stocktwits, news), in clearly separated sections.

Live: wealth-advisor.pages.dev · Version: v0.2.0 (multi-agent) · Date: 2026-05-25


TL;DR for a customer

Type a US-listed ticker and get a one-screen briefing in ~60 seconds: live quote, business snapshot, recent 8-K material events, and top 10-K risks (all cited to the filing), plus a separately labeled "social signal" section showing what people are saying on Reddit, Stocktwits, and in recent news. The social section carries explicit unverified tags and pump-bot detection, and a dedicated divergences section appears when the two sources disagree.

Built for compliance: never gives buy/sell recommendations, scopes to US public equities only, and shows the advisor the agent's work (every tool call) for audit.

01 · The Wealth Advisor

the user

"What does this thing actually do for me before my 9:30am client call?"

The 60-second workflow

Open wealth-advisor.pages.dev in a browser.
Type a US-listed ticker (e.g. F, AAPL, NVDA) and hit Send.
Watch the agent's work stream in real time — you can see exactly which SEC filings and social sources it's consulting.
Read the briefing. Ask follow-ups ("what's the debt load?", "what changed in the latest 10-Q?").

What you get

Section	Source	What it tells you
Snapshot	Yahoo Finance	Price, change, day range, market cap, sector — live.
Business snapshot	Latest 10-K Item 1	1–2 sentence company description, cited.
Material events	Last 5 filings (10-K + 8-K)	What was announced and when, cited per filing.
Top risks	Latest 10-K Item 1A	The risk factors the company itself discloses, cited.
Social signal	Reddit + Stocktwits + News + volume anomaly	What's being said, with an unverified tag and pump-bot detection.
Divergences	Synthesis	Cases where social claims contradict the filings — surfaced, not suppressed.

What you do not get

No buy/sell/hold recommendations. That's investment advice; it's outside this tool's scope. If you ask, the agent declines and offers factual alternatives.
No coverage of crypto, FX, options, fixed income, non-US listings, or private companies.
No "sentiment is bullish therefore the stock will go up" inferences. Social is treated as a separate signal, never inlined with cited facts.

02 · System Architect

build the boxes & arrows

"What's the shape of the system and why?"

Pattern: sub-agents-as-tools

One Coordinator agent exposes two macro-tools: research_official and research_social. Each macro-tool internally runs its own focused sub-agent with its own system prompt and its own tool set. The coordinator sees clean structured Reports back, never raw tool output, and synthesizes a single briefing.

┌─────────────────────────────────────────────────┐
│  Coordinator Agent (Sonnet 4.6)                 │
│    tools: research_official, research_social    │
│       │                                         │
│       ├──► Official sub-agent (Sonnet 4.6)      │
│       │     prompt: strict primary-source       │
│       │     tools: resolve_ticker, quote,       │
│       │            filings, 10-K sections,      │
│       │            8-K summary  (6 tools)       │
│       │                                         │
│       └──► Social sub-agent (Sonnet 4.6)        │
│             prompt: skeptical forum analyst     │
│             tools: reddit, stocktwits, news,    │
│                    volume anomaly  (4 tools)    │
└─────────────────────────────────────────────────┘
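The dispatch side of this pattern can be sketched roughly as below. This is a minimal illustration, not the project's code: the `Report` shape, the stubbed sub-agents, and the `runMacroTool` helper are all hypothetical, and the real sub-agents would each run their own LLM tool-use loop.

```typescript
// Hypothetical sketch: each macro-tool name maps to a sub-agent runner.
// The coordinator only ever sees the structured Report, never raw tool output.
type Report = { summary: string; warnings: string[] };
type SubAgent = (ticker: string) => Promise<Report>;

// Stub sub-agents standing in for real LLM-driven loops (assumption).
const officialAgent: SubAgent = async (ticker) => ({
  summary: `official findings for ${ticker}`,
  warnings: [],
});
const socialAgent: SubAgent = async (ticker) => ({
  summary: `social signal for ${ticker}`,
  warnings: ["reddit_unavailable"],
});

const macroTools: Record<string, SubAgent> = {
  research_official: officialAgent,
  research_social: socialAgent,
};

// Dispatch one coordinator tool call; unknown names come back as data, not a throw.
async function runMacroTool(
  name: string,
  ticker: string,
): Promise<Report | { ok: false; error: string }> {
  const agent = macroTools[name];
  if (!agent) return { ok: false, error: `unknown_tool:${name}` };
  return agent(ticker);
}
```

Adding a third macro-tool in this sketch is one more entry in `macroTools`; the coordinator's dispatch code does not change.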

Why this pattern, not "one big agent"

Trust-regime separation. Primary-source analysis and forum analysis need different skepticism, different output schemas, different rules. One prompt can't encode both well.
Bounded tool surfaces. 10 tools split across two focused contexts (6 + 4) beats 10 tools mixed in one: each sub-agent chooses among fewer, more closely related tools, which should improve tool-selection accuracy.
Reusability. The Coordinator could later be repointed at different sub-agents (e.g., "research_options") without retraining its instincts.

Tech stack

Layer	Choice
Runtime	Cloudflare Pages + Pages Functions (Workers V8 isolates at the edge)
Language	TypeScript (strict, noUncheckedIndexedAccess)
LLM	`@anthropic-ai/sdk`, Claude Sonnet 4.6 across all three agents
Tool-use loop	Hand-rolled (the Agent SDK doesn't run in Workers — no subprocess)
HTML parsing	`node-html-parser` (Workers-compatible)
Front end	Single static `index.html`, vanilla JS, no build step
Server state	None. Browser holds conversation history, posts it back each turn.
Tests	`vitest` — unit, integration, golden behavior, contract

03 · Security Architect

how can this hurt us?

"What's the threat surface and what does each layer defend?"

Layered defenses (in defense-in-depth order)

Layer	Defense	Threat addressed
Input	2,000-char cap on user messages; `JSON.parse` validation	Prompt-injection payload size; malformed bodies
Input	`resolve_ticker` rejects non-US tickers (containing `.`) before any other call	Scope abuse; non-US listings
Prompt	System prompt forbids stating facts without a tool result	Hallucinated "facts"
Prompt	Hard refusal for investment advice questions	Regulatory exposure (SEC/FINRA "investment advice" rules)
Tool	Discriminated-union returns; errors-as-data, not exceptions	Silent failures; partial data passing as complete
Output	`marked` + `DOMPurify` sanitization before `innerHTML`	XSS from filing content or LLM output
Output	Tool-trace UI built with `textContent` only	XSS from tool names / parameters
Loop	Max 10 iterations per sub-agent; max 4k tokens/call	Runaway cost; infinite tool loops
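The two "Input" rows could be implemented along these lines. This is a sketch under assumptions: the `message` field name, the error codes, and both function names are hypothetical; only the 2,000-character cap, the `JSON.parse` step, and the "contains `.`" rule come from the table.

```typescript
// Hypothetical request validator mirroring the "Input" rows: size cap, JSON
// validation, and a US-only ticker check. Names and error codes are assumptions.
const MAX_MESSAGE_CHARS = 2000;

type Validated =
  | { ok: true; message: string }
  | { ok: false; error: string };

function validateBody(raw: string): Validated {
  let body: unknown;
  try {
    body = JSON.parse(raw);
  } catch {
    return { ok: false, error: "malformed_json" };
  }
  if (typeof body !== "object" || body === null) {
    return { ok: false, error: "malformed_body" };
  }
  const message = (body as { message?: unknown }).message;
  if (typeof message !== "string" || message.length === 0) {
    return { ok: false, error: "missing_message" };
  }
  if (message.length > MAX_MESSAGE_CHARS) {
    return { ok: false, error: "message_too_long" };
  }
  return { ok: true, message };
}

// Tickers containing "." are treated as non-US listings, per the table above.
function isUsTicker(ticker: string): boolean {
  return ticker.length > 0 && !ticker.includes(".");
}
```

A handler would presumably map any `ok: false` result to the 400 response that the API tests check for.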

Secrets management

Wrangler secrets for production (ANTHROPIC_API_KEY, optionally REDDIT_CLIENT_*, NEWSAPI_KEY). Never committed.
.env + .dev.vars are in .gitignore.
Reddit OAuth uses client-credentials (no user identity); token cached in memory only.
No outbound network calls carry user data except the Anthropic API call.

Known gaps (intentional for v1)

Gap: No advisor auth; anyone with the URL can query. Production needs Cloudflare Access or a session layer.
Gap: No rate limiting beyond Workers' platform limits. A single advisor in a loop could blow the Anthropic budget.
Gap: No output content filter; Sonnet's compliance is currently enforced by prompt + tests, not by a second-pass classifier.

04 · UX Architect

design for trust under time pressure

"An advisor has 60 seconds before a client call. The UI must earn trust fast."

Key design decisions

One input box, one button. Same field for the opening ticker and follow-up questions. No mode-switch.
Streaming output. The brief appears word-by-word, so the advisor never stares at a spinner wondering if it's stuck.
Visible tool trace. Every tool call shows: status (✓/✗), name, latency. Sub-agent calls collapse under their parent. Transparency → trust.
Tables for tabular data. Snapshot stats, filing list, social-source signal — all rendered as proper tables. Bullets reserved for qualitative items (risks, divergences).
Visual disclaimer for social. Yellow left-border blockquote above the "Social signal" section. Unmissable.
No buttons for features that don't exist. No "export to PDF" button pretending to be implemented. Ship what works.

What the advisor sees stream by stream

"Resolving ticker..." ✓ (39 ms)
"Fetching live quote..." ✓ (121 ms)
"Pulling latest 5 filings..." ✓ (137 ms)
"Reading 10-K Item 1A (Risk Factors, 9778 words)..." ✓ (73 ms)
"Checking Reddit / Stocktwits / News in parallel..." (with per-source results)
"Detecting volume anomaly..." ✓ (117 ms — 0.88σ, no flag)
Final markdown briefing renders with tables.
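The "0.88σ, no flag" readout in the trace suggests a z-score of today's volume against a trailing baseline. A minimal sketch of that calculation follows; the function names, the use of the sample standard deviation, and the 2σ flag threshold are assumptions, not details taken from the tool.

```typescript
// Hypothetical volume-anomaly check: z-score of today's volume against a
// trailing baseline. The 2-sigma flag threshold is an assumption.
function volumeZScore(baseline: number[], today: number): number {
  const n = baseline.length;
  if (n < 2) return 0; // not enough history to estimate spread
  const mean = baseline.reduce((sum, v) => sum + v, 0) / n;
  // Sample variance (divide by n - 1).
  const variance =
    baseline.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
  const std = Math.sqrt(variance);
  return std === 0 ? 0 : (today - mean) / std;
}

function isVolumeAnomaly(z: number, threshold = 2): boolean {
  return Math.abs(z) >= threshold;
}
```

Under this sketch, a reading of 0.88σ falls well inside the threshold and would not be flagged.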

05 · Test Architect

how do we know it works?

"Each commit followed test → red → implement → green → commit. The spec is encoded in the tests."

Test pyramid

Layer	Files	Count	What it covers
Unit (tools)	`tests/tools/*.test.ts` ×7	~25	Each external API integration mocked at `fetch`
Prompts	`tests/prompts.test.ts`	14	Sanity assertions that key behavior clauses survive edits
Agent loop	`tests/agent/loop.test.ts`	2	Generic `runSubAgent` with a mock Anthropic client
Sub-agents	`tests/agent/{official,social}.test.ts`	4	JSON parsing, graceful failure when output is non-JSON
Coordinator	`tests/agent/coordinator.test.ts`	2	Parallel sub-agent dispatch + event nesting
API	`tests/api/chat.test.ts`	3	SSE stream shape; 400 on malformed input
Golden behavior	`tests/golden.test.ts`	10	End-to-end behavior contract (no advice, citations, scope, divergence handling, anomaly handling, graceful degradation)
Contract	`tests/contract/contract.test.ts`	5	API↔UI event registry sync; SSE wire format

The contract test is the differentiator

It catches drift in both directions:

API adds a new event type → registry assertion fails until UI adds a handler.
UI references an event type not in the registry → fails until UI is corrected or registry is extended.
SSE wire format changes → fails until contract is updated.
(It has already caught real drift in this codebase: it surfaced that the UI was relying on a [DONE] sentinel and silently ignoring the done event.)
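The two-directional check can be sketched as a set difference between what the API emits and what the UI handles. The lists and the `diffRegistries` helper below are illustrative only; the real registry lives in the contract test.

```typescript
// Hypothetical contract check: the event types the API can emit and the event
// types the UI handles must be the same set. Both lists are illustrative.
const API_EVENT_TYPES = ["token", "tool_call", "tool_result", "done", "error"];
const UI_HANDLED_TYPES = ["token", "tool_call", "tool_result", "error"];

function diffRegistries(api: string[], ui: string[]) {
  const apiSet = new Set(api);
  const uiSet = new Set(ui);
  return {
    unhandledByUi: api.filter((t) => !uiSet.has(t)), // API emits, UI ignores
    unknownToApi: ui.filter((t) => !apiSet.has(t)), // UI expects, API never sends
  };
}

const drift = diffRegistries(API_EVENT_TYPES, UI_HANDLED_TYPES);
// drift.unhandledByUi is ["done"]: the UI silently ignores the done event
```

A contract test would then assert that both arrays are empty, so either side drifting fails the build.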

Live verification

Final smoke test driven via Playwright MCP against the deployed URL with ticker F (Ford). Verified: tables rendered, no JSON leakage, 3 [UNVERIFIED] tags (including one on a flagged pump-promotion bot), and an explicit divergence section.

06 · Integration Architect

5 third-party data sources plus the LLM API; no single data source is a hard dependency

"What happens when an upstream is down? Define it explicitly."

Upstream integrations

System	Auth	Used for	Failure mode
SEC EDGAR `data.sec.gov`	none (User-Agent required)	Filings list, 10-K sections, 8-K items	Returns `sec_unavailable` → official report carries a warning, briefing continues
Yahoo Finance `query1/v8/chart`	none (undocumented)	Live quote, 30-day volume baseline	Returns `quote_unavailable` → snapshot table replaced by "Live quote unavailable"
Reddit OAuth	client-credentials	r/wallstreetbets, r/stocks, r/investing (whitelisted)	Missing keys → `missing_credentials` warning; rate-limited → `reddit_unavailable`
Stocktwits `/streams/symbol`	none	Aggregate Bullish/Bearish counts	`stocktwits_unavailable` → row shows "unavailable"
NewsAPI `/v2/everything`	API key	Recent headlines (14-day window)	Missing key → `missing_credentials`; over-quota → `news_unavailable`
Anthropic Messages API	API key	All three agents	Network error → SSE emits `{type:"error"}` event; UI surfaces it

Failure philosophy

Errors as data, not exceptions. Every tool returns { ok: false, error, reason? } on failure.
No retries. We're inside a Workers wallclock budget. Retries burn the budget without buying much; a single failure surfaces in the briefing instead.
Warnings bubble up. Sub-agent Reports include a warnings: string[]; coordinator surfaces them.
One bad upstream ≠ no brief. If Yahoo is down, the agent still delivers an SEC-only briefing. If Reddit is down, the social section reports "unavailable" and continues with Stocktwits + news.
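As a rough illustration of "errors as data" and "warnings bubble up", the sketch below folds per-source tool results into a section that always renders. The `ToolResult` wrapper with a `value` field, the `SocialSection` shape, and `buildSocialSection` are hypothetical; only the failure record shape and the error codes come from this document.

```typescript
// Hypothetical sketch of "errors as data": each tool result is either a value
// or a failure record, and failures become warnings instead of aborting the brief.
type ToolResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string; reason?: string };

type SocialSection = { sources: Record<string, string>; warnings: string[] };

function buildSocialSection(
  results: Record<string, ToolResult<string>>,
): SocialSection {
  const section: SocialSection = { sources: {}, warnings: [] };
  for (const [source, result] of Object.entries(results)) {
    if (result.ok) {
      section.sources[source] = result.value;
    } else {
      // One bad upstream: mark it unavailable and keep going.
      section.sources[source] = "unavailable";
      section.warnings.push(`${source}: ${result.error}`);
    }
  }
  return section;
}

const section = buildSocialSection({
  reddit: { ok: false, error: "reddit_unavailable", reason: "rate-limited" },
  stocktwits: { ok: true, value: "12 bullish / 5 bearish" },
  news: { ok: true, value: "3 headlines" },
});
```

Here Reddit's failure becomes a single warning while Stocktwits and news still populate the section.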

07 · Performance Architect

edge-deployed, parallel where possible

"Where does the time actually go, and what's the headroom?"

Latency budget (observed on AAPL/F runs)

Phase	Time	Notes
Cold start (Workers V8 isolate)	< 100 ms	Edge isolates; no container spin-up
Coordinator turn 1 (LLM)	~3 s	Decides to call both macro-tools in parallel
Official sub-agent (full loop)	~40–60 s	5–6 internal iterations; tool calls dominate
Social sub-agent (full loop)	~15–25 s	4 tool calls issued in one assistant turn
Coordinator turn 2 (synthesis)	~10–20 s	Generates markdown brief from two Reports
End-to-end	~60–90 s	Sub-agents run sequentially in v1; could be in parallel via `Promise.all`

Parallelism actually used

Within each sub-agent: Sonnet emits multiple tool_use blocks in one assistant turn (e.g., "fetch quote + list filings + parse 10-K simultaneously"). The runner dispatches them sequentially, but the batch costs one LLM round trip instead of one per tool.
Coordinator: Issues research_official + research_social as a tool_use pair. The current macro-tool dispatcher runs them sequentially; the v1.1 optimization is to run them with Promise.all, cutting roughly 15–25 s (the social sub-agent's runtime).
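The difference between the current sequential dispatch and the planned Promise.all version can be sketched as follows. The stub sub-agent and both dispatcher names are hypothetical; the stubs only log start and end so the interleaving is observable without real network calls.

```typescript
// Hypothetical sketch of the v1.1 change: run both macro-tools concurrently.
// The stub sub-agents only log start/end so the interleaving is visible.
type Report = { agent: string };
const log: string[] = [];

async function fakeSubAgent(agent: string): Promise<Report> {
  log.push(`start ${agent}`);
  await Promise.resolve(); // stands in for network and LLM latency
  log.push(`end ${agent}`);
  return { agent };
}

// v1 behavior: the second sub-agent starts only after the first finishes.
async function dispatchSequential(): Promise<Report[]> {
  const official = await fakeSubAgent("official");
  const social = await fakeSubAgent("social");
  return [official, social];
}

// v1.1 sketch: both start immediately; total time is the slower of the two.
async function dispatchParallel(): Promise<Report[]> {
  return Promise.all([fakeSubAgent("official"), fakeSubAgent("social")]);
}
```

With the latency table's figures, the wait for both reports would go from the sum of the two sub-agent loops to the longer one, which is where the saving comes from.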

Cost / wallclock controls

4,096-token cap per LLM call.
10-iteration cap per sub-agent loop.
6-iteration cap on the coordinator.
Stocktwits limited to 30 messages, Reddit to top-10 per sub, news to 10 headlines.
The Workers free-tier wall-clock limit is generous for streaming responses, but we conservatively budget for <120 s end-to-end.
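The iteration caps can be pictured as a bounded loop around the model call. The sketch below is synchronous and heavily simplified: `ModelTurn`, `callModel`, and `runTool` are hypothetical stand-ins for the Anthropic Messages API call and the tool dispatcher, and only the default cap of 10 is taken from the list above.

```typescript
// Hypothetical bounded agent loop. `ModelTurn` and `callModel` stand in for the
// Anthropic Messages API call; only the iteration cap is taken from the text.
type ModelTurn =
  | { kind: "tool_use"; tool: string }
  | { kind: "final"; text: string };

type LoopResult =
  | { ok: true; text: string; iterations: number }
  | { ok: false; error: "max_iterations"; iterations: number };

function runBoundedLoop(
  callModel: (iteration: number) => ModelTurn,
  runTool: (tool: string) => void,
  maxIterations = 10,
): LoopResult {
  for (let i = 1; i <= maxIterations; i++) {
    const turn = callModel(i);
    if (turn.kind === "final") {
      return { ok: true, text: turn.text, iterations: i };
    }
    runTool(turn.tool); // result would be appended to the conversation
  }
  // Cap reached: stop spending tokens and report the failure as data.
  return { ok: false, error: "max_iterations", iterations: maxIterations };
}
```

A model that never stops requesting tools exits this loop after `maxIterations` calls with an error record rather than running indefinitely.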

08 · Compliance / Regulatory Architect

"Not in this section" — the financial-services version

"Investment advice triggers SEC / FINRA regulation. Stay on the right side."

What the system asserts about itself

Never recommends buying, selling, or holding. Hard-coded in the coordinator prompt and asserted by golden test #4.
Cites every official claim. Format: (10-K Item 1A, FY24, filed 2024-11-01). Asserted by golden test #1.
Tags every social claim as [UNVERIFIED]. Live-verified on the Ford demo where 3 social claims were so tagged, including one auto-promotion bot.
Never inlines social with official. Two distinct, labeled sections in the briefing template.
Surfaces divergences explicitly. When social claims contradict filings, they get their own section ("Divergences worth noting").
Refuses out-of-scope. Non-US listings, crypto, FX, options, private companies — declined with a scope explanation.
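The citation format quoted above lends itself to a small formatter and a pattern check. This is a sketch only: the `Citation` field names, `formatCitation`, and `isCited` are hypothetical, and the actual golden tests may assert citations differently.

```typescript
// Hypothetical citation helper matching the format quoted above:
// (10-K Item 1A, FY24, filed 2024-11-01). Field names are assumptions.
type Citation = { form: string; item: string; fiscalYear: number; filed: string };

function formatCitation(c: Citation): string {
  const fy = String(c.fiscalYear % 100).padStart(2, "0");
  return `(${c.form} Item ${c.item}, FY${fy}, filed ${c.filed})`;
}

// A golden-style check could assert that every official claim ends in a citation.
const CITATION_PATTERN = /\(\S+ Item \S+, FY\d{2}, filed \d{4}-\d{2}-\d{2}\)$/;

function isCited(claim: string): boolean {
  return CITATION_PATTERN.test(claim.trim());
}
```

For example, `formatCitation({ form: "10-K", item: "1A", fiscalYear: 2024, filed: "2024-11-01" })` produces the exact string shown in the list above.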

Audit trail

Every tool call emits a structured SSE event with: tool name, input, latency, ok/error status, and result summary. The same stream that drives the UI can be teed to a log sink for retention. Audit is a free byproduct of inspectability.

What this design does not claim

It is not a registered investment adviser.
It is not a substitute for the firm's compliance review.
It does not retain conversation history server-side (no PII storage problem to solve).

09 · AI / Agent Architect

model + pattern + eval

"Why these models, why this topology, how do you know it's stable?"

Pattern choice: sub-agents-as-tools (orchestrator-workers variant)

From Anthropic's "Building effective agents" taxonomy. Coordinator dispatches; workers execute; each worker has a focused tool surface. We chose this over:

Single agent with combined toolset — would need one prompt to encode both "trust SEC verbatim" and "treat Reddit with suspicion"; the trust regimes leak into each other.
Full orchestrator with parallel workers in separate processes — overkill for a demo; adds infrastructure (queues, parallel runtimes) without proportional benefit.

Model selection: Sonnet 4.6 for all three agents

Model	Considered for	Why not chosen
Haiku 4.5	Sub-agents (cost)	Misreads financial document language under tool-chain pressure; risk too high for a wealth-advisory product.
Sonnet 4.6 ✓	All three agents	Production-grade reasoning, ~3× faster and ~5× cheaper than Opus.
Opus 4.7	—	Overkill — the bottleneck is tool latency, not reasoning depth.

Evals

10 golden behavior-contract tests run on every PR. The spec is the test file.
Contract test enforces API↔UI sync.
Smoke test on a real ticker (Ford was last verified) at deploy time via Playwright.

Production hardening hooks (deliberately deferred)

Cloudflare AI Gateway in front of Anthropic for caching, retry, and per-key budgets.
Output content-filter classifier (cheap Haiku call) for defense-in-depth on the no-advice rule.
Coordinator on Haiku 4.5 (it only synthesizes pre-structured input — likely safe and ~5× cheaper).

10 · Data Architect

two trust regimes, well-defined schemas

"Where does data come from, what shape is it, and who can touch it?"

Two epistemic regimes

Regime	Sources	Provenance	UI treatment
Official	SEC EDGAR, Yahoo Finance	Filed, signed, audit-required	Cited inline; treated as fact
Social	Reddit, Stocktwits, NewsAPI, volume stats	Anonymous, may include bots / pumps	Always tagged unverified; in its own section with a disclaimer

Tool return shapes

Every tool returns a discriminated union: T | { ok: false; error: string; reason?: string }, where the success type T carries ok: true. Callers must branch on the ok discriminator before reading any success field, which makes "I forgot to handle the failure" a TypeScript error at compile time.
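A small example of how the compiler enforces that branch is sketched below. The `Quote` fields and `describeQuote` are hypothetical; the failure shape is the one given above, and the fallback text echoes the "Live quote unavailable" wording used elsewhere in this document.

```typescript
// Hypothetical quote result: success carries ok: true, failure is the shared shape.
type ToolError = { ok: false; error: string; reason?: string };
type Quote = { ok: true; symbol: string; price: number };
type QuoteResult = Quote | ToolError;

function describeQuote(result: QuoteResult): string {
  // Reading result.price here would not compile: ToolError has no `price` field.
  if (!result.ok) {
    return `Live quote unavailable (${result.error})`;
  }
  // Narrowed to Quote: `price` and `symbol` are now safe to read.
  return `${result.symbol} ${result.price.toFixed(2)}`;
}
```

Because `ok` is a literal `true` or `false` on each member, checking it is enough for TypeScript to narrow the union without any casts.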

Report schemas (the sub-agent contract)

OfficialReport = {
  findings: { claim: string; citation: string }[]
  quote:    Quote | null
  warnings: string[]
}

SocialReport = {
  sentiment: { reddit_breakdown, stocktwits, news_tone }
  themes:        { topic, supporting_posts[] }[]
  claimed_facts: { claim, source_url, tag: "[UNVERIFIED]" }[]
  anomalies:     { kind, evidence }[]
  warnings:      string[]
}
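Since the sub-agent tests cover "graceful failure when output is non-JSON", the parsing step might look something like this sketch. `parseOfficialReport` and the warning codes are hypothetical, and `quote` is typed as `unknown` because the Quote shape is not spelled out here.

```typescript
// Hypothetical parser for a sub-agent's final message. If the model returns
// non-JSON, the caller still gets a well-formed report carrying a warning.
type OfficialReport = {
  findings: { claim: string; citation: string }[];
  quote: unknown; // Quote shape is not spelled out in this overview
  warnings: string[];
};

function parseOfficialReport(raw: string): OfficialReport {
  const empty: OfficialReport = { findings: [], quote: null, warnings: [] };
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ...empty, warnings: ["official_report_not_json"] };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { ...empty, warnings: ["official_report_wrong_shape"] };
  }
  const p = parsed as Partial<OfficialReport>;
  // Missing or mistyped top-level fields fall back to empty defaults.
  return {
    findings: Array.isArray(p.findings) ? p.findings : [],
    quote: p.quote ?? null,
    warnings: Array.isArray(p.warnings) ? p.warnings : [],
  };
}
```

This only validates the top-level fields; a production version would likely check each finding's `claim` and `citation` as well.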

Persistence

None. Server is stateless per request. Browser holds the conversation array. There is no PII storage problem because there is no storage. Production hooks: Cloudflare KV for 15-min cache on Reddit/Stocktwits/News (cost + rate-limit relief).

11 · Observability / SRE Architect

if you can't see it, you can't run it

"Every tool call leaves a trace. The same stream that renders the UI is the audit log."

Event-driven instrumentation

The agent loop emits an AgentEvent for every meaningful action:

type AgentEvent =
  | { type: "token"; text: string }
  | { type: "tool_call"; name; input; iteration; ts; parent? }
  | { type: "tool_result"; name; ok; latency_ms; result_summary; ts; parent? }
  | { type: "sub_agent_start"; agent; ts }
  | { type: "sub_agent_end"; agent; report_summary; ts }
  | { type: "done" }
  | { type: "error"; reason }
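On the wire, each event could be serialized as one server-sent-events frame. The sketch below assumes one JSON object per `data:` line and uses a reduced subset of the union with explicit field types; `encodeSse` and `decodeSse` are hypothetical names, and the project's actual wire format is whatever the contract test pins down.

```typescript
// Hypothetical SSE encoder for a subset of the AgentEvent union. SSE frames
// are "data: <payload>" followed by a blank line; one JSON object per frame.
type AgentEvent =
  | { type: "token"; text: string }
  | { type: "tool_result"; name: string; ok: boolean; latency_ms: number }
  | { type: "done" }
  | { type: "error"; reason: string };

function encodeSse(event: AgentEvent): string {
  // JSON.stringify escapes newlines, so the payload stays on one line.
  return `data: ${JSON.stringify(event)}\n\n`;
}

// Inverse used on the client (or a log sink): split frames, parse each payload.
function decodeSse(stream: string): AgentEvent[] {
  return stream
    .split("\n\n")
    .filter((frame) => frame.startsWith("data: "))
    .map((frame) => JSON.parse(frame.slice("data: ".length)) as AgentEvent);
}
```

Because the same frames can be decoded by a browser or a log sink, the UI stream and the audit stream stay identical by construction.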

How this becomes ops

Latency per tool call is captured in latency_ms; trivially aggregated into p50/p95 per tool.
Failures per integration: tool_result with ok: false + result_summary.error; count by error code per upstream.
Iteration distribution: how many tool-use rounds a sub-agent takes; outliers indicate prompt regression.
Cost per request: derivable from token counts in the Anthropic response (not yet aggregated; v1.1 hook).
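The first two bullets could be implemented as a small fold over captured tool_result events. This is a sketch under assumptions: the event subset, the nearest-rank percentile method, and the `latencyByTool` helper are illustrative choices, not part of the codebase.

```typescript
// Hypothetical aggregation over captured tool_result events. Uses the
// nearest-rank percentile; the event subset and sample data are illustrative.
type ToolResultEvent = { type: "tool_result"; name: string; ok: boolean; latency_ms: number };

function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sorted.length); // nearest-rank, 1-based
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1] as number;
}

function latencyByTool(events: ToolResultEvent[]) {
  const byTool = new Map<string, number[]>();
  for (const e of events) {
    const list = byTool.get(e.name) ?? [];
    list.push(e.latency_ms);
    byTool.set(e.name, list);
  }
  const stats: Record<string, { p50: number; p95: number; failures: number }> = {};
  byTool.forEach((list, name) => {
    const sorted = [...list].sort((a, b) => a - b);
    stats[name] = {
      p50: percentile(sorted, 50),
      p95: percentile(sorted, 95),
      failures: events.filter((e) => e.name === name && !e.ok).length,
    };
  });
  return stats;
}
```

Feeding it the events from a log sink yields one row per tool, which is enough for a basic latency and failure dashboard.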

Cloudflare ops surface

wrangler pages deployment list for deploy history.
console.log in Workers ships to Workers Logs (1 hr retention free tier; can tee to Datadog/Logflare via Logpush).
One-click rollback to any prior deployment from the dashboard.

Reliability posture

No DBs, no queues, no caches → fewer things to be down.
Browser holds state → server restarts are invisible.
Edge deploy → no single-region outage takes us down.

12 · FinOps / Cost Architect

where does the money go?

"The LLM call is the cost. Everything else is rounding."

Per-briefing cost (rough estimate, Sonnet 4.6 list pricing)

Component	Tokens (in / out)	Est. cost / brief
Official sub-agent (5–6 iterations)	~25k in / ~3k out	~$0.12
Social sub-agent (1–2 iterations)	~8k in / ~2k out	~$0.05
Coordinator (2 iterations)	~10k in / ~1.5k out	~$0.05
Total	~43k in / ~6.5k out	~$0.20–$0.30 per ticker
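The per-component figures can be reproduced with simple arithmetic. The prices below are an assumption ($3 per million input tokens and $15 per million output tokens), chosen because they match the table's estimates; check current list pricing before relying on them.

```typescript
// Rough per-brief cost from the token estimates in the table. The prices are
// an assumption ($3 per million input tokens, $15 per million output tokens).
const USD_PER_MTOK = { input: 3, output: 15 };

type Usage = { inputTokens: number; outputTokens: number };

function costUsd(u: Usage): number {
  return (
    (u.inputTokens / 1_000_000) * USD_PER_MTOK.input +
    (u.outputTokens / 1_000_000) * USD_PER_MTOK.output
  );
}

const perAgent: Record<string, Usage> = {
  official: { inputTokens: 25_000, outputTokens: 3_000 },
  social: { inputTokens: 8_000, outputTokens: 2_000 },
  coordinator: { inputTokens: 10_000, outputTokens: 1_500 },
};

const totalUsd = Object.values(perAgent).reduce((sum, u) => sum + costUsd(u), 0);
// official ≈ 0.12, social ≈ 0.05, coordinator ≈ 0.05, total ≈ 0.23
```

The computed total sits at the low end of the quoted range, which leaves headroom for runs with extra iterations.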

Non-LLM costs

Service	Plan	Cost
Cloudflare Pages + Functions	Free tier	$0 (well under 100k requests/day)
SEC EDGAR	Free	$0
Yahoo Finance	Undocumented (rate-limited)	$0
Stocktwits public API	Free	$0
Reddit OAuth	Free (rate-limited)	$0
NewsAPI	Free tier (100 req/day)	$0 — caps at 100 briefs/day for the social section

Cost levers (production)

Coordinator on Haiku 4.5. ~5× cheaper for the synthesis step; that saves about $0.04 of the coordinator's ~$0.05, so the total drops to roughly $0.16–$0.26/brief.
KV cache 15-min TTL. Repeat queries on the same ticker within 15 min reuse cached Reddit/Stocktwits/News; the official tools' filing fetches are also cacheable for longer (filings don't update intraday).
Cloudflare AI Gateway. Caches identical Anthropic requests; useful when multiple advisors hit the same hot ticker.
Per-advisor budget cap. Enforced at the API gateway (currently absent — see Security gaps).
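The 15-minute TTL lever can be illustrated with an in-memory cache and an injectable clock. This sketch only shows the expiry logic: `TtlCache` is hypothetical, and a production version would presumably sit on Cloudflare KV rather than a Map.

```typescript
// Hypothetical 15-minute TTL cache with an injectable clock. In production the
// text suggests Cloudflare KV; this in-memory Map only illustrates the TTL logic.
const TTL_MS = 15 * 60 * 1000;

class TtlCache<T> {
  private entries = new Map<string, { value: T; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: force a fresh upstream fetch
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, ttlMs = TTL_MS): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }
}
```

A repeat query on the same ticker inside the window would hit `get` and skip the Reddit, Stocktwits, and news calls entirely.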