Lumen AI
Health Coach:
An AI‑first rebuild.
If we were starting Lumen today — with the breath data we have, the user base we've built, and the AI tools available in 2026 — we probably wouldn't build a hardware company with an app attached. We'd build a metabolic intelligence product that happens to use a device, with community and gamification woven in to keep people coming back. Here's a sketch of what that could look like.
Three layers, one continuous loop.
Today, the device, the app, and the data each live in their own mental model. The proposal is to bring them into one stack, with each layer having a clear job: data flows down, insights flow back up.
Metabolic Intelligence Core
A continuously learning model of this user — identity, experiments, commitments, metabolic assessment, learned habits, and conversation state.
- Persistent user metabolic model — features grow with every breath, meal, sleep night, workout.
- Recommendation engine: impact × completion-probability × experiment relevance × state fit × time-of-day × intensity budget.
- Loop state machine — manages all five engagement loops simultaneously.
- Six physiological mediators (cortisol, glycogen, circadian, muscle mass, CoQ10, insulin sensitivity) as the explanatory language.
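The six-signal scoring rule above can be sketched as a multiplicative product, which is what the `×` notation implies: a candidate that scores near zero on any single signal is effectively vetoed. This is an illustrative sketch; the class and field names are ours, not the production schema.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    """One possible Next Best Action, with the six scoring signals in [0, 1].
    Field names are illustrative, not the production schema."""
    impact: float                  # expected metabolic benefit
    completion_probability: float  # learned from this user's history
    experiment_relevance: float    # fit with the active experiment
    state_fit: float               # match to current body state
    time_of_day_fit: float
    intensity_budget_fit: float    # respects the remaining daily effort budget

def score(c: Candidate) -> float:
    """Multiplicative scoring: unlike a weighted sum, one near-zero signal
    sinks the whole candidate -- the point of using a product."""
    return (c.impact
            * c.completion_probability
            * c.experiment_relevance
            * c.state_fit
            * c.time_of_day_fit
            * c.intensity_budget_fit)

def next_best_action(candidates: list[Candidate]) -> Candidate:
    return max(candidates, key=score)
```

Under this rule, a modest walk the user will almost certainly do can outrank a high-impact HIIT session they are unlikely to complete today.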
Conversational AI Coach
An LLM agent grounded in the user's data and the Lumen knowledge base. It works proactively, not just reactively — initiating breaths, following up on commitments, and acknowledging completion.
- Six communication mechanics: commitment request, loop closure, reward, prediction, expectation-setting, balance approach.
- Multi-modal: text, breath triggers, meal photo analysis (Tier 2).
- Always explains why through the relevant mediator before revealing the result.
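The "why before result" rule is an ordering constraint on the coach's output, which a template makes concrete. A minimal sketch, assuming a simple string template (the function name and wording are ours, not the production prompt):

```python
def coach_message(mediator: str, explanation: str, result: str) -> str:
    """Order matters: the 'why', framed through a physiological mediator,
    comes before the measurement, so the number arrives with context."""
    return (f"Because your {mediator} {explanation}, "
            f"here's what your breath shows: {result}")
```

For example, a morning reading might be framed as "Because your glycogen is likely depleted after yesterday's long run, here's what your breath shows: …", rather than leading with the raw score.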
Visual Body Interface
Card-based, real-time view of body state. Engagement is tied to action — the rule is no commit, no card. The day's cards accumulate, weekly summaries roll up, and the Journal keeps everything for drill-down.
- Three tabs: Coach (conversation) · Visual Body (execution) · Journal (persistent record).
- Drill-down hierarchy: weekly summary → daily summary → individual card.
- Mobile-first design, shipped as mobile web first for velocity (per the current sprint focus).
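The no-commit / no-card rule is easiest to enforce structurally: a card object only comes into existence through a commit, so there is nothing to filter later. A sketch under that assumption; class and method names are illustrative.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Card:
    action: str
    committed_on: date
    completed: bool = False

@dataclass
class VisualBodyDay:
    """A day's view in the Visual Body tab. Cards exist only for actions
    the user committed to: 'no commit, no card' is enforced at creation
    time, not by hiding uncommitted cards."""
    day: date
    cards: list[Card] = field(default_factory=list)

    def commit(self, action: str) -> Card:
        card = Card(action=action, committed_on=self.day)
        self.cards.append(card)
        return card
    # Deliberately no other way to add a card.
```

The same objects then roll up naturally: daily cards aggregate into the weekly summary, and the Journal keeps the full history for drill-down.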
Three squads, an agent fleet, and a PM who owns the KPI.
Squads map 1:1 to the architecture layers, so engineers and the systems they own ship together. AI agents do the volume work; people do the judgment work. Headcount stays lean by design: 13 builders, 3 enablers, plus an agent fleet.
Metabolic Intelligence
Owns the user model, recommendation engine, mediator definitions, and the breath/lifestyle data pipeline.
AI Coach & Conversation
Owns the LLM agent stack, RAG over the user model, prompt & tool architecture, and the loop state machine that turns conversation into commitments.
Mobile & Visual Body
Owns the three-tab app, the card system, real-time body visualization, and the bridge from the agent's outputs to a touchable, shippable surface.
Enabling roles
Kept lean on purpose. The PM owns the North Star KPI, the data scientist closes the analysis loop, and QA holds the release bar.
What's automated, what isn't, and why.
| Function | AI Agent | Human | Why |
|---|---|---|---|
| Daily recommendation generation | ● Fully automated | Reviews edge cases (~5% of plans) | Volume and scoring are mechanical; people only triage the outliers. |
| Conversation responses | ● LLM with grounding | Quality monitoring & red-team | Real-time conversation can't realistically be human-staffed; quality is sampled instead. |
| Experiment design | ● Generates candidates | Metabolic Scientist approves | The LLM proposes; the scientist signs off on anything that touches clinical claims. |
| Insight generation | ● Pattern detection | Validates clinical claims | Pattern-finding is statistical; framing it safely is a human call. |
| User support | ● Tier 1 deflection | Tier 2+ human | FAQ-shaped issues fit an LLM well. Hardware, billing, and medical cases go to a human. |
| Product decisions | ○ Not delegated | PM + Founders | Product strategy is a judgment call, not an optimization problem. |
| Safety / medical review | ○ Not delegated | Medical Officer | Liability and regulatory exposure should sit with a named human. |
PHER — Positive Health Engagement Rate.
One number, focused on quality over volume. We propose using it as the leading indicator for retention and outcomes, and as the shared metric across squads.
Read as a percentage: of every 100 Next Best Actions the coach issues, how many does the user actually complete? A PHER of 75% means three out of four recommendations landed.
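The definition above reduces to a one-line ratio. A minimal sketch (the function name is ours):

```python
def pher(nbas_issued: int, nbas_completed: int) -> float:
    """Positive Health Engagement Rate: the share of issued Next Best
    Actions the user actually completed, as a percentage."""
    if nbas_issued == 0:
        return 0.0  # no recommendations issued -> nothing to score
    return 100.0 * nbas_completed / nbas_issued
```

Note the denominator is actions *issued*, not app opens: a coach can lift PHER either by getting more actions completed or by issuing fewer, better-targeted ones.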
Why this metric, and not DAU?
DAU (Daily Active Users) measures how often people open the app, which can quietly reward noisy notifications. PHER measures whether the coach is being useful. A coach that issues fewer NBAs (Next Best Actions) and gets each one done is generally healthier than one that floods the user. Every recommendation the LLM emits is judged on whether it landed.
Secondary metrics: actions per day, return-interaction rate, and Flex Score movement (the lagging outcome PHER should drive).
Foundation, then experience, then intelligence.
About five months end-to-end. Internal dogfooding gates the beta — the team should want to use it before anyone else does. Mobile web first, since speed of iteration matters more than native packaging at this stage.
3 weeks
- Persistent user model + ingestion pipeline.
- Loop state machine — MVP scope: Next Best Action + Daily Summary loops.
- LLM Coach with metabolic grounding (RAG over the user model + canonical KB).
- Mobile-web shell; three tabs in skeleton form.
- Internal dogfood gate: team uses it daily.
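The MVP loop state machine can be sketched as an explicit transition table over one Next Best Action loop: recommend → commit → complete → acknowledge, with an expiry path at each open state. State names and the transition set are illustrative assumptions, not the production design.

```python
from enum import Enum, auto

class LoopState(Enum):
    PROPOSED = auto()      # coach issued an NBA
    COMMITTED = auto()     # user said yes
    COMPLETED = auto()     # user reported (or a sensor confirmed) completion
    ACKNOWLEDGED = auto()  # coach closed the loop with a reward message
    EXPIRED = auto()       # window passed without completion

# Legal transitions; anything else raises rather than silently no-ops.
TRANSITIONS = {
    LoopState.PROPOSED: {LoopState.COMMITTED, LoopState.EXPIRED},
    LoopState.COMMITTED: {LoopState.COMPLETED, LoopState.EXPIRED},
    LoopState.COMPLETED: {LoopState.ACKNOWLEDGED},
    LoopState.ACKNOWLEDGED: set(),  # terminal
    LoopState.EXPIRED: set(),       # terminal
}

def advance(state: LoopState, to: LoopState) -> LoopState:
    if to not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state.name} -> {to.name}")
    return to
```

Making illegal transitions raise (instead of being ignored) keeps the coach's loop-closure and reward messages honest: an acknowledgment can only ever follow a real completion.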
1 month
- All five engagement loops live.
- Visual Body tab shipped — full card system with no-commit / no-card rule.
- Intraday adjustment engine ("balance your day", easier-alternative replacement).
- PHER dashboards instrumented; cohort scaffolding stood up.
3 months
- Full six-signal recommendation scoring.
- Experiment engine — automated hypothesis → test → personalized insight.
- Journal + drill-down hierarchy (week → day → card).
- Cohort benchmarking surfaces in-product.
- Lumen Pro integrations & clinical dashboards lit up.
The advantages compound, the flywheel turns, and the data itself becomes a meaningful switching cost.
Proprietary breath data and longitudinal metabolic profiles.
76M+ measurements would be hard for anyone else to reproduce in a single venture cycle. Personal Flex Score histories get richer with every breath, and an LLM-first company with a generic wearable would have a hard time recreating this from scratch.
The user's switching cost is their own data.
The metabolic model gets personalized over months, not days. After 90 days, a user's coach should know them better than a fresh system would — even on the same hardware. That's a reasonable source of stickiness.
Designed for agents from day one, not bolted on after.
The user model, loop state machine, and card system are designed to give the agent a clean substrate. The intent is to build the app around the agent, rather than retrofit an LLM into an existing app.
The points loop can double as a revenue flywheel.
The coach issues NBAs, the user commits and completes, points accrue, and redemptions either deepen the product (advanced features → ARPU) or run through partners (gift cards → affiliate revenue). Gamification can pull double duty as engagement and monetization.
We expect this to lower churn and lift engagement — that's the whole point of the rebuild.
Today, most Lumen drop-off happens in the first 30–60 days, before users have built enough metabolic insight to feel the value. The AI coach, the points loop, and the community layer are designed to fix that window directly. Internal target: ≥ 20% reduction in 60-day churn versus the current cohort once the new experience reaches GA, with a corresponding lift in weekly return-interaction rate.