Live Greenhouse Data and Scorecards
Verdify should be judged by what the greenhouse actually did. This page is the public audit trail: live operations, planning quality, costs, the Baseline vs Iris (our OpenClaw AI agent) launch comparison, and the generated archives behind them.
Start with Baseline vs Iris if you want the shortest evidence path.
Evidence Summary
The strongest launch receipt: April 22-25 had zero normal Iris plans; April 26-May 2 restored planning and shows the operational contrast with caveats intact.
- Planning Quality: inspect whether plans moved the greenhouse toward target bands instead of only producing prose.
- Operations: answer what the greenhouse is doing now and whether intervention is needed.
- Resource Use: check the resource bill behind any climate-improvement claim.
Primary Evidence First
For launch traffic, start with Baseline vs Iris, then check the Planning Archive, Generated Lessons, and CSV samples. Grafana panels below are a richer browser view, but the claims should remain checkable even if a crawler, text fetcher, or locked-down browser does not load the dashboard app.
Operations Now
The first question is whether the system is healthy right now: current climate, controller state, connectivity, open alerts, and whether equipment behavior matches the weather pressure. The seven-day controller timeline shows the ESP32 surface mode plus the temperature and VPD/moisture axes so the multi-axis state machine is visible instead of collapsed into one label.
Open the full live view: Operations. Plan details live in the Planning Archive.
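The VPD axis on the controller timeline can be reproduced from temperature and relative humidity alone. A minimal sketch using the standard Tetens approximation for saturation vapor pressure (the function name and example values are illustrative, not Verdify's implementation):

```python
import math

def vpd_kpa(temp_c: float, rh_pct: float) -> float:
    """Vapor pressure deficit in kPa from air temperature (degC) and relative humidity (%).

    Uses the Tetens approximation for saturation vapor pressure.
    """
    svp = 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))  # saturation vapor pressure, kPa
    return svp * (1.0 - rh_pct / 100.0)

# Warm, dry air produces high VPD (the dry-air stress the outage window shows)
print(round(vpd_kpa(28.0, 40.0), 2))  # -> 2.27
```

High VPD means the air is pulling water from the plants faster than the crop band allows, which is why the scorecards track it alongside temperature.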
Planning Quality
The second question is whether the AI plan was useful. The scorecard centers on band compliance and cost efficiency, with stress hours, water, forecast error, and plan outcomes used to explain why the score moved. Forecast-vs-plan-vs-actual panels show whether the plan matched the world the plants experienced.
To inspect the exact knobs behind those plans, see AI-Writable Tunables.
Open the full decision audit: Planning Quality.
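Band compliance and stress-axis hours can be checked independently from hourly samples. A sketch of one plausible scoring rule, assuming hypothetical band limits (Verdify's real targets live in AI-Writable Tunables) and counting each out-of-band axis as one stress-axis hour:

```python
from dataclasses import dataclass

@dataclass
class HourlySample:
    temp_c: float
    vpd_kpa: float

# Hypothetical crop bands for illustration only.
TEMP_BAND = (18.0, 27.0)  # degC
VPD_BAND = (0.6, 1.4)     # kPa

def score_day(samples):
    """Return (both-axis compliance %, stress-axis hours) for one day of hourly samples."""
    in_band = 0
    stress_axis_hours = 0
    for s in samples:
        temp_ok = TEMP_BAND[0] <= s.temp_c <= TEMP_BAND[1]
        vpd_ok = VPD_BAND[0] <= s.vpd_kpa <= VPD_BAND[1]
        if temp_ok and vpd_ok:
            in_band += 1
        # Each axis outside its band contributes one stress-axis hour.
        stress_axis_hours += (not temp_ok) + (not vpd_ok)
    return 100.0 * in_band / len(samples), stress_axis_hours

day = [HourlySample(22.0, 1.0)] * 12 + [HourlySample(30.0, 2.0)] * 12
print(score_day(day))  # -> (50.0, 24)
```

Under this rule a day can exceed 24 stress-axis hours, since both axes can be out of band in the same hour.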
Baseline vs Iris
The strongest launch comparison is not a lab-perfect A/B test. It is the visible operational contrast between a real planner-offline window and the following Iris-online window. Treat it as observational evidence: same greenhouse, same controller, real weather, visible stress/compliance outcomes, and all caveats intact.
- Baseline window (April 22-25): 0 normal Iris plans/day, 20.1% both-axis compliance, 29.8 stress-axis hours/day.
- Iris window (April 26-May 2): 2.9 Iris plans/day, 54.2% both-axis compliance, 12.6 stress-axis hours/day.
Open the baseline comparison: Baseline vs Iris.
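The per-window numbers above are plain daily averages, so they can be rechecked from the plan-outcomes export. A sketch assuming hypothetical column names (`date`, `normal_plans`, `both_axis_pct`, `stress_axis_hours`; see the dataset notes for the real schema):

```python
from statistics import mean

def window_metrics(rows, start, end):
    """Average daily metrics over an inclusive ISO-date window."""
    days = [r for r in rows if start <= r["date"] <= end]
    return {
        "plans_per_day": mean(float(r["normal_plans"]) for r in days),
        "both_axis_pct": mean(float(r["both_axis_pct"]) for r in days),
        "stress_hours_per_day": mean(float(r["stress_axis_hours"]) for r in days),
    }

# Synthetic rows for illustration; real values come from the 30-day plan outcomes CSV.
rows = [
    {"date": "2026-04-22", "normal_plans": "0", "both_axis_pct": "20.0", "stress_axis_hours": "30.0"},
    {"date": "2026-04-26", "normal_plans": "3", "both_axis_pct": "50.0", "stress_axis_hours": "12.0"},
    {"date": "2026-04-27", "normal_plans": "2", "both_axis_pct": "58.0", "stress_axis_hours": "14.0"},
]
print(window_metrics(rows, "2026-04-26", "2026-05-02"))
```

Running the same function over both windows reproduces the baseline-versus-Iris contrast without trusting any dashboard rendering.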
Resource Use
The system spends electricity, gas, and water to reduce plant stress. Cost evidence matters because an optimizer that ignores resource use is not operating the greenhouse well.
Open the cost view: Resource Use.
Outages and Gaps
Verdify publishes the bad days too. The table below is from plan_journal and daily_summary: from April 22 through April 25, 2026, the planner archive shows zero normal Iris plans while the greenhouse still faced heavy dry-air stress.
- The controller continued running, but the planning loop did not publish a normal daily plan.
- VPD stayed outside the crop band for much of the day.
This is the failure mode the launch site should make legible, not hide.
The archive gap is evidence: software outages are visible in the plant-stress scorecard.
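The outage criterion itself is simple enough to state as code. A sketch, assuming the same hypothetical per-day columns as the plan-outcomes export (not the actual plan_journal/daily_summary schema): a day is flagged when zero normal plans were published while stress stayed high.

```python
def outage_days(daily_rows, stress_threshold=24.0):
    """Dates with no normal plan published while stress-axis hours stayed high."""
    return [
        r["date"]
        for r in daily_rows
        if int(r["normal_plans"]) == 0 and float(r["stress_axis_hours"]) >= stress_threshold
    ]

rows = [
    {"date": "2026-04-22", "normal_plans": "0", "stress_axis_hours": "29.8"},
    {"date": "2026-04-26", "normal_plans": "3", "stress_axis_hours": "12.6"},
]
print(outage_days(rows))  # -> ['2026-04-22']
```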
The lesson is not that an AI planner magically beats greenhouse physics. The useful claim is narrower and testable: the ESP32 keeps deterministic real-time control, the planner improves tactics when it is online, and the public data makes both successes and gaps inspectable.
The planner’s allowed tactical surface is documented in AI-Writable Tunables.
Archives
- Daily plans are the generated lab notebook: conditions, decisions, experiments, and outcomes.
- Baseline vs Iris compares the April 22-25 planner-offline window to the following online window.
- Generated lessons are the validated operational findings the planner reads before future plans.
- 7-day climate CSV, 30-day plan outcomes CSV, and dataset notes provide a launch-safe sample: enough to inspect claims, not a raw dump of every private operational feed.
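The CSV samples are meant to be loadable with nothing but the standard library. A minimal parsing sketch, assuming illustrative column names (`ts`, `temp_c`, `rh_pct`, `vpd_kpa`; the dataset notes document the real ones):

```python
import csv
import io

def load_climate_csv(text: str):
    """Parse a climate sample CSV into typed rows."""
    return [
        {
            "ts": r["ts"],
            "temp_c": float(r["temp_c"]),
            "rh_pct": float(r["rh_pct"]),
            "vpd_kpa": float(r["vpd_kpa"]),
        }
        for r in csv.DictReader(io.StringIO(text))
    ]

sample = "ts,temp_c,rh_pct,vpd_kpa\n2026-04-22T12:00,28.0,40.0,2.27\n"
print(load_climate_csv(sample))
```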
If Verdify makes a claim, this section should make it checkable.