Build the Dining App Rebecca Yu Built: A Micro-App Tutorial with LLMs and Lightweight Stacks
Rebuild Rebecca Yu's dining micro‑app with a static frontend, serverless API, and LLMs—fast, safe, and privacy‑aware.
Stop getting stuck in group chat paralysis — rebuild Rebecca Yu's dining micro-app in a weekend
Decision fatigue about dinner is a surprisingly common pain point. For developers and IT leads building tools for non‑dev stakeholders, the challenge is to enable fast, safe prototyping without opening a security or maintenance nightmare. In this tutorial you'll recreate the dining recommendation micro-app Rebecca Yu built using a minimal, modern stack: a static frontend, a serverless API, and LLM calls. The goal: a repeatable pattern that lets non‑developers iterate quickly while you keep control of auth, cost, and data privacy.
The elevator summary (most important first)
- Architecture: Static frontend (Vite + React or Svelte) + serverless API (Edge function) + LLM calls (OpenAI/Anthropic) and optional lightweight RAG.
- Safety & privacy: All LLM calls run server‑side. Redact PII, enforce rate limits, and enable data deletion to keep prototypes safe for non‑dev users.
- Cost & performance: Use a small candidate set + structured responses to cut tokens. Cache results for repeated queries.
- Outcome: A 1–2 day micro‑app that non‑devs can use to prototype dining decisions without requiring a full engineering cycle.
Why build micro-apps like Rebecca's in 2026?
Late‑2025 and early‑2026 trends make this exact approach practical: instruction‑tuned LLMs are cheaper and faster, many providers expose structured function calling and reliable embeddings, and serverless edge runtimes (Vercel Edge, Cloudflare Workers) can host lightweight APIs with near zero cold starts. At the same time, organizations want prototypes that respect privacy and can be shut down or audited easily — exactly what a micro‑app architecture delivers.
“Once vibe‑coding apps emerged... I decided it was the perfect time to finally build my application.” — Rebecca Yu
High‑level design: keep it minimal and controllable
Design the micro‑app with three separable layers:
- Frontend: Static site that collects preferences (group size, cuisine, price range, vibe) and shows suggestions.
- Serverless API: One endpoint that validates input, enforces auth/rate limits, applies privacy rules, and calls the LLM.
- Data & LLM layer: Small candidate dataset (JSON or vector DB), optional embeddings for local RAG, and calls to an LLM provider for ranking/formatting.
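Before writing any code, it helps to pin down the contract between the frontend and the API. The sketch below is one way to type it; the field names (cuisine, vibe, price, name, reason) mirror the examples used later in this tutorial and are easy to rename.
// types.ts: a sketch of the shared contract between the static frontend and /api/recommend
export type Preferences = {
  cuisine: string;                 // e.g. "Mexican"
  vibe: string;                    // e.g. "casual"
  price: 'low' | 'mid' | 'high';
  groupSize?: number;              // optional; keep names out of the payload
};

export type Suggestion = {
  name: string;                    // restaurant from the candidate set
  reason: string;                  // short explanation, under ~20 words
};

// The endpoint accepts Preferences and returns exactly three Suggestions.
export type RecommendResponse = Suggestion[];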
Step 0 — Plan features (what to ship first)
For an MVP you only need:
- Group input form (names optional) and preferences
- Server endpoint that returns 3 ranked restaurant suggestions with short reasons
- Simple auth for sharing with friends (passwordless email or invite codes)
Step 1 — Set up the repo and static frontend
Use Vite for speed. Choose React or Svelte — both produce tiny static bundles. For this example we'll assume React with Vite.
# 1. Create project
npm create vite@latest where2eat -- --template react-ts
cd where2eat
npm install
Frontend responsibilities:
- Render a compact form
- Call /api/recommend with JSON
- Show results and allow feedback (thumbs up/down)
// src/App.tsx (simplified)
import React, {useState} from 'react';

type Suggestion = {name: string; reason: string};

export default function App(){
  const [prefs, setPrefs] = useState({cuisine: '', price: 'mid', vibe: ''});
  const [results, setResults] = useState<Suggestion[]>([]);

  async function submit(){
    const res = await fetch('/api/recommend', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify(prefs)
    });
    setResults(await res.json());
  }

  return (
    <main>
      <h1>Where2Eat — quick prototype</h1>
      <input placeholder="Cuisine" value={prefs.cuisine}
             onChange={e => setPrefs({...prefs, cuisine: e.target.value})} />
      <input placeholder="Vibe" value={prefs.vibe}
             onChange={e => setPrefs({...prefs, vibe: e.target.value})} />
      <button onClick={submit}>Get suggestions</button>
      <ul>
        {results.map(r => <li key={r.name}>{r.name} — {r.reason}</li>)}
      </ul>
    </main>
  );
}
Step 2 — Serverless API: validate, secure, and call the LLM
The single API endpoint is the control plane. Keep all LLM calls here so you can:
- Enforce authentication
- Redact or refuse PII
- Switch models without touching the frontend
Below is a minimal Vercel function (Node/TypeScript). It calls an LLM provider's chat completions API and returns structured suggestions.
// api/recommend.ts (Vercel / Netlify compatible)
import type {VercelRequest, VercelResponse} from '@vercel/node';
const MODEL = process.env.LLM_MODEL || 'gpt-4o-mini';
const LLM_API_KEY = process.env.LLM_API_KEY;
export default async function handler(req: VercelRequest, res: VercelResponse){
if(req.method !== 'POST') return res.status(405).send('Only POST');
// Basic validation
const body = req.body || {};
const {cuisine='', vibe='', price=''} = body;
// Rate limit & auth would be inserted here
// Redact PII if present (very small example)
const sanitized = JSON.stringify({cuisine, vibe, price}).replace(/\b\d{3}-\d{2}-\d{4}\b/g, '[REDACTED]');
// Build system + user prompt
const system = `You are a concise assistant that recommends 3 restaurants from a short list. Return JSON array of {name, reason}. Keep reasons under 20 words.`;
const user = `Prefs: ${sanitized}. Candidate list: use internal dataset.`;
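// NOTE: "use internal dataset" is a placeholder here; Step 3 replaces it with a filtered candidate list sent in the prompt.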
const resp = await fetch('https://api.openai.com/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type':'application/json',
'Authorization': `Bearer ${LLM_API_KEY}`
},
body: JSON.stringify({
model: MODEL,
messages: [ {role:'system', content: system}, {role:'user', content: user} ],
max_tokens: 300,
temperature: 0.2
})
});
const data = await resp.json();
// Prefer function calls or strict parsing in production
try{
const content = data.choices?.[0]?.message?.content || '[]';
const suggestions = JSON.parse(content);
return res.status(200).json(suggestions);
}catch(e){
// Fallback: simple heuristic ranking
return res.status(200).json([{name: 'Fallback Diner', reason: 'Popular and nearby'}]);
}
}
Notes on provider choice
OpenAI and Anthropic both offer strong chat models in 2026; pick the model that matches your goals:
- ChatGPT-family: Reliable, broad usage, good tool/function calling support.
- Claude-family: Helpful for safety-sensitive prompts and often better instruction-following for some use cases.
- On‑prem / local LLMs: For strict privacy requirements, run smaller models locally or on private inference (emerging in 2025–2026).
Step 3 — Keep tokens and costs low: candidate sets + structured outputs
Don’t feed an entire city database to the LLM each call. Use a small candidate set. Two approaches work well:
- Static candidate JSON: Ship a JSON file with 50–200 local restaurants. Server code filters the list by simple heuristics (cuisine, price) and sends 6–12 candidates to the LLM for ranking.
- Lightweight RAG: Precompute embeddings for candidates with your provider and store in pgvector, Supabase, Pinecone, or a local in‑memory index. Use cosine similarity to fetch top candidates and then ask the LLM to rank and explain.
// Example: filter then call LLM
const candidates = loadCandidates();
const filtered = candidates.filter(c=>c.cuisine.includes(cuisine) || c.vibe.includes(vibe)).slice(0,10);
const userMsg = `Rank these candidates: ${JSON.stringify(filtered)} for prefs: ${sanitized}. Return JSON array of 3 items.`;
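For the lightweight-RAG option, a tiny in-memory index is enough at this scale. The sketch below assumes an OpenAI-style embeddings endpoint and the text-embedding-3-small model; the helper names (embed, topCandidates) are this example's own. It embeds everything per call for simplicity; in practice, precompute and cache the candidate vectors, and reach for pgvector, Supabase, or Pinecone only when the list outgrows memory.
// rag.ts: minimal in-memory retrieval sketch for the lightweight RAG option
type Candidate = {name: string; cuisine: string; vibe: string; notes?: string};

async function embed(texts: string[]): Promise<number[][]> {
  const resp = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {'Content-Type': 'application/json', 'Authorization': `Bearer ${process.env.LLM_API_KEY}`},
    body: JSON.stringify({model: 'text-embedding-3-small', input: texts})
  });
  const data = await resp.json();
  return data.data.map((d: {embedding: number[]}) => d.embedding);
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Embed the preference string and all candidates, then keep the k closest matches.
export async function topCandidates(prefsText: string, candidates: Candidate[], k = 8) {
  const [prefVec, ...candVecs] = await embed([prefsText, ...candidates.map(c => `${c.name} ${c.cuisine} ${c.vibe} ${c.notes ?? ''}`)]);
  return candidates
    .map((c, i) => ({candidate: c, score: cosine(prefVec, candVecs[i])}))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(x => x.candidate);
}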
Step 4 — Auth and safe sharing for non‑devs
Non‑dev users need a simple path to access the app. Use passwordless email invites (Magic, Clerk, or Supabase Auth) for frictionless login, and keep the LLM API key entirely on the server. For small groups, an invite‑code system can be enough: store the codes server‑side, readable only by your functions. Whichever route you pick, add:
- Short‑lived session tokens (JWTs) signed by serverless functions
- Rate limits per user to avoid runaway costs
- Invite code rotation and audit logs for traceability
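If you go the invite-code route, one extra endpoint covers it. The sketch below assumes the jsonwebtoken package and env vars named INVITE_CODES and JWT_SECRET (names chosen for this example); a hosted provider like Clerk or Supabase Auth replaces all of it.
// api/auth.ts (sketch): exchange an invite code for a short-lived session token (JWT)
import type {VercelRequest, VercelResponse} from '@vercel/node';
import jwt from 'jsonwebtoken';

const VALID_CODES = new Set((process.env.INVITE_CODES || '').split(','));
const JWT_SECRET = process.env.JWT_SECRET || '';

export default function handler(req: VercelRequest, res: VercelResponse){
  if(req.method !== 'POST') return res.status(405).send('Only POST');
  const {code = ''} = req.body || {};
  if(!VALID_CODES.has(code)) return res.status(401).json({error: 'Invalid invite code'});
  // /api/recommend verifies this token with jwt.verify before touching the LLM
  const token = jwt.sign({invited: true}, JWT_SECRET, {expiresIn: '12h'});
  return res.status(200).json({token});
}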
Step 5 — Privacy and data handling best practices
Prototype apps often collect sensitive input. Follow these rules:
- Server‑side calls only: Never expose the LLM key in client JS.
- Redact PII: Use regex and a small PII detection pass before sending prompts.
- Minimize retention: Don’t log prompts by default; if you log for debugging, encrypt and set a short TTL.
- Opt‑out & deletion: Provide a simple process to delete group data and revoke access.
For teams, add an audit log that records only metadata (user id, timestamp, model used) and stores prompts in an encrypted bucket accessible only to admins.
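A small redaction helper can sit in front of the prompt builder. The sketch below extends the single SSN regex from Step 2 with email and phone patterns; regexes alone will miss plenty, so treat this as a first pass, not a guarantee.
// redact.ts: minimal PII redaction pass applied before prompts are built or logged
const PII_PATTERNS: [RegExp, string][] = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, '[REDACTED_SSN]'],
  [/\b[\w.+-]+@[\w-]+\.[\w.]+\b/g, '[REDACTED_EMAIL]'],
  [/\b(?:\+?\d{1,2}[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b/g, '[REDACTED_PHONE]'],
];

export function redactPII(text: string): string {
  return PII_PATTERNS.reduce((out, [pattern, label]) => out.replace(pattern, label), text);
}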
Step 6 — Quality controls: deterministic outputs and fallbacks
Non‑dev users expect repeatability. Use these mechanisms:
- Low temperature (0–0.3) for deterministic ranking.
- Structured response enforcement (JSON + schema validation). Use JSON Schema to validate returned suggestions.
- Fallback heuristic ranking to handle LLM parsing errors.
// JSON schema validation with Ajv: expect an array of exactly 3 {name, reason} objects
import Ajv from 'ajv';
const schema = {type: 'array', minItems: 3, maxItems: 3, items: {type: 'object', required: ['name', 'reason'], properties: {name: {type: 'string'}, reason: {type: 'string'}}}};
const validate = new Ajv().compile(schema);
if(!validate(parsed)) return fallback();
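The fallback() call above can be a plain keyword-overlap ranker over the same candidate set, so users still get three suggestions when parsing fails. A sketch (the name fallbackRank and the scoring weights are arbitrary choices):
// fallback.ts: heuristic ranking used when the model output fails schema validation
type Candidate = {name: string; cuisine: string; vibe: string};

export function fallbackRank(candidates: Candidate[], prefs: {cuisine: string; vibe: string}){
  return candidates
    .map(c => ({
      name: c.name,
      reason: 'Matches your preferences',
      score: (c.cuisine.toLowerCase().includes(prefs.cuisine.toLowerCase()) ? 2 : 0) +
             (c.vibe.toLowerCase().includes(prefs.vibe.toLowerCase()) ? 1 : 0)
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, 3)
    .map(({name, reason}) => ({name, reason}));
}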
Step 7 — Deploy, test, and iterate
Use Vercel / Netlify for the easiest flow: push to GitHub and enable preview deployments. For edge latency and pay‑as‑you‑go pricing, Cloudflare Workers or Vercel Edge are strong choices.
- Enable environment variables for LLM API keys and model selection.
- Ship a CI step that runs basic schema tests on the API response using mocked LLM responses.
- Monitor cost and latency in the first 48 hours. Add caching for repeated queries.
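The CI step can run entirely against mocked model output. Below is a sketch using Vitest and the Ajv schema from Step 6; the mocked restaurant names are illustrative fixtures, and no real LLM call happens in CI.
// tests/recommend.test.ts: validate a mocked LLM response against the output schema
import {describe, it, expect} from 'vitest';
import Ajv from 'ajv';

const schema = {type: 'array', minItems: 3, maxItems: 3, items: {type: 'object', required: ['name', 'reason'], properties: {name: {type: 'string'}, reason: {type: 'string'}}}};

describe('recommend response shape', () => {
  it('accepts a well-formed mocked response', () => {
    const mocked = [
      {name: 'TacoHouse', reason: 'Casual Mexican with outdoor seating'},
      {name: 'Example Bistro', reason: 'Mid-priced, good for groups'},
      {name: 'Example Diner', reason: 'Relaxed vibe, close by'}
    ];
    expect(new Ajv().compile(schema)(mocked)).toBe(true);
  });
});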
Advanced strategies (when the prototype graduates)
If Where2Eat becomes a repeatable internal tool, consider:
- Function calling: Use model function call features to get strongly typed outputs from the model (reduces parsing errors).
- Hybrid ranker: Use embeddings + LLM re‑rank to improve recall and factual accuracy.
- Local inference: For sensitive deployments, move inference to private inference endpoints or local GPU hosts using LLM orchestration tooling that matured through 2025–2026, favoring privacy-first, edge-forward architectures.
- Observability: Log model usage, token costs, and suggestion acceptance rate to measure utility.
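If you adopt function calling, the model fills a typed schema instead of emitting free text you hope is JSON. A sketch against an OpenAI-style chat completions tools parameter follows; the tool name return_suggestions and its schema are choices made for this example.
// Function-calling sketch: define a tool schema and force the model to use it
const tools = [{
  type: 'function',
  function: {
    name: 'return_suggestions',
    description: 'Return exactly 3 ranked restaurant suggestions',
    parameters: {
      type: 'object',
      required: ['suggestions'],
      properties: {
        suggestions: {
          type: 'array', minItems: 3, maxItems: 3,
          items: {type: 'object', required: ['name', 'reason'],
                  properties: {name: {type: 'string'}, reason: {type: 'string'}}}
        }
      }
    }
  }
}];
// Add to the request body: tools and tool_choice: {type: 'function', function: {name: 'return_suggestions'}}
// Then parse data.choices[0].message.tool_calls[0].function.arguments (a JSON string) exactly once.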
Real‑world checklist before handing the app to non‑dev users
- All LLM calls are server‑side and keys are not exposed
- Invite and auth flow is set up and tested
- PII redaction runs on inputs and logs
- Rate limits and per‑user caps are in place
- JSON schema validation for model outputs
- Fallback heuristics are implemented for parse failures
Cost primer and token-saving tactics
Token spend is the biggest ongoing cost for LLM micro‑apps. Reduce spend with:
- Short prompts & candidate sets: Send only 6–12 candidates.
- Structured outputs: Ask for JSON so you parse once and cache.
- Use smaller models for ranking: Route heavy outputs to larger models only when needed.
- Cache & TTL: Cache identical requests for N minutes depending on freshness requirements.
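Caching can be as small as a map keyed by the sanitized request payload. The sketch below (the helper name cached is this example's own) works per warm serverless instance; for anything shared across instances, swap in Redis, Upstash, or your platform's cache.
// cache.ts: minimal in-memory TTL cache for repeated recommendation requests
type Entry = {value: unknown; expires: number};
const cache = new Map<string, Entry>();

export function cached<T>(key: string, ttlMs: number, compute: () => Promise<T>): Promise<T> {
  const hit = cache.get(key);
  if(hit && hit.expires > Date.now()) return Promise.resolve(hit.value as T);
  return compute().then(value => {
    cache.set(key, {value, expires: Date.now() + ttlMs});
    return value;
  });
}

// Usage in the handler: const suggestions = await cached(sanitized, 10 * 60 * 1000, callLLM);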
Example: Prompt template that balances structure and creativity
System: You are a concise assistant that ranks given restaurant candidates for a small friend group.
User: Prefs: {"cuisine":"Mexican","vibe":"casual","price":"mid"}. Candidates: [{"name":"TacoHouse","cuisine":"Mexican","notes":"outdoor"}, ...].
Instruction: Return JSON array of exactly 3 objects: {"name":string, "reason":string (max 20 words)}.
What Rebecca's approach teaches us about non‑dev prototyping
Rebecca Yu's 7‑day build demonstrates the power of constraints: a focused feature set, rapid iteration with LLMs, and shipping to actual users. For teams enabling non‑devs, the lesson is to provide a safe scaffold: a minimal frontend, server‑side model control, and guardrails for privacy and cost. That lets non‑technical creators prototype without turning your infra into an unmanageable product.
Future predictions for 2026 and beyond
Expect these trends through 2026:
- Edge LLM orchestration: Seamless routing between local and cloud models to balance privacy and cost, building on edge-first, privacy-conscious orchestration work.
- More robust function calling: Models will produce verified structured outputs reducing parsing overhead.
- Privacy-first toolchains: Built-in redaction, retention controls, and private inference offerings for teams.
- Micro‑app marketplaces: Companies will standardize how micro‑apps are hosted and audited internally.
Wrap up — actionable takeaways
- Ship an MVP in a weekend: Static frontend + one serverless endpoint + 50–200 candidate JSON list.
- Keep control: All model calls go server‑side to enforce auth, redaction, and rate limits.
- Optimize costs: Filter and shortlist candidates before calling the LLM and use structured JSON outputs.
- Plan for graduation: Add embeddings, function calls, and private inference when the app grows beyond a micro prototype.
Next steps (quick checklist to run locally)
- Create the Vite React project and add the example App.tsx.
- Implement the serverless /api/recommend endpoint and wire your LLM key as an env var.
- Seed a small candidates.json and implement the filtering logic.
- Test with a low‑temperature model and validate JSON output with a schema check.
- Deploy to Vercel/Netlify and invite one friend to test the flow.
Call to action
Rebuild Where2Eat this weekend and use it as a template to let non‑dev teammates prototype safely. Start with the scaffold above, then iterate on auth, privacy, and ranking. If you want a starter repo with deploy scripts, schema validations, and an optional Supabase/pgvector RAG layer, grab our open‑source scaffold and deploy in under 20 minutes.
Ready to ship your first micro‑app? Download the starter repo, deploy to Vercel, and share the invite code with your team. Keep prototypes small, safe, and auditable — and let LLMs handle the creative part.