Vocabulary and the knowledge base

Vocabulary is what Nudge knows about your app: which elements exist, what they are called, and what they do. It is the difference between a walkthrough that generates itself and one that comes back empty.

When a user asks for something you have not pre-authored, the backend tries to generate a walkthrough. It does not invent selectors. It only uses elements it already knows about, which is the safe choice: a made-up selector points at nothing. That known set is your vocabulary, stored per tenant.

Empty vocabulary means empty generation

If you ask "where is the résumé" on a site Nudge has never seen, matching finds no pre-authored flow, generation has no elements to build from, and you get "no walkthrough matches". That is the system refusing to fabricate, not a bug. Populate the vocabulary and the same question generates a real walkthrough.
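The refusal behaviour can be pictured with a minimal sketch. This is not Nudge's actual generator: the function names and the word-overlap matching are assumptions made for illustration, and only the element fields (canonicalSelector, text, ariaLabel) come from the upsert payload shown on this page. The point is that every step is built from a known element, so an empty vocabulary can only produce an empty result.

```python
import re
import unicodedata
from typing import Optional


def words(text: str) -> set[str]:
    """Lowercased word set; NFC normalisation keeps "Résumé" as one word."""
    return set(re.findall(r"\w+", unicodedata.normalize("NFC", text).lower()))


def generate_walkthrough(goal: str, vocabulary: list[dict]) -> Optional[list[dict]]:
    """Hypothetical generator: builds steps only from known elements."""
    goal_words = words(goal)
    steps = [
        {"elementId": el["elementId"], "selector": el["canonicalSelector"]}
        for el in vocabulary
        # Assumed matching rule: any shared word between goal and label.
        if goal_words & words(f'{el.get("text", "")} {el.get("ariaLabel", "")}')
    ]
    # No known element matched: return nothing rather than invent a selector.
    return steps or None


vocabulary = [
    {"elementId": "resume-link", "canonicalSelector": 'a[href="/resume"]',
     "ariaLabel": "Résumé", "text": "Résumé"},
    {"elementId": "contact-link", "canonicalSelector": 'a[href="/contact"]',
     "text": "Contact"},
]

empty_result = generate_walkthrough("where is the résumé", [])
populated_result = generate_walkthrough("where is the résumé", vocabulary)
```

With an empty list the function has nothing to select from, which corresponds to the "no walkthrough matches" case; with the two elements above it can only ever return selectors that were supplied to it.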

How vocabulary feeds a walkthrough

Your app (elements) → Capture (crawl / import) → Vocabulary (per tenant) → Generate (on a goal) → Walkthrough
Generation reads from vocabulary. No vocabulary, nothing to generate.

Three ways to populate it

1. Direct upsert

The fastest way to teach Nudge a handful of elements. Post them to the vocabulary endpoint with a secret key.

bash
curl -X POST https://your-backend.com/api/admin/vocabulary/upsert \
  -H "Authorization: Bearer sk_…" \
  -H "Content-Type: application/json" \
  -d '{
    "source": "dev_yaml",
    "elements": [
      {"elementId":"resume-link","canonicalSelector":"a[href=\"/resume\"]","ariaLabel":"Résumé","tag":"a","text":"Résumé"},
      {"elementId":"contact-link","canonicalSelector":"a[href=\"/contact\"]","tag":"a","text":"Contact"}
    ]
  }'

After this, "where is the résumé" can generate a step that points at the résumé link.
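If you would rather script the upsert than run curl, the same request can be assembled with Python's standard library. This is a sketch under assumptions: build_upsert_request is a hypothetical helper, NUDGE_SECRET_KEY is a made-up environment variable name, and the URL and payload shape are copied from the curl example above. The request is built but not sent.

```python
import json
import os
import urllib.request


def build_upsert_request(base_url: str, secret_key: str, elements: list[dict],
                         source: str = "dev_yaml") -> urllib.request.Request:
    """Hypothetical helper mirroring the curl call above."""
    body = json.dumps({"source": source, "elements": elements}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/admin/vocabulary/upsert",
        data=body,
        headers={
            "Authorization": f"Bearer {secret_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


elements = [
    {"elementId": "resume-link", "canonicalSelector": 'a[href="/resume"]',
     "ariaLabel": "Résumé", "tag": "a", "text": "Résumé"},
    {"elementId": "contact-link", "canonicalSelector": 'a[href="/contact"]',
     "tag": "a", "text": "Contact"},
]

# Assumed variable name; keep the secret key out of source control.
secret_key = os.environ.get("NUDGE_SECRET_KEY", "")
request = build_upsert_request("https://your-backend.com", secret_key, elements)

# Sending is left to the caller, for example:
#     with urllib.request.urlopen(request) as response:
#         print(response.status)
```

Keeping request construction separate from sending makes it easy to inspect the payload before it reaches the backend.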

2. Crawl the app

The hosted crawler visits your pages with Playwright and extracts elements automatically. This is the broad option, but two things have to be true for it to help:

  • The crawler worker has to be deployed and running, sharing the backend's Redis.
  • It only records elements that carry a stable identifier (data-nudge-id, data-testid, id, or name). Plain <a href> links with none of those are skipped.

So a crawl pays off best on an app you control, where you have added data-nudge-id to the elements that matter. On a site with no stable attributes the crawl comes back thin.
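To check in advance how much a crawl is likely to capture, you can apply the same rule to a list of elements yourself. The sketch below is an illustration, not the crawler's code: the function names and the input shape are hypothetical, and treating the four attributes as a priority order is an assumption, since the list above only says which attributes count.

```python
from typing import Optional

# Attributes named above as stable identifiers.
# Using list order as priority is an assumption.
STABLE_ATTRIBUTES = ("data-nudge-id", "data-testid", "id", "name")


def stable_identifier(attributes: dict[str, str]) -> Optional[tuple[str, str]]:
    """Return (attribute, value) for the first stable identifier, else None."""
    for attr in STABLE_ATTRIBUTES:
        value = attributes.get(attr, "").strip()
        if value:
            return attr, value
    return None


def recordable(elements: list[dict]) -> list[dict]:
    """Keep only elements a crawl of this kind would record."""
    return [el for el in elements if stable_identifier(el["attributes"])]


page_elements = [
    {"tag": "a", "attributes": {"href": "/resume"}},  # skipped: no stable id
    {"tag": "a", "attributes": {"href": "/contact", "data-nudge-id": "contact-link"}},
    {"tag": "button", "attributes": {"data-testid": "export-csv"}},
]

kept = recordable(page_elements)
```

Here the plain résumé link is dropped while the two annotated elements survive, which is the "thin crawl" effect in miniature.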

3. Import from a file

If you keep your element definitions in a nudge.yaml in your repo, push them through the same upsert endpoint with "source": "dev_yaml". This is the highest-trust source, so it wins over crawl-derived entries when they describe the same element.

Enrichment

Raw captured elements get a follow-up pass that adds synonyms, a short description, and a category. That is what lets a user say "download my data" and match an element labelled "Export CSV". Enrichment runs as a background job; until you schedule it, trigger it by hand by calling the enrichment endpoint. Generation still works on un-enriched vocabulary, but it matches on fewer words.
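The effect of enrichment on matching can be shown with a small sketch. The field names synonyms, description, and category are assumed here, as are the specific synonym values and the word-overlap rule; the real enrichment output and matcher may look different.

```python
import re


def match_terms(element: dict) -> set[str]:
    """Words an element can match on: its text plus any enrichment synonyms."""
    phrases = [element.get("text", ""), *element.get("synonyms", [])]
    return {w for phrase in phrases for w in re.findall(r"\w+", phrase.lower())}


def overlap(query: str, element: dict) -> set[str]:
    """Words shared by the user's query and the element's matchable terms."""
    return set(re.findall(r"\w+", query.lower())) & match_terms(element)


# A raw captured element, before enrichment.
raw = {"elementId": "export-csv", "text": "Export CSV"}

# The same element after a hypothetical enrichment pass.
enriched = {
    **raw,
    "synonyms": ["download data", "save spreadsheet"],
    "description": "Exports the current table as a CSV file.",
    "category": "data",
}

raw_hits = overlap("download my data", raw)
enriched_hits = overlap("download my data", enriched)
```

The raw element shares no words with "download my data", while the enriched one shares "download" and "data". A query such as "export csv" matches both, which is why generation still works before enrichment has run.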

Trust order

When two sources describe the same element, the higher-trust one wins:

text
dev_yaml  >  doc  >  flow  >  crawl_enriched  >  crawl  >  bootstrap

Synonyms and hints always merge across sources, since more vocabulary never hurts matching. Only the single-value fields (the description, the canonical selector) follow the trust order.
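The merge rule can be sketched as follows. The trust order and the split between merged and single-value fields come from the text above; the function name, the entry shape, the hints field being a list, and the fallback to the lower-trust value when the winner has none are all assumptions.

```python
TRUST_ORDER = ["dev_yaml", "doc", "flow", "crawl_enriched", "crawl", "bootstrap"]
SINGLE_VALUE_FIELDS = ("description", "canonicalSelector")
MERGED_FIELDS = ("synonyms", "hints")


def merge_entries(a: dict, b: dict) -> dict:
    """Hypothetical merge of two entries describing the same element."""
    # Lower index in TRUST_ORDER means higher trust.
    winner, loser = sorted((a, b), key=lambda e: TRUST_ORDER.index(e["source"]))
    merged = dict(winner)

    # Single-value fields follow the trust order.
    # Assumption: if the winner lacks a field, the loser's value fills the gap.
    for field in SINGLE_VALUE_FIELDS:
        if not merged.get(field) and loser.get(field):
            merged[field] = loser[field]

    # Synonyms and hints are unioned regardless of source.
    for field in MERGED_FIELDS:
        merged[field] = sorted(set(a.get(field, [])) | set(b.get(field, [])))
    return merged


from_yaml = {
    "elementId": "resume-link", "source": "dev_yaml",
    "canonicalSelector": 'a[href="/resume"]', "synonyms": ["cv"],
}
from_crawl = {
    "elementId": "resume-link", "source": "crawl",
    "canonicalSelector": "nav a:nth-child(3)",
    "description": "Link to the résumé page.", "synonyms": ["curriculum vitae"],
}

merged = merge_entries(from_crawl, from_yaml)
```

In this example the selector comes from the dev_yaml entry even though the crawl entry was passed first, and both synonym lists survive.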