Webelves is a local-first search engine. Type a query — QAi fetches real web pages, finds the most relevant passages, and generates a grounded answer using an AI model running entirely in your browser. No account. No cloud AI. Nothing leaves your machine.
Every query goes through a four-stage pipeline — all running in your browser tab, all without a server.
SearXNGAdapter queries your local SearXNG instance. Pluggable: swap in any SearchAdapter implementation. No tracking, no personalization.
@mozilla/readability strips page chrome and extracts clean article text. Chunked into passage windows.
Passages are embedded with @xenova/transformers. Cosine similarity ranks passages against the query. Top 5 chunks forwarded to the LLM context.
wllama (llama.cpp WASM+SIMD) streams a grounded answer. Prompted to cite sources as [N]; citations are interactive and scroll to the source card.

Built around the constraint that loading a GGUF model once and sharing it across sessions is both faster and more honest than pretending each page is independent.
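The chunk-and-rank stage of the pipeline above can be sketched roughly as follows. This is an illustration under assumptions, not Webelves's actual code: chunkText, cosine, and topK are hypothetical names, the window size and overlap are made-up defaults, and the embeddings are taken as plain number arrays (in the real pipeline they would come from @xenova/transformers).

```typescript
// Hypothetical sketch: split article text into overlapping word windows,
// then rank pre-computed passage embeddings against a query embedding.

/** Split text into windows of `size` words, advancing by `size - overlap`. */
function chunkText(text: string, size = 120, overlap = 20): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = Math.max(1, size - overlap);
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) break; // this window reached the end
  }
  return chunks;
}

/** Cosine similarity of two equal-length vectors; 0 if either has zero norm. */
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

/** Indices of the k passages most similar to the query, best first. */
function topK(query: number[], passages: number[][], k = 5): number[] {
  return passages
    .map((vec, index) => ({ index, score: cosine(query, vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((p) => p.index);
}
```

With k left at its default of 5, topK returns the indices of the chunks that would be forwarded to the LLM context.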
wllama Worker per page load. All browser tabs share
it via runExclusive — no duplicate model loads, no RAM duplication.
Pending tabs show a "waiting" badge and queue cleanly. One model download,
many parallel searches.
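The queueing behaviour described above resembles a promise-based mutex. A minimal sketch, assuming that shape: ExclusiveQueue and its pending counter are hypothetical, and while runExclusive is the name used above, its real signature is not shown here. Sharing one instance across tabs is out of scope for this sketch.

```typescript
// Hypothetical sketch of a runExclusive-style queue: tasks run one at a time,
// in the order they were submitted, and a failed task does not block the rest.
class ExclusiveQueue {
  private tail: Promise<void> = Promise.resolve();
  private waiting = 0;

  /** Number of submitted tasks that have not started yet. */
  get pending(): number {
    return this.waiting;
  }

  runExclusive<T>(task: () => Promise<T>): Promise<T> {
    this.waiting++;
    // Start this task only after everything queued before it has settled.
    const result = this.tail.then(() => {
      this.waiting--;
      return task();
    });
    // Swallow the outcome on the internal chain so a rejection
    // does not poison later tasks; callers still see it via `result`.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}
```

A "waiting" badge could be driven by pending: it is non-zero while a tab's request sits behind another tab's generation.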
Answers cite sources as [N]. Click a citation and the
source card highlights and scrolls into view. Every answer traces back to a
real URL you can inspect.
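Linking [N] markers back to source cards could look something like this. The Source shape and both function names are assumptions made for illustration; the sketch only covers parsing, not the highlight or scroll behaviour.

```typescript
// Hypothetical sketch: find [N] markers in an answer and map them to sources.
interface Source {
  title: string;
  url: string;
}

/** Distinct 1-based citation numbers, in order of first appearance. */
function extractCitations(answer: string): number[] {
  const pattern = /\[(\d+)\]/g;
  const found: number[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(answer)) !== null) {
    const n = Number(match[1]);
    if (found.indexOf(n) === -1) found.push(n);
  }
  return found;
}

/** Sources actually cited in the answer; out-of-range numbers are ignored. */
function resolveCitations(answer: string, sources: Source[]): Source[] {
  return extractCitations(answer)
    .filter((n) => n >= 1 && n <= sources.length)
    .map((n) => sources[n - 1]);
}
```

Dropping out-of-range numbers is one way to handle a model that cites a source it was never given.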
Memories are stored in memories.json in OPFS. The full list is visible,
editable, and wipeable in Settings: no hidden context, no surprise
personalization, no cloud profile.
The SearchAdapter interface ships with
SearXNGAdapter. CORS-proxy and browser-extension adapters
are planned for v1.5. Self-host SearXNG, point Webelves at it, and keep
every query off third-party search infrastructure.
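A pluggable adapter along these lines might look like the sketch below. The method name, the SearchResult shape, and the injected FetchLike type are all assumptions; the real SearchAdapter interface is not shown here, so the class is named SketchSearXNGAdapter to keep it distinct. It also assumes the SearXNG instance has JSON output enabled, since the request uses format=json.

```typescript
// Hypothetical sketch of a search adapter contract and a SearXNG-backed version.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

interface SearchAdapter {
  search(query: string): Promise<SearchResult[]>;
}

// Minimal fetch-like function, injected so the adapter can be tested offline.
type FetchLike = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

class SketchSearXNGAdapter implements SearchAdapter {
  constructor(
    private baseUrl: string,
    private fetchFn: FetchLike,
  ) {}

  async search(query: string): Promise<SearchResult[]> {
    const base = this.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
    const url = `${base}/search?q=${encodeURIComponent(query)}&format=json`;
    const res = await this.fetchFn(url);
    if (!res.ok) throw new Error(`SearXNG request failed: ${res.status}`);
    const body = (await res.json()) as {
      results?: Array<{ title?: string; url?: string; content?: string }>;
    };
    // Keep only results that carry a URL; map SearXNG's `content` to `snippet`.
    return (body.results ?? [])
      .filter((r) => typeof r.url === "string")
      .map((r) => ({
        title: r.title ?? "",
        url: r.url as string,
        snippet: r.content ?? "",
      }));
  }
}
```

In the browser, the second constructor argument can simply wrap the global fetch, for example (u) => fetch(u). Any other backend would implement the same search method.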
wllama runs llama.cpp compiled to WASM with SIMD extensions. Any quantized GGUF model works. Start with a small, fast model — upgrade when your hardware supports it.
Webelves behaves like a browser. Familiar shortcuts work as expected.
Open Webelves in any modern browser. Set your SearXNG URL in Settings. The model downloads once to OPFS — cached locally, available offline from then on.