Context

qrst is a hybrid search engine that runs entirely on-device. No API calls, no cloud, just a Rust binary and an embedding model. It combines BM25 keyword search (SQLite FTS5) with vector semantic search (ONNX embeddings + HNSW) and fuses results using configurable strategies, including Reciprocal Rank Fusion and learning-to-rank models.

Architecture

The engine indexes local files into two parallel stores: a full-text index backed by SQLite FTS5 for keyword search, and an HNSW vector index (usearch) for semantic nearest-neighbor lookup. At query time, both stores return ranked results that are combined through a pluggable fusion layer.

File parsing uses tree-sitter grammars for language-aware chunking across multiple file types, including Rust, JavaScript, TypeScript, HTML, CSS, and Markdown. Each chunk is embedded using an ONNX model (currently EmbeddingGemma 300M, 768 dimensions) via the ort runtime.
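Once chunks are embedded, semantic lookup reduces to cosine similarity between vectors. A minimal sketch of that comparison (the function name and toy 4-dimensional vectors are illustrative; qrst's real embeddings are 768-dimensional):

```rust
// Cosine similarity between two embedding vectors: the dot product
// divided by the product of the vector norms.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn main() {
    let a = [1.0, 0.0, 1.0, 0.0];
    let b = [1.0, 0.0, 1.0, 0.0];
    let c = [0.0, 1.0, 0.0, 1.0];
    println!("{:.2}", cosine_similarity(&a, &b)); // identical vectors -> 1.00
    println!("{:.2}", cosine_similarity(&a, &c)); // orthogonal vectors -> 0.00
}
```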

The trait-based architecture keeps components swappable: Embed for the embedding backend, VectorStore for the ANN index, and FusionStrategy for result fusion. Adding a new embedding model or fusion method means implementing a single trait.
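A minimal sketch of what those seams might look like; the trait names come from the design above, but the method signatures and the toy backend are assumptions made for illustration:

```rust
// Illustrative trait seams. Embed and FusionStrategy are named in the
// design; the signatures here are assumptions, not qrst's actual API.
trait Embed {
    fn embed(&self, text: &str) -> Vec<f32>;
}

trait FusionStrategy {
    // Fuse two ranked lists of (doc_id, score) into one.
    fn fuse(&self, keyword: &[(u64, f32)], semantic: &[(u64, f32)]) -> Vec<(u64, f32)>;
}

// Swapping in a new embedding backend means implementing one trait:
struct DummyEmbedder;
impl Embed for DummyEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        // Toy backend: encode a string as its byte length, padded to 4 dims.
        vec![text.len() as f32, 0.0, 0.0, 0.0]
    }
}

fn main() {
    let backend: Box<dyn Embed> = Box::new(DummyEmbedder);
    let v = backend.embed("hello");
    println!("embedding dim = {}", v.len()); // prints "embedding dim = 4"
}
```

Because callers hold a `Box<dyn Embed>` (or a generic bound), swapping EmbeddingGemma for another model touches only the implementing type.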

Search Modes

  • Keyword: BM25 ranking via SQLite FTS5, fast and precise for exact term matches
  • Semantic: Cosine similarity over dense embeddings, handles synonyms and paraphrases
  • Hybrid: Fused ranking combining both signals, configurable via ConvexFusion (weighted blend), RRF, or a learned linear model
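The RRF option above can be sketched in a few lines: each document's fused score is the sum of 1/(k + rank) over every ranked list it appears in. The k = 60 value matches the evaluation below; the function name and toy doc ids are illustrative:

```rust
use std::collections::HashMap;

// Reciprocal Rank Fusion over any number of ranked lists of doc ids.
fn rrf(lists: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for list in lists {
        for (rank, &doc) in list.iter().enumerate() {
            // Ranks are 1-based in the standard RRF formulation.
            *scores.entry(doc).or_insert(0.0) += 1.0 / (k + (rank + 1) as f64);
        }
    }
    let mut fused: Vec<_> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    fused
}

fn main() {
    let keyword = vec![1, 2, 3];  // BM25 ranking
    let semantic = vec![3, 1, 4]; // vector ranking
    // Docs 1 and 3 appear in both lists, so they outrank docs 2 and 4.
    println!("{:?}", rrf(&[keyword, semantic], 60.0));
}
```

A nice property of RRF is that it needs only ranks, not scores, so the BM25 and cosine-similarity scales never have to be calibrated against each other.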

Evaluation

On a 42-query evaluation corpus with graded relevance judgments:

Strategy        nDCG@5   P@3     MRR
BM25 only        0.431   0.246   0.534
RRF (k=60)       0.794   0.476   0.903
Semantic only    0.827   0.500   0.880
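For reference, the nDCG@k metric in the table can be computed as follows. This sketch uses the standard exponential-gain formulation with a log2 position discount; qrst-bench's exact implementation may differ:

```rust
// DCG@k over relevance grades in ranked order: gain (2^rel - 1)
// discounted by log2(position + 1), positions starting at 1.
fn dcg(rels: &[f64], k: usize) -> f64 {
    rels.iter()
        .take(k)
        .enumerate()
        .map(|(i, r)| (2f64.powf(*r) - 1.0) / ((i + 2) as f64).log2())
        .sum()
}

// nDCG@k: DCG normalized by the DCG of the ideal (sorted) ordering.
fn ndcg(rels: &[f64], k: usize) -> f64 {
    let mut ideal = rels.to_vec();
    ideal.sort_by(|a, b| b.partial_cmp(a).unwrap());
    let idcg = dcg(&ideal, k);
    if idcg == 0.0 { 0.0 } else { dcg(rels, k) / idcg }
}

fn main() {
    // Graded relevance of the top-5 results, in ranked order.
    let rels = [2.0, 0.0, 1.0, 0.0, 0.0];
    println!("nDCG@5 = {:.3}", ndcg(&rels, 5));
}
```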

The benchmark suite (qrst-bench) supports bootstrap confidence intervals, inter-annotator agreement metrics, and automated SVG report generation.
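A bootstrap confidence interval over per-query metric scores can be sketched as below. The resample count, the 95% level, and the dependency-free LCG random source are illustrative choices for the example, not confirmed qrst-bench settings:

```rust
// Percentile-bootstrap CI for the mean of per-query scores: resample
// with replacement many times, then read off the 2.5th and 97.5th
// percentiles of the resampled means.
fn bootstrap_ci(scores: &[f64], resamples: usize) -> (f64, f64) {
    let mut state: u64 = 42; // fixed seed so the example is reproducible
    let mut means = Vec::with_capacity(resamples);
    for _ in 0..resamples {
        let mut sum = 0.0;
        for _ in 0..scores.len() {
            // LCG step (constants from Knuth's MMIX generator).
            state = state
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            let idx = (state >> 33) as usize % scores.len();
            sum += scores[idx];
        }
        means.push(sum / scores.len() as f64);
    }
    means.sort_by(|a, b| a.partial_cmp(b).unwrap());
    // 2.5th and 97.5th percentiles -> 95% interval.
    (
        means[(resamples as f64 * 0.025) as usize],
        means[(resamples as f64 * 0.975) as usize - 1],
    )
}

fn main() {
    let scores = [0.9, 0.8, 0.7, 0.95, 0.85, 0.6, 0.75, 0.88];
    let (lo, hi) = bootstrap_ci(&scores, 1000);
    println!("95% CI: [{:.3}, {:.3}]", lo, hi);
}
```

With only 42 queries, intervals like this matter: a gap between two strategies in the table is only meaningful if their intervals are clearly separated.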