Why I'm building SQLRite — an embedded SQL and vector database in Rust

An origin story for SQLRite: the design tenets behind a SQLite-style engine rebuilt from scratch in Rust, what's shipped, and what comes next.

By Joao Henrique Machado Silva · April 8, 2026 · 7 min read

There is a particular kind of software that you can use for a decade without ever really seeing. SQLite is one of those. It is everywhere — on your phone, in your browser, inside Photoshop, behind your favorite editor — and yet most of the people who depend on it have never opened the file format spec, never read the page cache, never traced what actually happens between INSERT and the green light on your SSD.

I started SQLRite because I wanted to see it. Not just use it: own it. Build the thing, type the B-tree split, watch the WAL frames roll past. SQLRite is what you get when you take the constraints that made SQLite great (single file, embedded, zero configuration, ACID) and try to rediscover them from first principles, in Rust, with one eye on the AI-shaped present.

This post is the manifesto. The why before the what.

What SQLRite is

SQLRite is a from-scratch SQLite-style embedded database written in Rust. It ships as a Rust crate (sqlrite-engine), a REPL binary, a Tauri desktop app, an MCP stdio server, a C ABI shim, and SDKs for Python, Node, Go, and the browser via WASM. One engine, one file format, six surfaces.

The core is small enough to read in a weekend:

A 4 KiB-page on-disk format with cell-encoded rows.
A B-tree per table and per index, rebuilt bottom-up on commit.
A write-ahead log with crash-safe checkpointing.
A SQL surface — CREATE / INSERT / SELECT with predicates, JOINs in all four flavors, aggregates and GROUP BY, prepared statements, transactions, ALTER / DROP / VACUUM, PRAGMA.
A vector type with HNSW indexing for sub-linear k-NN.
A BM25 full-text index that composes with the vector path for hybrid retrieval.
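
To make "rebuilt bottom-up" concrete, here is a minimal sketch of a bottom-up B-tree build, assuming unique keys that are already sorted. The Node type, the FANOUT constant, and every function name are hypothetical and chosen for illustration; none of this is SQLRite's actual page format or API.

```rust
// Hypothetical sketch of a bottom-up B-tree build over sorted keys.
// FANOUT and the Node layout are illustrative, not SQLRite's page format.

const FANOUT: usize = 4; // tiny on purpose; a 4 KiB page would hold far more

#[derive(Debug)]
enum Node {
    Leaf { keys: Vec<i64> },
    // seps[i] is the smallest key stored anywhere under children[i].
    Internal { seps: Vec<i64>, children: Vec<Node> },
}

fn min_key(node: &Node) -> i64 {
    match node {
        Node::Leaf { keys } => keys[0],
        Node::Internal { seps, .. } => seps[0],
    }
}

/// Builds a tree from unique keys sorted ascending. Returns None for no keys.
fn build_bottom_up(sorted_keys: &[i64]) -> Option<Node> {
    // Level 0: pack the sorted keys into leaves, left to right.
    let mut level: Vec<Node> = sorted_keys
        .chunks(FANOUT)
        .map(|chunk| Node::Leaf { keys: chunk.to_vec() })
        .collect();
    // Stack internal levels on top until a single root remains.
    while level.len() > 1 {
        let mut next = Vec::new();
        let mut rest = level.into_iter().peekable();
        while rest.peek().is_some() {
            let children: Vec<Node> = rest.by_ref().take(FANOUT).collect();
            let seps: Vec<i64> = children.iter().map(min_key).collect();
            next.push(Node::Internal { seps, children });
        }
        level = next;
    }
    level.pop()
}

/// Point lookup: descend into the last child whose separator is <= key.
fn contains(node: &Node, key: i64) -> bool {
    match node {
        Node::Leaf { keys } => keys.binary_search(&key).is_ok(),
        Node::Internal { seps, children } => {
            let idx = seps.partition_point(|&s| s <= key);
            idx > 0 && contains(&children[idx - 1], key)
        }
    }
}

fn height(node: &Node) -> usize {
    match node {
        Node::Leaf { .. } => 1,
        Node::Internal { children, .. } => 1 + height(&children[0]),
    }
}

fn main() {
    let keys: Vec<i64> = (1..=20).collect();
    if let Some(root) = build_bottom_up(&keys) {
        println!("height = {}", height(&root));
        println!("contains 13: {}", contains(&root, 13));
    }
}
```

The appeal of this shape is that no node is ever split: leaves are packed in one pass and each parent level is derived from the level below it. The cost is that the whole tree is rewritten, which is the trade-off an in-place design avoids.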

Here is what one session looks like:

$ cargo install sqlrite-engine
$ sqlrite app.sqlrite
sqlrite> CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT);
sqlrite> INSERT INTO notes (body) VALUES ('the only embedded sql in rust');
sqlrite> SELECT * FROM notes;
+----+-------------------------------+
| id | body                          |
+----+-------------------------------+
|  1 | the only embedded sql in rust |
+----+-------------------------------+

The same engine runs underneath every other surface. You can open the same .sqlrite file from the REPL, from Python, from Node, from a desktop GUI, or from an MCP server — and the bytes on disk don't care.

Why rebuild SQLite

The honest answer is: to learn the things that you cannot learn by reading. There is a whole class of database concepts — page splits, WAL replay, free lists, write amplification, cache eviction heuristics — that read like trivia until you've shipped a buggy version of them. Once you have, you don't forget.

But there is a more interesting answer too, and it has to do with now.

Embedded databases are having a moment again. The reasons are different from the ones SQLite was born into. The wave this time is local, private, model-shaped: agents that need a working memory; desktop apps that ship a vector store and a knowledge graph by default; mobile RAG; offline-first sync; LLM tools that want to run a SQL query against your project without phoning a server. The classic SQLite recipe (single file, embedded, zero config) fits all of that beautifully. What does not fit is "wait for an extension."

SQLite supports vectors today only through extensions (sqlite-vss, sqlite-vec) — fine when you control the binary, awkward when you ship to users. Full-text search is FTS5, an opt-in module. Both are excellent in their domain, but the integration story for embedded apps that want both, plus hybrid retrieval, plus a desktop installer, plus six SDKs, plus an MCP server, is "good luck."

SQLRite's bet is that those things should be in the engine. Vectors are a column type. FTS is CREATE INDEX … USING FTS. The desktop GUI, the MCP server, and the SDKs all link the same Rust crate. There's no "extension story" because there's no extension.

Design tenets

A few principles fall out of that bet, and they have shaped almost every implementation choice so far:

One file, end of story. A SQLRite database is one .sqlrite file plus a sidecar WAL during writes. No directories, no config, no daemon. You can cp it to back it up and rsync it to a peer.
Crash safety is a feature, not an afterthought. Every release has a torn-write test, a partial-WAL test, and a header mismatch test. The pager refuses to commit unless the WAL frame landed.
The lib is the engine. No unsafe that can be avoided. A single Connection API. Tauri can embed it directly. WASM gets the same surface, stripped of POSIX locks.
Phase-by-phase, public. SQLRite is built in numbered phases, each with a written plan in docs/phase-*.md. The roadmap is open; the design discussions are open; this blog is open. You can read exactly why a decision was made.
Don't reinvent the parser. SQLRite uses sqlparser (SQLite dialect) and only narrows the AST. Inventing grammar is rarely the useful part of building a database.
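
The crash-safety tenet boils down to one rule: a frame only counts if it can be verified when read back. A minimal sketch of that idea follows, assuming a made-up frame layout and using FNV-1a as a stand-in checksum for brevity. The layout, the FrameError type, and the function names are all hypothetical, not SQLRite's actual WAL format.

```rust
// Hypothetical WAL frame: [page_no: u32 LE][len: u32 LE][checksum: u32 LE][payload].
// FNV-1a is a stand-in checksum chosen for brevity, not SQLRite's real one.

#[derive(Debug, PartialEq)]
enum FrameError {
    Truncated,        // torn write: the frame is shorter than its header claims
    ChecksumMismatch, // the bytes are all there but do not verify
}

const HEADER_LEN: usize = 12;

/// FNV-1a over the page number followed by the payload.
fn checksum(page_no: u32, payload: &[u8]) -> u32 {
    let header = page_no.to_le_bytes();
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in header.iter().chain(payload.iter()) {
        hash ^= byte as u32;
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

fn encode_frame(page_no: u32, payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(HEADER_LEN + payload.len());
    out.extend_from_slice(&page_no.to_le_bytes());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.extend_from_slice(&checksum(page_no, payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

/// Reads a little-endian u32 at `at`, or None if the slice is too short.
fn read_u32(bytes: &[u8], at: usize) -> Option<u32> {
    let b = bytes.get(at..at + 4)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns the page number and payload only if the frame is complete and verifies.
fn decode_frame(bytes: &[u8]) -> Result<(u32, &[u8]), FrameError> {
    let page_no = read_u32(bytes, 0).ok_or(FrameError::Truncated)?;
    let len = read_u32(bytes, 4).ok_or(FrameError::Truncated)? as usize;
    let stored = read_u32(bytes, 8).ok_or(FrameError::Truncated)?;
    let end = HEADER_LEN.checked_add(len).ok_or(FrameError::Truncated)?;
    let payload = bytes.get(HEADER_LEN..end).ok_or(FrameError::Truncated)?;
    if checksum(page_no, payload) != stored {
        return Err(FrameError::ChecksumMismatch);
    }
    Ok((page_no, payload))
}

fn main() {
    let frame = encode_frame(7, b"page bytes");
    println!("intact: {:?}", decode_frame(&frame).map(|(page, _)| page));
    println!("torn:   {:?}", decode_frame(&frame[..frame.len() - 3]).map(|(page, _)| page));
}
```

A replay loop built on this would stop at the first frame that fails to decode and treat everything before it as the committed prefix.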

There is also a tenet that is mostly aesthetic but that I think matters: every failure surfaces as a typed SQLRiteError, and the engine never panics. It is shocking how much effort that takes, and how much trust it buys.
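
For readers who have not worked this way, here is a minimal sketch of the "typed errors, no panics" pattern in plain Rust. The DbError enum, its variants, and both helper functions are hypothetical stand-ins. They are not the real SQLRiteError definition, which may look quite different.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type; the variants are illustrative only.
#[derive(Debug)]
enum DbError {
    InvalidNumber(ParseIntError),
    ColumnNotFound(String),
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DbError::InvalidNumber(e) => write!(f, "invalid number: {}", e),
            DbError::ColumnNotFound(name) => write!(f, "no such column: {}", name),
        }
    }
}

impl std::error::Error for DbError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            DbError::InvalidNumber(e) => Some(e),
            DbError::ColumnNotFound(_) => None,
        }
    }
}

// Lets `?` convert a ParseIntError into a DbError automatically.
impl From<ParseIntError> for DbError {
    fn from(e: ParseIntError) -> Self {
        DbError::InvalidNumber(e)
    }
}

/// Looks a column up by name instead of indexing and hoping.
fn column_index(columns: &[&str], name: &str) -> Result<usize, DbError> {
    columns
        .iter()
        .position(|c| *c == name)
        .ok_or_else(|| DbError::ColumnNotFound(name.to_string()))
}

/// Parses a LIMIT-style argument; bad input becomes an Err, never a panic.
fn parse_limit(raw: &str) -> Result<u64, DbError> {
    Ok(raw.trim().parse::<u64>()?)
}

fn main() {
    match column_index(&["id", "body"], "title") {
        Ok(i) => println!("column at index {}", i),
        Err(e) => println!("error: {}", e),
    }
    match parse_limit("ten") {
        Ok(n) => println!("limit {}", n),
        Err(e) => println!("error: {}", e),
    }
}
```

The effort goes into the places where `unwrap()` or slice indexing would have been one keystroke shorter; each of those becomes a variant the caller can match on.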

What's shipped

As of April 2026, SQLRite is in version 0.9.x. The roadmap is broken into phases; phases 1–7 are shipped, phase 8 is in flight. The short version:

Phase 1 — REPL + parser scaffolding.
Phase 2 — typed errors, meta commands.
Phase 3 — B-tree storage.
Phase 4 — pager, WAL, transactions, persistence.
Phase 5 — JOINs (all four flavors), aggregates, GROUP BY, prepared statements, ALTER / DROP / VACUUM, PRAGMA, free-list reuse with auto-VACUUM.
Phase 6 — desktop GUI (Svelte 5 + Tauri 2), prebuilt installers for macOS / Windows / Linux.
Phase 7 — multi-language story: C FFI, Python (PyO3), Node (napi-rs), Go (cgo), WASM (wasm-bindgen). Plus the vector column type and HNSW index in 7d.
Phase 8 — full-text search (FTS5-style inverted index with BM25) and hybrid retrieval. In progress.
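
For a sense of what the BM25 half of Phase 8 computes, here is a minimal sketch of BM25 scoring over an in-memory corpus. It assumes the common defaults k1 = 1.2 and b = 0.75, the always-positive IDF variant ln(1 + (N - n + 0.5) / (n + 0.5)), and a naive tokenizer. The function names are hypothetical, and there is no inverted index here: this is only the scoring formula, not SQLRite's implementation.

```rust
use std::collections::HashMap;

// Common BM25 defaults; assumed here, not taken from SQLRite.
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Naive tokenizer: lowercase runs of alphanumeric characters.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Returns one BM25 score per document, in input order.
fn bm25_scores(docs: &[&str], query: &str) -> Vec<f64> {
    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    let total_len: usize = tokenized.iter().map(Vec::len).sum();
    if total_len == 0 {
        return vec![0.0; docs.len()]; // nothing to score against
    }
    let n_docs = tokenized.len() as f64;
    let avg_len = total_len as f64 / n_docs;

    // Document frequency: in how many documents does each term appear?
    let mut doc_freq: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut unique: Vec<&str> = doc.iter().map(String::as_str).collect();
        unique.sort_unstable();
        unique.dedup();
        for term in unique {
            *doc_freq.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    tokenized
        .iter()
        .map(|doc| {
            // Longer-than-average documents are penalised through this factor.
            let len_norm = 1.0 - B + B * doc.len() as f64 / avg_len;
            query_terms
                .iter()
                .map(|term| {
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    let df = doc_freq.get(term.as_str()).copied().unwrap_or(0) as f64;
                    let idf = (1.0 + (n_docs - df + 0.5) / (df + 0.5)).ln();
                    idf * tf * (K1 + 1.0) / (tf + K1 * len_norm)
                })
                .sum::<f64>()
        })
        .collect()
}

fn main() {
    let docs = [
        "rust embedded database",
        "python web framework",
        "embedded sql database in rust with vectors and full text search",
    ];
    for (doc, score) in docs.iter().zip(bm25_scores(&docs, "rust database")) {
        println!("{:.3}  {}", score, doc);
    }
}
```

Both matching documents contain each query term once, so the shorter one ranks higher purely through length normalisation. A hybrid query would then combine a ranking like this with the vector-distance ranking; how SQLRite fuses the two is not covered by this sketch.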

The thing I'm proudest of is not any one feature. It's that all six surfaces exist at once. You can talk to the same database from a Rust unit test, a Python notebook, a Node REPL, a Go binary, a browser tab, and a Claude Code session, and they all see the same B-tree.

What's coming next

The roadmap continues past Phase 8:

Subqueries, then HAVING, then CTEs. The executor is ready for them; the AST narrowing isn't.
Hash and merge joins for equi-join shapes. The current driver is a plain nested-loop join. That is adequate for small embedded workloads and silly for any join above a few thousand rows on each side.
Better persistence for HNSW. The graph is rebuilt on open today. It needs to live in the file format alongside everything else.
More pragmas. journal_mode, synchronous, cache_size, page_size should all be reachable.
Online migrations. ALTER is single-op per statement; the long-running case (rewrite a column under load) deserves better.
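
The join item is the easiest one on that list to show in miniature. Below is a minimal sketch contrasting a nested-loop equi-join with a hash join over in-memory rows. The Row and Joined types and both function names are hypothetical, and a real executor would work on pages and cursors rather than slices.

```rust
use std::collections::HashMap;

// Hypothetical row shape: (join key, value).
type Row = (i64, &'static str);
type Joined = (i64, &'static str, &'static str);

/// Compares every left row with every right row: O(n * m) comparisons.
fn nested_loop_join(left: &[Row], right: &[Row]) -> Vec<Joined> {
    let mut out = Vec::new();
    for &(lk, lv) in left {
        for &(rk, rv) in right {
            if lk == rk {
                out.push((lk, lv, rv));
            }
        }
    }
    out
}

/// Builds a hash table on one side, then probes it: roughly O(n + m).
fn hash_join(left: &[Row], right: &[Row]) -> Vec<Joined> {
    // Build phase: index the right side by join key.
    let mut index: HashMap<i64, Vec<&'static str>> = HashMap::new();
    for &(rk, rv) in right {
        index.entry(rk).or_default().push(rv);
    }
    // Probe phase: one lookup per left row.
    let mut out = Vec::new();
    for &(lk, lv) in left {
        if let Some(matches) = index.get(&lk) {
            for &rv in matches {
                out.push((lk, lv, rv));
            }
        }
    }
    out
}

fn main() {
    let notes: Vec<Row> = vec![(1, "intro"), (2, "wal"), (3, "btree")];
    let tags: Vec<Row> = vec![(2, "storage"), (3, "storage"), (3, "index"), (9, "orphan")];
    for row in hash_join(&notes, &tags) {
        println!("{:?}", row);
    }
}
```

With a few thousand rows per side, the nested loop does millions of comparisons while the hash join does a few thousand inserts and lookups, which is the gap the roadmap item is about. Planners usually build the table on the smaller input; this sketch always builds on the right side to keep the output order identical to the nested loop.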

Beyond features, the bigger goal is performance. SQLRite's benchmark suite (which I'll write about in a later post) compares against rusqlite-backed SQLite head to head. The point isn't to beat SQLite; SQLite has 25 years of micro-optimization behind it. The point is to know exactly where SQLRite is slow, so the curve bends in the right direction over time.

Why open-source it

There is a version of this project that lives on my laptop and never ships. I would have learned roughly the same things from it. But that version doesn't have a roadmap, doesn't have to defend a feature gate, doesn't have to write down why the B-tree commits bottom-up instead of in place. I am writing SQLRite in public partly because it is more useful to me that way: the act of explaining a decision is the act of testing it.

It is also, frankly, more fun. The most rewarding bug reports I have ever read came in on a database side project. The internet has a small but excellent population of people who care about WAL frames at 3 a.m., and I would like to keep finding them.

How to follow along

If you want to play with what's there:

cargo install sqlrite-engine          # CLI / REPL
cargo add sqlrite-engine              # Rust crate
pip install sqlrite                   # Python
npm install @joaoh82/sqlrite          # Node

The desktop installer, MCP server, and the docs all live at sqlritedb.com. The next post in this series digs into how SQLRite actually stores rows on disk — pages, B-trees, and the diff-based pager.

If you build something on it, even something small, I would love to hear about it. The repo is at github.com/joaoh82/rust_sqlite, and if SQLRite is useful to you, the most helpful thing you can do is ⭐ it — visibility is the bottleneck for almost every dev tool.

The whole project is the result of a simple bet: that the embedded database deserves a fresh take, that Rust is a good language to make that take in, and that the AI-shaped era we're entering will value "local, private, single-file, vectors and FTS in the box" more, not less. We will see. Either way, the journey is worth writing about.