Sharding

DocStore writes one file per record. Without sharding, a large collection would pile every file into one directory, and directory operations degrade as entry counts grow. storh therefore buckets record files by two characters of the record id:

pages/data/2d/018bcfe5-6800-7000-8000-2d5dba6274da.jsonc

The bucket is the first two hex characters of the UUID's final group (zero-based string positions 24 and 25, "2d" in the path above), which are random bits. Two hex characters give 16 × 16 = 256 buckets, so a 100k-record collection averages about 390 files per directory.
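The bucket rule can be sketched in a few lines of Python. This is only an illustration of the rule described above, not storh's own code: the helper names bucket_for and record_path are hypothetical, and the directory layout is inferred from the example path.

```python
from pathlib import PurePosixPath


def bucket_for(record_id: str) -> str:
    """Return the two-character bucket for a canonical 36-character UUID string.

    Hypothetical helper: illustrates the rule in the text, not storh's API.
    """
    if len(record_id) != 36 or record_id[23] != "-":
        raise ValueError(f"not a canonical UUID string: {record_id!r}")
    # Zero-based positions 24 and 25 are the first two hex characters
    # of the final group, which is random in a UUIDv7.
    return record_id[24:26].lower()


def record_path(collection: str, record_id: str) -> PurePosixPath:
    # Layout assumed from the example path: <collection>/data/<bucket>/<id>.jsonc
    return PurePosixPath(collection) / "data" / bucket_for(record_id) / f"{record_id}.jsonc"


print(record_path("pages", "018bcfe5-6800-7000-8000-2d5dba6274da"))
# prints: pages/data/2d/018bcfe5-6800-7000-8000-2d5dba6274da.jsonc
```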

Why the tail and not the head

UUIDv7 ids start with a 48-bit millisecond timestamp, so ids created close together share their leading characters. Sharding by prefix would send every write burst into the same hot directory and only rotate buckets as time passes. The tail bytes are random per record, so writes spread evenly from the first record on.
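The difference is easy to see with a small experiment. The sketch below builds UUIDv7-shaped ids by hand (the uuid7_like helper is hypothetical and written only for this demonstration), simulates a burst of 10,000 writes within one second, and counts how many distinct buckets each scheme would touch.

```python
import os
from collections import Counter


def uuid7_like(ts_ms: int) -> str:
    """Build a UUIDv7-shaped string: 48-bit ms timestamp, version, variant, random bits.

    Hypothetical helper for the demonstration only.
    """
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 of those bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 of those bits
    value = (ts_ms & ((1 << 48) - 1)) << 80        # timestamp in the top 48 bits
    value |= 0x7 << 76                             # version 7
    value |= rand_a << 64
    value |= 0b10 << 62                            # RFC variant bits
    value |= rand_b
    h = f"{value:032x}"
    return f"{h[:8]}-{h[8:12]}-{h[12:16]}-{h[16:20]}-{h[20:]}"


# A burst of 10,000 writes spread over one second.
burst_start_ms = 1_700_000_000_000
ids = [uuid7_like(burst_start_ms + n % 1000) for n in range(10_000)]

head = Counter(i[:2] for i in ids)      # shard by the first two characters
tail = Counter(i[24:26] for i in ids)   # shard by the start of the final group

print(f"head sharding: {len(head)} bucket(s) used")
print(f"tail sharding: {len(tail)} bucket(s) used, fullest holds {max(tail.values())}")
```

With head sharding the whole burst lands in a single directory, because the leading characters come from the high bits of the timestamp. With tail sharding the same burst spreads across essentially all 256 buckets.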

Sharding is invisible to reads and queries: get() computes the bucket from the id, and stream() returns records sorted by id across all buckets, so cursor pagination behaves as if the collection were one ordered list.
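One way to get a single ordered view over bucketed files is to sort the ids inside each bucket and merge the sorted runs. The sketch below is an assumption about how such a read path could look, not a description of how stream() is implemented; iter_ids_sorted, page_after, and write_demo_records are hypothetical names.

```python
import heapq
import itertools
import tempfile
from pathlib import Path


def iter_ids_sorted(data_dir: Path):
    """Yield record ids in ascending order across every bucket directory."""
    buckets = sorted(p for p in data_dir.iterdir() if p.is_dir())
    # Sort ids inside each bucket, then merge the sorted runs into one stream.
    runs = [sorted(f.stem for f in b.glob("*.jsonc")) for b in buckets]
    yield from heapq.merge(*runs)


def page_after(data_dir: Path, cursor, limit: int):
    """Return up to `limit` ids strictly greater than `cursor` (None means start)."""
    ids = (i for i in iter_ids_sorted(data_dir) if cursor is None or i > cursor)
    return list(itertools.islice(ids, limit))


def write_demo_records(data_dir: Path, ids):
    # Place each record file in the bucket named by positions 24 and 25 of its id.
    for record_id in ids:
        bucket = data_dir / record_id[24:26]
        bucket.mkdir(parents=True, exist_ok=True)
        (bucket / f"{record_id}.jsonc").write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "pages" / "data"
    write_demo_records(data_dir, [
        "018bcfe5-6800-7000-8000-9f0000000003",
        "018bcfe5-6800-7000-8000-2d5dba6274da",
        "018bcfe5-6801-7000-8000-0a0000000001",
    ])
    first = page_after(data_dir, None, 2)        # first page
    rest = page_after(data_dir, first[-1], 2)    # resume after the last id seen
    print(first)
    print(rest)
```

The three demo records live in three different buckets (9f, 2d, 0a), yet the pages come back in id order, and the cursor is simply the last id of the previous page.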

At scale

The bucket count never grows: two hex characters is 256 directories at any collection size. What grows is files per bucket:

Records	Files per bucket
10k	~40
100k	~390
1M	~3,900
10M	~39,000
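The figures in the table are records divided by 256, rounded for readability. A short calculation reproduces them (files_per_bucket is a hypothetical helper, and the result assumes ids are spread evenly across buckets):

```python
BUCKETS = 16 ** 2  # two hex characters


def files_per_bucket(records: int) -> float:
    """Average files per bucket, assuming an even spread of record ids."""
    return records / BUCKETS


for records in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"{records:>10,} records: about {files_per_bucket(records):,.0f} files per bucket")
```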

Modern filesystems index directory entries, so point lookups stay fast with a few thousand files per bucket; 1M records is comfortable for get() and indexed queries. Opening a store takes constant time regardless of collection size: the orphaned temp-file sweep runs only when a dead writer marker shows that a previous process crashed mid-write. What grows linearly with collection size is the cost of whole-collection operations: unindexed scans, reindex(), and verify() each walk every record file, so treat them as batch jobs. Past roughly 10M records per collection, per-bucket counts and total inode usage start to matter. At that point, split the data across collections, or use the segmented log for scan-shaped data, which keeps file counts low by construction.

The other engines

SegmentedLog and Queue do not shard because they do not create one file per record. The log appends to segment files under segments/, bounded by max_segment_bytes (1 MiB by default), and the queue appends events to a single queue.log. File counts stay low by construction.
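For a rough sense of scale, the number of segment files can be estimated from the total payload size. The sketch below is back-of-the-envelope arithmetic, not storh code: estimated_segments is a hypothetical helper, the 512-byte average record size is an assumption chosen for illustration, and the result is a lower bound because it assumes records pack each segment completely.

```python
import math

MAX_SEGMENT_BYTES = 1 * 1024 * 1024  # the 1 MiB default mentioned above


def estimated_segments(records: int, avg_record_bytes: int,
                       max_segment_bytes: int = MAX_SEGMENT_BYTES) -> int:
    """Rough lower bound on segment files, assuming segments are filled completely."""
    return max(1, math.ceil(records * avg_record_bytes / max_segment_bytes))


# One million records of about 512 bytes each (assumed size).
print(estimated_segments(1_000_000, 512))  # prints: 489
```

Under those assumptions, a million records fit in a few hundred segment files, compared with a million individual files in a one-file-per-record layout.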
