
YantrikDB Cluster Setup — Raft Replication, mTLS, Witness HA

YantrikDB Server has hardened clustering (v0.8.x — substrate-batch, live on a 3-node Proxmox cluster with multiple tenants) with:

  • openraft consensus — proper Raft (leader election, log replication, snapshots). Replaces raft-lite from the v0.5.x line. Automatic failover in seconds, chaos-tested (leader kill, network partition, kill-9 mid-write).
  • Mutation commit log — total-ordered, content-addressed substrate behind every write (RFC 010-A). Tombstone-shaped mutations from day one for forget/audit.
  • Cluster mTLS — mutually-authenticated, encrypted cluster transport (RFC 014-A). Self-signed CA + per-node certs; rotate without downtime.
  • Witness daemon — safe HA with only 2 data nodes
  • Control plane replication — tokens and databases sync to followers within 30 seconds
  • Wire protocol versioning — prevents silent drift during rolling upgrades
  • Read-only enforcement — followers reject writes and return the current leader’s address so clients redirect
  • Multi-database — each database replicates independently
  • Per-tenant admission control — quotas (max memories, batch size, RPS) + circuit breaker, fail-degraded-conservative
  • Online backups: POST /v1/admin/snapshot with BLAKE3 checksums
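The read-only enforcement item above implies a small retry loop on the client side: send the write, and if a follower rejects it, retry against the leader it names. The following is a minimal sketch of that loop only. The `WriteResult` shape, the `leader_addr` field name, and the `send` callback are all hypothetical, since this page does not document the actual response format.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class WriteResult:
    """Hypothetical result shape; the real wire format is not documented here."""
    accepted: bool
    leader_addr: Optional[str] = None  # set by a follower that rejected the write


def write_with_redirect(
    send: Callable[[str, dict], WriteResult],
    first_addr: str,
    payload: dict,
    max_hops: int = 3,
) -> str:
    """Send a write, following leader hints. Returns the address that accepted it."""
    addr = first_addr
    for _ in range(max_hops):
        result = send(addr, payload)
        if result.accepted:
            return addr
        if result.leader_addr is None:
            raise RuntimeError(f"{addr} rejected the write and named no leader")
        addr = result.leader_addr  # redirect to the node the follower pointed at
    raise RuntimeError("too many redirects; the cluster may be mid-election")


def fake_send(addr: str, payload: dict) -> WriteResult:
    """Stand-in transport for the demo: node 2 leads, every other node follows."""
    leader = "192.168.1.2:7438"
    if addr == leader:
        return WriteResult(accepted=True)
    return WriteResult(accepted=False, leader_addr=leader)


if __name__ == "__main__":
    print(write_with_redirect(fake_send, "192.168.1.1:7438", {"text": "hello"}))
```

Bounding the number of hops keeps a client from looping forever if two nodes briefly disagree about who the leader is during an election.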

Recommended setup: 2 voters + 1 witness.

┌──────────────────┐  heartbeats   ┌──────────────────┐
│   data node 1    │ ◄───────────▶ │   data node 2    │
│     (voter)      │  oplog sync   │     (voter)      │
│   full storage   │               │   full storage   │
└────────┬─────────┘               └────────┬─────────┘
         │                                  │
         │      ┌──────────────────┐        │
         └────▶ │     witness      │ ◄──────┘
                │   (vote-only)    │
                │    ~10 MB RAM    │
                └──────────────────┘

The witness is a tiny daemon (~3 MB binary, no database storage, only a small state file) whose only job is to vote in elections. It breaks ties so that 2 data nodes can run safe HA without a 3rd full node.

This is the same pattern as Azure SQL (witness instance), MongoDB (arbiter), Redis Sentinel, and MariaDB Galera (garbd).
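The arithmetic behind the witness is the standard majority rule. As a rough illustration (the function names are made up for this sketch and are not part of YantrikDB), compare a bare 2-voter cluster with the recommended 2 voters + 1 witness:

```python
def majority(members: int) -> int:
    """Smallest number of votes that is more than half of the voting members."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many voting members can be lost while a majority remains."""
    return members - majority(members)


if __name__ == "__main__":
    for label, members in [("2 voters", 2), ("2 voters + 1 witness", 3)]:
        print(f"{label}: majority={majority(members)}, "
              f"tolerates {tolerated_failures(members)} failure(s)")
```

With only 2 voting members the majority is still 2, so losing either node blocks elections. Adding the vote-only witness keeps the majority at 2 while raising the membership to 3, which leaves room for one failure.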

1. On node1, generate config
yantrikdb cluster init \
--node-id 1 \
--output /etc/yantrikdb.toml \
--data-dir /var/lib/yantrikdb \
--peers 192.168.1.2:7440 \
--witnesses 192.168.1.3:7440

Output:

config written to /etc/yantrikdb.toml
cluster_secret: ydb_cluster_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
(use this as the auth token from any client to access the default database)

Save the cluster_secret. You’ll need it on every other node and as the auth token from clients.

2. On node2, generate config with the same secret

yantrikdb cluster init \
--node-id 2 \
--output /etc/yantrikdb.toml \
--data-dir /var/lib/yantrikdb \
--peers 192.168.1.1:7440 \
--witnesses 192.168.1.3:7440 \
--secret <PASTE_SECRET_FROM_NODE1>

Create the default database:

yantrikdb db --data-dir /var/lib/yantrikdb create default

Start the witness on the third host (192.168.1.3 in this example):

yantrikdb-witness \
--node-id 99 \
--port 7440 \
--cluster-secret <PASTE_SECRET_FROM_NODE1> \
--state-file /var/lib/yantrikdb-witness/state.json

The witness needs no database, no config file, no embedding model — just the secret and a state file.

On node1 and node2:

yantrikdb serve --config /etc/yantrikdb.toml

After ~5 seconds, an election runs and one voter becomes leader.

yql --host 192.168.1.1 -t <cluster_secret>
yantrikdb> \cluster
node #1 — Leader
term: 1
leader: 1
healthy: yes | writable: yes
quorum: 2
+---------+-------------------+---------+-----------+------+----------+
| node_id | addr              | role    | reachable | term | last_seen|
+---------+-------------------+---------+-----------+------+----------+
| 2       | 192.168.1.2:7440  | voter   | ✓         | 1    | 0.5s ago |
| 99      | 192.168.1.3:7440  | witness | ✓         | 1    | 0.5s ago |
+---------+-------------------+---------+-----------+------+----------+

Kill the leader (Ctrl+C or systemctl stop yantrikdb).

Within 5–10 seconds:

  1. The other voter detects missed heartbeats
  2. Runs an election
  3. The witness grants its vote
  4. The follower promotes itself to leader

Check the surviving voter's role:

curl -s http://192.168.1.2:7438/v1/cluster | jq .role
# "Leader"

When the old leader rejoins, it sees the higher term and demotes itself to follower automatically.
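The same check can be scripted across all data nodes. This sketch assumes only what the curl example shows: that `GET /v1/cluster` on port 7438 returns JSON with a `role` field equal to `"Leader"` on the leader. The helper names and the injectable `fetch` parameter are illustrative choices made so the function can be exercised without a live cluster.

```python
import json
from typing import Callable, Iterable, Optional
from urllib.request import urlopen


def http_get_json(url: str) -> dict:
    """Fetch a URL and decode the JSON body (2-second timeout)."""
    with urlopen(url, timeout=2) as resp:
        return json.load(resp)


def find_leader(
    hosts: Iterable[str],
    fetch: Callable[[str], dict] = http_get_json,
) -> Optional[str]:
    """Return the first host whose /v1/cluster reports role == "Leader"."""
    for host in hosts:
        try:
            status = fetch(f"http://{host}:7438/v1/cluster")
        except OSError:
            continue  # node is down or unreachable; try the next one
        if status.get("role") == "Leader":
            return host
    return None


if __name__ == "__main__":
    # Requires a running cluster at these example addresses.
    print(find_leader(["192.168.1.1", "192.168.1.2"]))
```

During a failover test, calling `find_leader` in a loop until it returns a host gives a rough measurement of how long the election took.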

| Failure | Behavior |
| --- | --- |
| Leader voter dies | Other voter + witness elect new leader in <10s |
| Follower voter dies | Leader keeps writing (still has quorum with witness) |
| Witness dies | Both voters keep going, no new elections allowed |
| Witness + follower die | Leader becomes read-only (no quorum) |
| Network partition isolates a voter | Isolated voter loses quorum, becomes read-only |
| All nodes die | Restart any node; it loads persistent state and rejoins the cluster |
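Most rows in the failure table follow from one question: does the group of members that can still reach each other hold 2 of the 3 votes and include at least one data node? The sketch below models just that rule. The member names and functions are illustrative, and it deliberately ignores the table's finer points, such as the note about elections when the witness is down.

```python
MEMBERS = {"voter1", "voter2", "witness"}
DATA_NODES = {"voter1", "voter2"}
QUORUM = len(MEMBERS) // 2 + 1  # 2 of 3, matching "quorum: 2" in the \cluster output


def has_quorum(reachable: set[str]) -> bool:
    """True if this group of mutually reachable members holds a majority."""
    return len(reachable & MEMBERS) >= QUORUM


def can_accept_writes(reachable: set[str]) -> bool:
    """Writes need a quorum plus at least one data node to act as leader."""
    return bool(reachable & DATA_NODES) and has_quorum(reachable)


if __name__ == "__main__":
    scenarios = {
        "leader voter dies": {"voter2", "witness"},
        "follower voter dies": {"voter1", "witness"},
        "witness dies": {"voter1", "voter2"},
        "witness + follower die": {"voter1"},
        "partition isolates a voter": {"voter2"},
    }
    for name, alive in scenarios.items():
        state = "writable" if can_accept_writes(alive) else "read-only"
        print(f"{name}: {state}")
```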

To force a specific node to become leader (e.g. for maintenance):

yantrikdb cluster promote --url http://192.168.1.2:7438 -t <cluster_secret>

This triggers an election from that node.

When clustering is enabled, the cluster_secret doubles as a master Bearer token that works on any node in the cluster:

TOKEN=ydb_cluster_xxxxxxxx...
# This works whether node1 or node2 is leader
curl http://192.168.1.1:7438/v1/stats -H "Authorization: Bearer $TOKEN"
curl http://192.168.1.2:7438/v1/stats -H "Authorization: Bearer $TOKEN"
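For clients that are not shell scripts, the curl calls above translate directly into any HTTP library. Here is a sketch using Python's standard `urllib`. The `stats_request` helper is a name invented for this example; the endpoint, port, and header come from the curl commands.

```python
from urllib.request import Request


def stats_request(host: str, token: str) -> Request:
    """Build (but do not send) a GET /v1/stats request with a Bearer token."""
    return Request(
        f"http://{host}:7438/v1/stats",
        headers={"Authorization": f"Bearer {token}"},
    )


if __name__ == "__main__":
    token = "ydb_cluster_xxxxxxxx..."  # placeholder, as in the shell example
    for host in ("192.168.1.1", "192.168.1.2"):
        req = stats_request(host, token)
        print(req.get_method(), req.full_url)
        # To actually send it: urllib.request.urlopen(req, timeout=2)
```

Because the same token is valid on every node, a client can keep one credential and switch hosts freely after a failover.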

Per-node tokens (created with yantrikdb token create) still work for fine-grained access.

Full [cluster] section:

[cluster]
node_id = 1 # unique integer per node
role = "voter" # voter | read_replica | witness | single
cluster_port = 7440 # peer-to-peer port
heartbeat_interval_ms = 1000 # leader → follower heartbeat rate
election_timeout_ms = 5000 # follower → candidate transition delay
cluster_secret = "ydb_cluster_..."
replication_mode = "async" # async (default) or sync
[[cluster.peers]]
addr = "192.168.1.2:7440"
role = "voter"
[[cluster.peers]]
addr = "192.168.1.3:7440"
role = "witness"

Under the hood, every write is recorded as an oplog entry with a hybrid logical clock (HLC) timestamp. Followers continuously pull new ops from the leader and apply them locally via the same CRDT semantics that the engine already uses.

  • Add-wins set for memories (UUIDv7 keys, no collisions)
  • LWW for graph edges (HLC tiebreaker)
  • Set-union for consolidation
  • Forget always wins (tombstones are absolute)

This means the cluster converges naturally even after network partitions — there’s no manual conflict resolution needed.
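To make the convergence claim concrete, here is a deliberately simplified model of three of the rules listed above: union of adds for memories, last-writer-wins for edges, and absolute tombstones. Everything in it is an assumption for illustration, including the `Replica` class, the HLC modeled as a `(wall_ms, counter, node_id)` tuple, and the sample data. The engine's real data structures are not described on this page, and consolidation is not modeled.

```python
from dataclasses import dataclass, field

# HLC timestamp modeled as (wall_clock_ms, logical_counter, node_id).
# Tuples compare lexicographically, so node_id breaks exact ties.
Hlc = tuple[int, int, int]


@dataclass
class Replica:
    memories: dict[str, str] = field(default_factory=dict)  # memory id -> text
    edges: dict[tuple[str, str], tuple[Hlc, str]] = field(default_factory=dict)
    forgotten: set[str] = field(default_factory=set)  # tombstoned memory ids

    def merge(self, other: "Replica") -> None:
        """Fold another replica's state into this one."""
        # Tombstones: set union, and they override any add.
        self.forgotten |= other.forgotten
        # Memories: union of adds (ids are unique, so values never conflict).
        for mem_id, text in other.memories.items():
            self.memories.setdefault(mem_id, text)
        # Forget always wins: drop anything that has a tombstone.
        for mem_id in self.forgotten:
            self.memories.pop(mem_id, None)
        # Edges: last-writer-wins on the HLC timestamp.
        for key, (ts, label) in other.edges.items():
            if key not in self.edges or ts > self.edges[key][0]:
                self.edges[key] = (ts, label)


def demo() -> tuple[Replica, Replica]:
    """Two replicas diverge during a partition, then merge in both directions."""
    a, b = Replica(), Replica()
    a.memories["m1"] = "first note"
    b.memories["m2"] = "second note"
    b.forgotten.add("m1")  # b forgot m1 while partitioned
    a.edges[("x", "y")] = ((1000, 0, 1), "knows")
    b.edges[("x", "y")] = ((1000, 1, 2), "works_with")  # later HLC wins
    a.merge(b)
    b.merge(a)
    return a, b


if __name__ == "__main__":
    a, b = demo()
    print(a == b, a.memories, a.edges[("x", "y")][1])
```

Both replicas end in the same state regardless of merge order, which is the property that removes the need for manual conflict resolution.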

For a deeper dive, see the Raft-lite design.