mcp-memory-graph

by YonasValentin

io.github.YonasValentin/mcp-memory-graph

Local-first memory for Claude Code and any MCP client: hybrid search + knowledge graph, $0/token.

mcp-memory-graph · v2.10.1

MCP Memory Graph

A memory server for Claude Code and any other MCP client. It gives your AI assistant a permanent, searchable memory that lives in one SQLite file on your machine. Store a decision today, ask about it next month, and the answer comes back. Everything runs locally: the embedding model, the search index, the knowledge graph. No cloud account, no API key, no per-token cost.

License: source-available and free for noncommercial use (PolyForm Noncommercial 1.0.0): personal projects, hobby, study, research, charity, education, and government. Commercial use requires a paid license (COMMERCIAL.md).

Who it's for: developers who want Claude (or Cursor, Codex, any MCP client) to remember decisions across sessions. Solo builders and hobbyists use it free. Teams share a knowledge base over git. It also suits anyone who wants to replace a cloud memory service (mem0, Zep, Letta, Supermemory) with something that runs entirely on their own machine.

What it looks like

Run npx mcp-memory-graph serve and you get a local web dashboard for browsing and searching your memory outside Claude.

The dashboard: memory counts, breakdowns by scope, department, and type, and the most recent memories

Search works by meaning, not keywords. The query below ("how do we handle payments") finds the Stripe, GDPR, and Postgres notes even though none of them contains that phrase — each result carries a confidence score and a match-type badge:

Semantic search results with confidence and match-type badges

Browse and sort the whole store in one table — scope, type, tags, quality score, and how often each memory has been read:

Sortable table of all stored memories

How it compares

mem0, Zep, Letta, and Supermemory are the usual names for AI memory, and several of them have open-source cores. This one is built around a different default: nothing leaves your machine and there's no infrastructure to run.

| | MCP Memory Graph | Typical hosted memory service |
| --- | --- | --- |
| Where it runs | One SQLite file on your machine | A managed cloud service (some also self-host) |
| Embeddings | Local model in Node (MiniLM), no API key | Usually a cloud embedding API |
| Cost per token | $0; nothing is metered | Usage-based, or a server you operate |
| Extra infrastructure | None | Often Postgres/pgvector, Redis, or a Python service |
| Claude Code integration | First-class: hooks auto-capture and recall | Manual wiring |
| Benchmarks | Committed corpus + runner, reproducible locally | Mostly self-reported |

The trade-off is honest: a single-process SQLite server tops out in the low hundreds of thousands of vectors (see Limitations), and a hosted service will scale past that without you thinking about it. If you're a solo developer or a small team who wants memory that's private, free, and zero-ops, that ceiling is rarely the thing you hit first.

Why this exists

AI assistants forget everything between sessions. Your decisions, your patterns, the bug you fixed last Tuesday: all gone when the conversation ends. This server fixes that.

  • Knowledge stored today is searchable tomorrow, next week, next year.
  • Search works by meaning, not just keywords. "contract notice period" finds "90-day renewal clause".
  • It improves itself. It tracks what gets used, scores quality, extracts learnings from your sessions, and cleans itself up on a schedule.
  • It stays private. Local embeddings, no cloud APIs, no telemetry. The one exception is the optional Stop hook, which sends your session transcript to your own locally installed Claude Code (claude -p) for learning extraction. You can turn that off with review_on_stop: false.
  • It works for any kind of knowledge. Engineers store architecture decisions, lawyers store contract patterns, accountants store audit procedures.

Quick start (about 5 minutes)

You need Node.js 20 or newer and Claude Code installed.

1. Get the server. From npm (easiest):

npm install -g mcp-memory-graph

Or from source:

git clone https://github.com/YonasValentin/mcp-memory-graph.git
cd mcp-memory-graph
npm install
npm run build

2. Register the server with Claude Code (optional — init in step 3 does this for you at user scope):

# npm install:
claude mcp add memory-server -- npx -y mcp-memory-graph

# from source:
claude mcp add memory-server node /path/to/mcp-memory-graph/dist/index.js

3. Install the hooks (recommended):

npx mcp-memory-graph init

This is the one command that wires everything up: it registers the MCP server (user scope), installs the auto-capture/recall hooks and the usage skill, writes config, and schedules a nightly cleanup. Answer the prompts, or pass --yes to accept the defaults. (Skip the auto-registration with --no-register if you manage claude mcp yourself.)

4. Try it. Open a Claude Code session and say:

Remember this: we use Postgres for the main app database. Decided 2026-06-01,
because we need JSONB and full-text search in one place.

Then, in a later session:

What database did we decide to use, and why?

Claude searches its memory and answers with the stored decision. That's the whole loop.

5. Verify the install. Ask Claude:

What memory tools do you have available?

It should list all 51 tools (45 memory_*, 3 vault_*, 3 core_memory_*).

The first time a memory tool runs, the embedding model (about 30 MB) downloads from HuggingFace and is cached at ~/.cache/huggingface/. Every start after that is instant.

To undo everything: npx mcp-memory-graph uninstall.

Upgrading

npm install -g mcp-memory-graph@latest   # or just let `npx -y mcp-memory-graph` pull it
npx mcp-memory-graph init                # re-run to refresh on-disk hooks + the nightly schedule

Upgrading the package updates the code that runs each session (hooks, tools, the server), so server-side fixes apply the next time a tool runs — nothing else needed for those.

But files that init wrote earlier are not rewritten by a package upgrade: the Claude Code hook registrations in settings.json and the macOS launchd plist at ~/Library/LaunchAgents/com.mcp-memory.consolidate.plist. If you installed before 2.6.3, that plist invoked a bare node, which launchd (whose minimal PATH excludes nvm) could not resolve, so the nightly consolidation silently never fired. Re-run npx mcp-memory-graph init once after upgrading to regenerate the plist with an absolute node path and an output log. Then verify that it runs:

launchctl start com.mcp-memory.consolidate
cat ~/.mcp-memory/consolidation.log      # should show a "Consolidation complete" report

To clear conflict noise that accumulated while the job wasn't running: npx mcp-memory-graph consolidate.

How it works, in plain terms

When you store a memory, the server turns the text into a vector (a list of 384 numbers that captures its meaning) using a small model that runs inside Node.js. It also indexes the text for keyword search. Both live in one SQLite file, by default at ~/.mcp-memory/memory.db.

When you search, the server runs both kinds of search at once, merges the rankings, and returns the best matches with a confidence label. A second model can then re-sort the top results for better precision (this is the reranker, on by default for MCP clients, and it costs about 200 ms).
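
The "merges the rankings" step is named elsewhere in this README as Reciprocal Rank Fusion. A minimal sketch of that technique follows; it is not the server's code. The constant k = 60 is the value commonly used for RRF and is an assumption here, and the memory ids are made up for illustration.

```typescript
// Reciprocal Rank Fusion: every ranking contributes 1 / (k + rank) to an item's score.
// k = 60 is the conventional default and an assumption, not a documented server setting.
function reciprocalRankFusion(rankings: string[][], k = 60): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical ids: one ranking from vector search, one from keyword search.
const vectorHits = ["m-stripe", "m-gdpr", "m-postgres"];
const keywordHits = ["m-postgres", "m-stripe", "m-invoices"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
console.log(fused.map(([id]) => id));
// → ["m-stripe", "m-postgres", "m-gdpr", "m-invoices"]
```

Items that both searches agree on rise to the top, while an item found by only one search still makes the list.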

On top of that sits a knowledge graph: memories link to entities and to each other, so the server can answer questions that need more than one hop, like "what does the payment service depend on?". A nightly "dream cycle" deduplicates, re-scores, prunes, and reports gaps.

The benchmarks, and how to read them

Every number below was produced locally: real embedding model, real production handlers, no network. You can rerun all of them on your own machine.

A quick primer if benchmarks are new to you. A gold set is a list of questions where the right answer is known in advance. Precision@1 asks: was the top result the right one? Recall@5 asks: was the right answer anywhere in the top 5? MRR (mean reciprocal rank) rewards putting the right answer near the top. The reranker is a second model that re-sorts the top 50 results; it is slower but noticeably more accurate.
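
For readers who prefer code to definitions, here is a small sketch of how these three metrics can be computed when each question has exactly one right answer. The GoldResult shape and the sample data are hypothetical; the project's own runner is documented in docs/BENCHMARKS.md.

```typescript
// One gold-set entry: the ranked ids a search returned, and the id known to be correct.
type GoldResult = { ranked: string[]; correct: string };

// Fraction of questions whose correct answer appears in the top k results.
// With one correct answer per question, k = 1 gives Precision@1 and k = 5 gives Recall@5.
function hitRateAtK(results: GoldResult[], k: number): number {
  const hits = results.filter((r) => r.ranked.slice(0, k).indexOf(r.correct) !== -1).length;
  return hits / results.length;
}

// Mean reciprocal rank: 1 / rank of the correct answer, or 0 when it is missing.
function meanReciprocalRank(results: GoldResult[]): number {
  const total = results.reduce((sum, r) => {
    const index = r.ranked.indexOf(r.correct);
    return sum + (index === -1 ? 0 : 1 / (index + 1));
  }, 0);
  return total / results.length;
}

// Four made-up questions: right at rank 1, right at rank 2, missing, right at rank 1.
const sampleRun: GoldResult[] = [
  { ranked: ["a", "b", "c"], correct: "a" },
  { ranked: ["x", "b", "c"], correct: "b" },
  { ranked: ["x", "y", "z"], correct: "q" },
  { ranked: ["c", "d"], correct: "c" },
];
console.log(hitRateAtK(sampleRun, 1)); // → 0.5
console.log(hitRateAtK(sampleRun, 5)); // → 0.75
console.log(meanReciprocalRank(sampleRun)); // → 0.625
```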

Local gold set

| | precision@1 | precision@3 | MRR | search p95 |
| --- | --- | --- | --- | --- |
| Hybrid (RRF) | 0.563 | 0.750 | 0.704 | ~4 ms |
| + cross-encoder rerank (MCP default) | 0.813 | 0.875 | 0.867 | ~230 ms |

Reproduce with npm run bench. Full methodology, the gold set itself, and every miss are printed and documented in docs/BENCHMARKS.md.

Scale

With the real embedder and a file-backed SQLite database, retrieval p95 is 9.1 ms at 10,000 vectors and 30 ms at 50,000. The rerank pass adds a roughly constant 200 ms on top. Most memory products publish self-reported, cloud-hosted numbers; these are measured locally and reproducible from a committed corpus and runner.

Public benchmarks

Four public memory benchmarks, run untuned (stock MiniLM embedder, production handlers, zero benchmark-specific tweaks) and compared against MemPalace. The untuned results lead on three and trail MemPalace's tuned MemBench score by 1.6 points:

| Benchmark | Our result | Comparison |
| --- | --- | --- |
| LongMemEval-S | R@5 = 97.8% | vs 96.6% published |
| ConvoMem | R@10 = 93.5% | vs 92.9% |
| LOCOMO | session R@10 = 82.2%, R@50 = 100% | vs 60.3% baseline |
| MemBench | hit@5 = 78.7% | vs their 80.3% tuned |

Run them yourself: npm run bench:longmemeval, bench:locomo, bench:convomem, bench:membench. The honest notes (where the reranker helps and where it hurts, the dedup floor on MemBench, gold-set size caveats) are in docs/BENCHMARKS.md.

Features

Core

  • 51 MCP tools: CRUD and retrieval, a confidence-tagged knowledge graph, a self-correcting write gate, signed provenance and verification, an event bus with SSRF-guarded webhooks, change propagation and advisor surfaces, resumable session state, expertise profiles, memory tiers, Obsidian vault round-tripping, and GDPR-grade forget and history. Full list below.
  • Hybrid search: vector similarity (meaning) plus keyword matching (exact terms), merged with Reciprocal Rank Fusion. rerank: true adds the cross-encoder pass. use_graph: true blends in HippoRAG Personalized PageRank multi-hop scores. as_of: <timestamp> searches the graph as it stood at a past moment.
  • Local embeddings: Transformers.js running all-MiniLM-L6-v2 (384 dimensions) inside Node.js. No Python, no cloud API, no GPU.
  • SQLite storage: one file, using better-sqlite3 with two extensions: sqlite-vec for vector nearest-neighbor search, FTS5 for keyword search with BM25 ranking.
  • Structure-aware chunking: text splits on paragraphs, markdown on headings (heading context preserved in each chunk), code on function and class boundaries, legal on sentences.
  • Scopes: organize memories into global, project, user, team, department.
  • Version history: every update saves the previous version. Full audit trail of who changed what, when.
  • Temporal decay: optional time-based scoring that favors recent memories (exponential or linear).
  • Confidence scoring: every result carries a 0 to 1 confidence and a plain label (high, medium, low).
  • Expiration: time-sensitive memories can carry an expiry date and drop out of search automatically.
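
To make the temporal decay bullet concrete, the sketch below shows one way exponential and linear decay can produce a multiplier for a memory's score. The parameter names halfLifeDays and maxAgeDays and their defaults are hypothetical; the README does not state the server's actual settings or curve shapes.

```typescript
type DecayMode = "exponential" | "linear";

// Returns a multiplier in [0, 1] for a memory that is ageDays old.
// halfLifeDays and maxAgeDays are hypothetical parameters, not the server's config keys.
function decayFactor(
  ageDays: number,
  mode: DecayMode,
  halfLifeDays = 30,
  maxAgeDays = 365
): number {
  if (ageDays <= 0) return 1;
  if (mode === "exponential") {
    return Math.pow(0.5, ageDays / halfLifeDays); // halves every halfLifeDays
  }
  return Math.max(0, 1 - ageDays / maxAgeDays); // straight line down to 0 at maxAgeDays
}

console.log(decayFactor(30, "exponential")); // → 0.5
console.log(decayFactor(730, "linear")); // → 0
```

Exponential decay never reaches zero, so old memories stay findable; linear decay drops them out entirely after a fixed age.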

Self-improvement

  • Access tracking: every search, get, and related-memory call records which memories were touched.
  • Quality scoring: automatic importance_score and confidence_score on every memory, from access frequency, recency, and content signals.
  • Learning extraction: at session end, a headless claude -p reviews the transcript and stores zero to five curated learnings. (This replaces the older type: "agent" Stop hook, which is silently broken on macOS; see anthropics/claude-code#39184.)
  • Dream cycle: scheduled or on-demand deduplication, re-scoring, pruning, expiry enforcement, and knowledge-gap detection.
  • Gap detection: searches that return nothing are logged, so you can see what knowledge is missing.

Claude Code hooks

Five opt-in hooks, installed by init:

| Hook | When it fires | What it does |
| --- | --- | --- |
| SessionStart | session begins | Status check (memory count, expired, stale docs) and surfaces the top memories for the project |
| UserPromptSubmit | each prompt that carries a task signal (a ticket/PR id or ≥2 keywords) | Keyword-searches the store and surfaces matching memories so you recall prior work before re-deriving it; stays silent on trivial prompts |
| PostToolUse | after a memory search | Tracks hits and misses to search-log.jsonl |
| PreCompact | before context compression | Optional learning extraction (off by default) |
| Stop | session ends | Spawns headless claude -p to review the session and store learnings |

The Stop hook detaches in about 30 ms and reviews in the background for 10 to 60 seconds. It needs the claude CLI on $PATH (or $CLAUDE_BIN), authenticated. Turn it off with review_on_stop: false in ~/.mcp-memory/config.json.

Metadata on every memory

| Field | Purpose | Examples |
| --- | --- | --- |
| scope | Isolation level | global, project, user, team, department |
| namespace | Sub-scope grouping | "my-project", "legal-team", "q4-audit" |
| department | Organizational unit | legal, engineering, hr, sales, finance |
| document_type | Content classification | contract, policy, code, incident, decision, report |
| access_level | Data sensitivity | public, internal, confidential, restricted |
| tags | Flexible categorization | ["renewal", "notice-period", "compliance"] |
| language | Content language (ISO 639-1) | "en", "da", "de" |
| source | Origin | file path, URL, system name |
| author | Creator | person or system name |
| metadata | Domain-specific JSON | {contract_type: "NDA", parties: ["A","B"]} |
| expires_at | Auto-expiration date | ISO 8601 timestamp |

scope and namespace group content within one database. For multi-tenant use, pinning MCP_API_NAMESPACE on a shared database gives supported per-namespace isolation (schema v14); a separate database file per tenant is the strongest boundary. See docs/MULTI-TENANCY.md.

Knowledge graph and bi-temporal model

  • Bi-temporal validity: every memory carries valid-time (valid_from, valid_to) alongside transaction-time. Updates invalidate rather than delete: the prior fact gets a valid_to stamp instead of being overwritten, so history is never lost. Reads default to currently valid rows but accept as_of: <timestamp> for point-in-time recall. memory_history returns one memory's full timeline.
  • Confidence-tagged links: memories connect via wikilink, co-occurrence, and similarity edges, each with a confidence weight. memory_graph traverses entities and relationships up to 3 hops. memory_extract_entities stores LLM-extracted entities and relationships.
  • HippoRAG multi-hop: use_graph: true on search runs Personalized PageRank over the entity and link graph for associative retrieval.
  • Token-budgeted answers: memory_query answers a question with a tight subgraph. It seeds from hybrid search, walks the graph up to max_hops while avoiding hubs, and returns a token-budgeted context string instead of flooding the window.
  • Communities: memory_communities finds densely connected entity clusters, for "what are the main themes in here?" questions.
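
The "invalidate rather than delete" rule and as_of reads can be illustrated with a small in-memory sketch. It borrows the valid_from and valid_to field names from the bullet above; the row shape, the numeric timestamps, and both function names are hypothetical and say nothing about the real schema.

```typescript
// valid_from / valid_to mirror the field names above; the rest of the shape is hypothetical.
type FactRow = { id: string; content: string; valid_from: number; valid_to: number | null };

// An update closes the currently valid row and appends a new one, so nothing is overwritten.
function supersede(rows: FactRow[], id: string, content: string, now: number): FactRow[] {
  const closed: FactRow[] = rows.map((r) =>
    r.id === id && r.valid_to === null ? { ...r, valid_to: now } : r
  );
  return closed.concat([{ id, content, valid_from: now, valid_to: null }]);
}

// Rows valid at asOf: started on or before it, and not yet closed at that moment.
function validAt(rows: FactRow[], asOf: number): FactRow[] {
  return rows.filter((r) => r.valid_from <= asOf && (r.valid_to === null || asOf < r.valid_to));
}

// Timestamps are plain numbers here to keep the example short.
const initialFacts: FactRow[] = [{ id: "db", content: "MySQL", valid_from: 100, valid_to: null }];
const updatedFacts = supersede(initialFacts, "db", "Postgres", 200);
console.log(validAt(updatedFacts, 150).map((r) => r.content)); // → ["MySQL"]
console.log(validAt(updatedFacts, 250).map((r) => r.content)); // → ["Postgres"]
```

Both rows survive the update, which is what lets a point-in-time read return the answer that was true then.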

Self-correcting writes

  • Write gate: stores route through an ADD, UPDATE, DELETE, or NOOP decision (on_conflict), so new facts reconcile with existing ones instead of piling up duplicates.
  • Contradiction detection: a cross-encoder NLI model flags when an incoming memory contradicts something already stored.
  • Forgetting curve: memories carry a stability signal, so rarely reinforced knowledge slowly sinks in ranking, the way human memory fades.

Agent-OS memory

  • Core memory block: a small, bounded, always-in-context note per (scope, namespace) that the agent maintains itself (core_memory_get, core_memory_append, core_memory_replace). Appends that would overflow are refused, which forces deliberate compaction.
  • Tiers: memory_tiers reports a MemGPT-style hot / recall / archival distribution and lists the hot working set.
  • Reflection: memory_reflect gathers the most reflection-worthy memories and, in store mode, persists synthesized insights linked back to their sources.
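
The refuse-on-overflow behaviour of the core memory block is easy to picture with a toy class. This is a sketch of the idea only: BoundedNote, its maxChars limit, and the character-based size check are all assumptions, since the README does not say how the real block measures its bound.

```typescript
// A bounded note. maxChars is a hypothetical limit; the real block size is not stated here.
class BoundedNote {
  private text = "";
  constructor(private readonly maxChars: number) {}

  // Refuses (returns false) instead of truncating when the append would overflow.
  append(line: string): boolean {
    const next = this.text === "" ? line : this.text + "\n" + line;
    if (next.length > this.maxChars) return false;
    this.text = next;
    return true;
  }

  // Swaps every occurrence of oldText for newText, e.g. to compact the note.
  replace(oldText: string, newText: string): boolean {
    if (this.text.indexOf(oldText) === -1) return false;
    const next = this.text.split(oldText).join(newText);
    if (next.length > this.maxChars) return false;
    this.text = next;
    return true;
  }

  get(): string {
    return this.text;
  }
}

const note = new BoundedNote(24);
console.log(note.append("uses Postgres")); // → true
console.log(note.append("deploys Friday")); // → false (would need 28 characters)
console.log(note.replace("uses Postgres", "db: pg")); // → true, compaction frees space
console.log(note.append("deploys Friday")); // → true
```

Because the failed append changes nothing, the caller has to compact before it can add more, which is the deliberate pressure the bullet describes.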

Obsidian vault

  • Bidirectional sync: vault_sync reads a vault in. memory_export_vault writes memories out as .md files with YAML frontmatter that round-trips losslessly for every authored field (id, scope, namespace, tags, access_level, importance, timestamps). Two derived scores are not in the frontmatter and reset on re-import: confidence_score (to 0.6) and stability (to 1.0). Use memory_export (JSON) for a byte-perfect backup. One metadata key is reserved: metadata._vault holds internal sync bookkeeping and never appears in tool output or exported files.
  • JSON Canvas: memory_canvas exports the graph as a JSON Canvas 1.0 .canvas file that opens as a spatial board in Obsidian.
  • Read-only wiki: serve exposes /publish/:namespace (index, page, search, graph) as a read-only wiki. It is deliberately not behind bearer auth, but is hard-scoped to published access levels (MCP_PUBLISH_ACCESS_LEVELS, default public).
  • Session notes and templates: memory_session_note appends to one "daily note" per session. memory_template returns structured note scaffolds per document type.

Team and solo sharing (git)

  • memory init wizard: interactive setup (or --yes for defaults) that writes ~/.mcp-memory/config.json (or project-scoped config) plus the Claude Code wiring.
  • Committable graph artifact: memory export-graph writes a deterministic memory-graph.json you can commit and share. memory git-setup installs a .gitattributes entry and the memory-union merge driver so parallel commits merge instead of conflict.
  • Attribution: set MCP_AGENT_ID (or pass agent_id per store) and memory_attribution reports how many valid memories each agent wrote.

Trust and governance

  • Questions to ask: memory_questions surfaces what the graph is well placed to find: ambiguous links to confirm, frequently mentioned but under-documented entities, orphaned and stale memories.
  • GDPR-grade forget: memory_forget soft-deletes by default (a tombstone via valid_to, recoverable, still visible via as_of). With hard: true it returns a portability export first, then permanently erases. memory_delete is unchanged.
  • Output sanitization: every tool result passes through one chokepoint that strips ANSI and VT escapes, control characters, and zero-width or BiDi Trojan-Source spoofing before it leaves the server. Stored content stays raw at rest.
  • Hot reload: config changes apply without a restart.
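
As an illustration of the output sanitization bullet, the sketch below strips the same three families of characters with regular expressions. The exact character set the server removes is not listed in this README, so the ranges here are an assumption, and sanitizeOutput is a hypothetical name.

```typescript
// Strips terminal escapes, control characters, and invisible or BiDi code points.
// The ranges below are an assumption, not the server's actual list.
function sanitizeOutput(text: string): string {
  return text
    // ANSI/VT CSI sequences such as "\x1b[31m": ESC [ parameters intermediates final-byte
    .replace(/\x1b\[[0-?]*[ -\/]*[@-~]/g, "")
    // Remaining C0/C1 control characters, keeping tab, newline, and carriage return
    .replace(/[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]/g, "")
    // Zero-width characters and BiDi controls used in Trojan-Source style spoofing
    .replace(/[\u200b-\u200f\u202a-\u202e\u2060\u2066-\u2069\ufeff]/g, "");
}

console.log(sanitizeOutput("\x1b[31mred\x1b[0m")); // → "red"
console.log(sanitizeOutput("a\u202eb\u200bc")); // → "abc"
```

Running one function on every result at the exit point, rather than at write time, matches the note that stored content stays raw at rest.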

Web dashboard

The server ships a browser dashboard for viewing and managing memories outside Claude. It runs on the same Express server as the MCP HTTP transport, so there is no separate process.

Six pages:

  • Dashboard: memory counts, content size, breakdowns by scope, department, and type, plus the 10 most recently updated memories.
  • Search: hybrid search with confidence and match-type badges, and instant fuzzy suggestions as you type.
  • Browse: sortable, paginated table of all memories with scope filtering and quality indicators.
  • Memory detail: full content, metadata, version history, related memories, inline edit and delete.
  • Knowledge graph: D3 force-directed view. Nodes sized by importance, colored by scope. Zoom, pan, drag, double-click to navigate.
  • Tools: a console for the full tool surface. It lists every tool the server advertises, renders a form from each schema, and runs it over the authenticated MCP endpoint. Destructive tools ask for confirmation first.

Tech: React 19, Vite, Tailwind CSS v4, shadcn/ui, Fuse.js, D3, Recharts.

Run it:

# Development (hot reload)
npm run build && npm run serve   # Terminal 1: server on :3100
npm run dev:web                   # Terminal 2: Vite on :5173 (proxies /api to :3100)

# Production (single process)
npm run build:all                 # Builds server + frontend
npm run serve                     # http://localhost:3100 serves both API and UI

Docker: the image includes the built frontend. After docker compose up, the dashboard is at http://<host>:3200 alongside the MCP endpoint. Team members can browse the shared store from any browser, no Claude Code required.

REST API (16 endpoints)

The REST surface is for reading and managing. Creating memories goes through MCP (memory_store over POST /mcp); there is deliberately no POST /api/memories.

| Method | Path | Description |
| --- | --- | --- |
| GET | /api/stats | Memory counts and breakdowns |
| GET | /api/search?q=... | Hybrid search with filters |
| GET | /api/memories | List with pagination and sorting |
| GET | /api/memories/:id | Single memory with metadata |
| GET | /api/memories/:id/versions | Version history |
| GET | /api/memories/:id/related | Semantically related memories |
| PATCH | /api/memories/:id | Update content or metadata |
| DELETE | /api/memories/:id | Delete a memory |
| GET | /api/graph | Nodes and edges for graph visualization |
| GET | /api/manifest | Integrity manifest (merkle root plus per-memory hashes) |
| GET | /api/insights | Trends and themes summary |
| GET | /api/health | Knowledge-gap report (recurring zero-result searches) |
| GET | /api/webhooks | List webhook targets (gated by MCP_WEBHOOKS) |
| POST | /api/webhooks | Register an SSRF-validated outbound target |
| DELETE | /api/webhooks/:id | Remove a webhook target |
| POST | /api/webhooks/dispatch | Drain the durable, HMAC-signed delivery queue |

The first nine are what the dashboard uses. All REST endpoints call the same handlers as the MCP tools; no business logic is duplicated.
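
A minimal sketch of calling the search endpoint from a script, using the fetch that ships with Node 20. Only the path and the q parameter come from the table above. The base URL, the use of a bearer token on REST calls, and the response shape (returned untyped here) are assumptions to check against your own server.

```typescript
// Builds the URL for GET /api/search. baseUrl is whatever host and port your server uses.
function buildSearchUrl(baseUrl: string, query: string): string {
  const trimmed = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return `${trimmed}/api/search?q=${encodeURIComponent(query)}`;
}

// Sends the request. Passing a bearer token is an assumption about how REST auth works.
async function searchMemories(baseUrl: string, query: string, token?: string): Promise<unknown> {
  const headers: Record<string, string> = {};
  if (token) headers["Authorization"] = `Bearer ${token}`;
  const response = await fetch(buildSearchUrl(baseUrl, query), { headers });
  if (!response.ok) throw new Error(`search failed: HTTP ${response.status}`);
  return response.json(); // shape not documented here, so left as unknown
}

console.log(buildSearchUrl("http://localhost:3100/", "how do we handle payments"));
// → "http://localhost:3100/api/search?q=how%20do%20we%20handle%20payments"

// Usage against a running server:
// searchMemories("http://localhost:3100", "how do we handle payments").then(console.log);
```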

Self-improvement in detail

The server tracks how knowledge is used, scores quality, learns from sessions, and consolidates itself over time.

The learning loop

 ┌──────────────────────────────────────────────────────────┐
 │                    SESSION                                │
 │  Claude searches → access_count++ on matched memories     │
 │  Claude stores   → new memory with initial scores         │
 │  Zero results    → knowledge gap recorded                 │
 └─────────────┬────────────────────────────────────────────┘
               │
               ▼
 ┌──────────────────────────────────────────────────────────┐
 │            SESSION END (Stop command hook)                │
 │  Hook spawns detached `claude -p` headless review         │
 │  --allowedTools restricts to memory_store only            │
 │  Claude judges → 0-5 curated entries via memory_store     │
 │  Deduplicates against existing memories                   │
 └─────────────┬────────────────────────────────────────────┘
               │
               ▼
 ┌──────────────────────────────────────────────────────────┐
 │              DREAM CYCLE (nightly or manual)              │
 │  1. Score    : Recalculate importance from access data    │
 │  2. Expire   : Enforce expiration dates                   │
 │  3. Prune    : Remove low-quality, never-accessed items   │
 │  4. Dedup    : Merge near-duplicate memories              │
 │  5. Gaps     : Surface zero-result search patterns        │
 └──────────────────────────────────────────────────────────┘

Quality scoring

Every memory gets an importance_score between 0 and 1:

importance = 0.3 * current_score + 0.4 * normalized_access_frequency + 0.3 * recency_factor

Recency factor:

| Age | Factor |
| --- | --- |
| < 7 days | 1.0 |
| < 30 days | 0.7 |
| < 90 days | 0.4 |
| > 90 days | 0.1 |

Memories that are never accessed gradually lose importance. Auto-extracted memories start lower and get pruned if they never prove useful.
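
The formula and the recency table translate directly into code. In this sketch the function names are illustrative, normalizedAccessFrequency is assumed to arrive already scaled to the 0 to 1 range (the README does not say how it is normalized), and an age of exactly 90 days, which the table leaves open, is placed in the oldest bucket.

```typescript
// Recency factor from the table above. Exactly 90 days is not covered by the table;
// it is treated as the oldest bucket here.
function recencyFactor(ageDays: number): number {
  if (ageDays < 7) return 1.0;
  if (ageDays < 30) return 0.7;
  if (ageDays < 90) return 0.4;
  return 0.1;
}

// importance = 0.3 * current_score + 0.4 * normalized_access_frequency + 0.3 * recency_factor
function recomputeImportance(
  currentScore: number,
  normalizedAccessFrequency: number, // assumed to be pre-scaled to [0, 1]
  ageDays: number
): number {
  return 0.3 * currentScore + 0.4 * normalizedAccessFrequency + 0.3 * recencyFactor(ageDays);
}

// Never accessed, 120 days old: 0.3 * 0.5 + 0.4 * 0 + 0.3 * 0.1, about 0.18
console.log(recomputeImportance(0.5, 0, 120).toFixed(2)); // → "0.18"
// Heavily accessed, 3 days old: 0.3 * 0.5 + 0.4 * 1 + 0.3 * 1.0, about 0.85
console.log(recomputeImportance(0.5, 1, 3).toFixed(2)); // → "0.85"
```

Two memories that both start at 0.5 end up far apart after a single recompute, which is how unused items drift toward the pruning stage.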

Note on access reinforcement. The formula above is the periodic recompute run by the consolidate Score stage. Each read (memory_get, memory_search, memory_related) also applies a small immediate boost (importance_score += 0.03, capped at 1.0), and search uses importance as a mild rank multiplier (1 + importance * 0.5). A memory read 20 or more times approaches the ceiling from reads alone, and consolidate re-baselines it on the next run. This popularity weighting is intentional. If you want a fixed value that reads don't drift, set an explicit importance_score on memory_store or memory_update.
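
The two numbers in this note can be sketched as follows. The increment, the cap, and the multiplier are taken from the text; the starting score of 0.5, the function names, and the idea that the multiplier applies to a base relevance score are assumptions for illustration.

```typescript
// Immediate boost applied on each read: +0.03, capped at 1.0.
function boostOnRead(importance: number): number {
  return Math.min(1.0, importance + 0.03);
}

// Search scales a result by (1 + importance * 0.5).
// Treating the scaled value as a base relevance score is an assumption.
function rankScore(relevance: number, importance: number): number {
  return relevance * (1 + importance * 0.5);
}

// Starting from a hypothetical 0.5, twenty reads hit the 1.0 ceiling.
let importanceAfterReads = 0.5;
for (let read = 0; read < 20; read++) {
  importanceAfterReads = boostOnRead(importanceAfterReads);
}
console.log(importanceAfterReads); // → 1

// At the ceiling the multiplier is 1.5; at importance 0 it is 1.0.
console.log(rankScore(0.8, importanceAfterReads).toFixed(2)); // → "1.20"
console.log(rankScore(0.8, 0).toFixed(2)); // → "0.80"
```

The multiplier ranges from 1.0 to 1.5, so popularity can reorder close results but cannot lift a weak match over a much stronger one.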

Knowledge gap detection

When a search returns nothing, the query is logged. The dream cycle's gap stage surfaces these, so you can see what's missing from the store.

Installation reference

Prerequisites

  • Node.js 20+, for any client.
  • An MCP client. Claude Code is the first-class experience; the automatic capture and recall hooks are Claude-Code-only. Other MCP clients (Codex, Cursor, and the rest) get all 51 tools but drive them manually. See "Other MCP clients" below.
  • For the Stop hook only: the claude binary on $PATH (or $CLAUDE_BIN), authenticated without prompting. Optional; disable with review_on_stop: false.

What init does

npx mcp-memory-graph init                  # user scope: hooks apply to all projects
npx mcp-memory-graph init --scope project  # this project only

User scope writes hooks to ~/.claude/settings.json, so they fire in every Claude Code session. Project scope writes hooks to .claude/settings.json in the current directory and creates .mcp.json for automatic server discovery; collaborators who clone the project get the memory server registered automatically.

Init does seven things:

  1. Verifies the hook scripts exist in dist/hooks/.
  2. Registers the five hooks in settings.json.
  3. Creates the config file with sensible defaults: ~/.mcp-memory/config.json (user scope) or <project>/.mcp-memory/config.json (project scope; the generated .mcp.json pins it via MCP_MEMORY_CONFIG_PATH).
  4. Writes memory usage instructions to .claude/CLAUDE.md (project scope) or prints a snippet (user scope).
  5. Registers the MCP server with Claude Code — user scope runs claude mcp add -s user memory-server -- npx -y mcp-memory-graph for you (idempotent; best-effort — warns with the manual command if the claude CLI isn't on PATH; skip with --no-register). Project scope is registered via the committable .mcp.json instead. This makes step 2 of the Quick Start optional.
  6. Installs the mcp-memory-graph usage skill into ~/.claude/skills/ so Claude Code has inline guidance for all 51 tools, gotchas, and workflows. Skip with --no-skill.
  7. Sets up the nightly consolidation schedule (macOS: launchd, loaded immediately so it runs without a relogin; Linux: prints a cron suggestion; skipped for project scope).

Under a non-interactive shell (agent/CI) the wizard is bypassed: defaults are applied and a report is printed showing what was set and how to change each value. Passing --yes applies the defaults silently (no report).

Key flags: --scope user|project, --schedule HH:MM[,HH:MM] (nightly consolidation time, default 03:00), --vault <path> (enable Obsidian vault round-trip), --no-review-on-stop (disable the end-of-session learning review), --no-skill (skip skill install), --no-register (skip the user-scope claude mcp add), --remote <url> (team server mode).

npx mcp-memory-graph uninstall reverses everything init did: removes hooks, the nightly schedule, the CLAUDE.md block, and the installed skill.

Unattended setup (CI, provisioning, agents)

Every step is scriptable. There is no interactive-only path:

git clone https://github.com/YonasValentin/mcp-memory-graph.git
cd mcp-memory-graph
npm install && npm run build
npx mcp-memory-graph init --scope project --yes   # local: hooks + .mcp.json, no prompts
# or point at a shared self-hosted server instead:
# npx mcp-memory-graph init --remote https://memory.example.com --token-env MEMORY_MCP_TOKEN

Other MCP clients (Codex, Cursor, and more)

Claude Code gets the hooks; everyone else gets the same 51 tools, driven manually. The server is a standard MCP server, so any client works. A line in the client's rules file makes usage near-automatic.

Register the server. Example for Codex, in ~/.codex/config.toml (global) or .codex/config.toml (project, trusted only):

[mcp_servers.memory-graph]
command = "node"
args = ["/abs/path/to/mcp-memory-graph/dist/index.js"]
tool_timeout_sec = 180   # the first call downloads the ~30 MB model once; the 60s default can be tight

[mcp_servers.memory-graph.env]
MCP_MEMORY_DB_PATH = "/abs/path/to/.mcp-memory/memory.db"

# or a shared self-hosted server over HTTP (see Self-hosting below):
# url = "https://memory.example.com/mcp"
# bearer_token_env_var = "MEMORY_MCP_TOKEN"

Or codex mcp add memory-graph -- node /abs/path/to/mcp-memory-graph/dist/index.js. Cursor, Windsurf, and other clients use their own MCP config format, but the server command (node .../dist/index.js) and the HTTP option are the same.

Then nudge the agent in its instructions file (Codex: AGENTS.md; Cursor: project rules):

Before answering questions about architecture, decisions, patterns, or past
fixes, call memory_search on the memory-graph server first; store new
decisions, patterns, and fixes with memory_store.

Self-hosting and sharing a memory base

The server runs three ways, from a single-user cache to a knowledge base shared across many machines. All three are local-first: nothing leaves the machines you choose to run it on.

1. Local (single user), the default

npx mcp-memory-graph init registers a local stdio server plus the hooks. Memory lives in one SQLite file on your machine. Nothing else to run. Right choice for solo use.

2. Shared server (multiple machines or a group)

Run one server that many clients connect to over HTTP. Everyone shares the same memory base, live.

Start the server (pick one):

# Generate the shared token once and keep it; clients need the same value to connect
export MCP_AUTH_TOKEN=$(openssl rand -hex 32)
echo "$MCP_AUTH_TOKEN"

# From source: build the server (and the dashboard, if you want it) first
npm run build:all
MCP_BIND=0.0.0.0 npm run serve
# MCP at /mcp, REST at /api, dashboard at /, all on :3100

# Or with Docker (frontend included; publishes host port 3200 by default)
docker compose up -d

Set MCP_AUTH_TOKEN whenever the server is reachable beyond loopback. It is a shared bearer token, one secret for all clients. The server refuses to start unauthenticated on a non-loopback bind unless you set MCP_AUTH_OPTIONAL=1. Terminate TLS at a reverse proxy or tunnel for anything off-host.

Connect a client, one command per machine:

npx mcp-memory-graph init --remote https://memory.example.com --token-env MEMORY_MCP_TOKEN
export MEMORY_MCP_TOKEN=<the server token>     # in your shell or .env

For Claude Code this writes a project .mcp.json pointing at the shared server. The token is stored as an env-var reference ("Authorization": "Bearer ${MEMORY_MCP_TOKEN}"), so the committed .mcp.json never contains the secret. Non-Claude clients point at the same server through their own MCP config.

| Flag | Effect |
| --- | --- |
| --token-env <NAME> | Reference this env var for the token (default MEMORY_MCP_TOKEN) |
| --token <value> | Inline a literal token instead (avoid committing it) |
| --no-auth | Omit the auth header (loopback or trusted network only) |

In remote mode the local capture and recall hooks are not installed. The memory lives on the server, not in a local file the hooks could read. The agent uses memory_search and memory_store directly (the CLAUDE.md guidance is still written).

3. Git vault (async, version-controlled sharing)

Prefer your knowledge base in git, reviewed through pull requests, with no server to run? Export memories to plain Markdown and share the folder as a git repo:

npx mcp-memory-graph vault-init                    # make the vault a git repo (union merge driver + rebuild hook)
git add -A && git commit -m "memory snapshot" && git push
# collaborators, once after cloning:
#   npx mcp-memory-graph vault-init                # registers the union merge driver + post-merge hook in THEIR clone
# collaborators, thereafter: git pull && npx mcp-memory-graph rebuild

Each collaborator must run vault-init once in their own clone. The merge driver and post-merge rebuild hook live in local git config (.git/), not in the repo. A fresh clone without vault-init will hit raw conflict markers in .memory/graph.json on its first concurrent pull. Re-running vault-init is idempotent and does not clobber the committed sidecar.

Two recovery notes for team vaults:

  • After a merge you resolved by hand (the post-merge hook only fires on clean merges), memory rebuild can refuse with VaultIntegrityError because .memory/manifest.json is stale. Delete that file and re-run rebuild; it is derived state and regenerates.
  • Hand-edited a .md while your database has newer state? Import first (vault_sync or rebuild), then export (memory sync). A full export from a stale database overwrites vault files, including your hand edit.

Security notes

  • MCP_AUTH_TOKEN is a single shared secret, fine for a trusted group; rotate it by restarting the server with a new value. For per-key RBAC (one server, N keys, each pinned to a namespace set and an access-level ceiling) use memory keys create|list|revoke (schema v16). The legacy shared token still works and is checked first. See docs/MULTI-TENANCY.md.
  • Never commit a token. The --remote default keeps it in an env var by design.
  • Bind to 127.0.0.1 (the default) unless you front the server with a proxy that terminates TLS; then set MCP_BIND=0.0.0.0.

Building an org-wide AI brain? One server, a key per employee, an org chart the AI can traverse (people, teams, SOPs, and tools as typed graph nodes), with enforced who-sees-what. The recipe, built on existing primitives, is in docs/ENTERPRISE-BRAIN.md.

Configuration

Environment variables

Variable Default Description
MCP_MEMORY_DB_PATH ~/.mcp-memory/memory.db Database file location. The directory is created automatically.
MCP_MEMORY_MODEL Xenova/all-MiniLM-L6-v2 HuggingFace embedding model name. Must be an ONNX model compatible with Transformers.js.
MCP_MEMORY_DIMENSIONS 384 Embedding vector dimensions. Must match the model's output.
MCP_MEMORY_CONFIG_PATH ~/.mcp-memory/config.json Override location for the configuration file.

The full env reference (auth, rate limits, webhooks, vault, publish) is in docs/ENV.md.

Custom database location

claude mcp add memory-server --env MCP_MEMORY_DB_PATH=/path/to/project/.memory.db node /path/to/dist/index.js

Alternative embedding models

# Swap the embedding model (same 384 dimensions; drop-in for the existing index
# AFTER a re-embed; see the warning below)
claude mcp add memory-server \
  --env MCP_MEMORY_MODEL=Xenova/bge-small-en-v1.5 \
  --env MCP_MEMORY_DIMENSIONS=384 \
  node /path/to/dist/index.js

Model identity is recorded and enforced. The database remembers which embedding model built it (schema_meta.embedding_model). Starting the server with a different MCP_MEMORY_MODEL fails loudly instead of silently degrading every search (same dimension does not mean same vector space). To switch models: set the new model and run memory rebuild (re-embeds from the vault), or export and re-import.
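
The guard described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the server's code: the key/value layout of schema_meta, the check_model_identity helper, and the exception name are all hypothetical.

```python
import sqlite3


class EmbeddingModelMismatch(RuntimeError):
    """Raised when the configured model differs from the one that built the index."""


def check_model_identity(conn: sqlite3.Connection, configured_model: str) -> None:
    # Hypothetical key/value layout; the real schema_meta table may differ.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_meta (key TEXT PRIMARY KEY, value TEXT)"
    )
    row = conn.execute(
        "SELECT value FROM schema_meta WHERE key = 'embedding_model'"
    ).fetchone()
    if row is None:
        # First run: record which model builds this index.
        conn.execute(
            "INSERT INTO schema_meta (key, value) VALUES ('embedding_model', ?)",
            (configured_model,),
        )
        conn.commit()
    elif row[0] != configured_model:
        raise EmbeddingModelMismatch(
            f"index was built with {row[0]!r}, but {configured_model!r} is configured; "
            "re-embed before switching models"
        )


conn = sqlite3.connect(":memory:")
check_model_identity(conn, "Xenova/all-MiniLM-L6-v2")  # first run records the model
check_model_identity(conn, "Xenova/all-MiniLM-L6-v2")  # same model passes silently
```

Calling it again with Xenova/bge-small-en-v1.5 raises, even though both models produce 384-dimension vectors.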

Configuration file

The config file controls self-improvement behavior, hook settings, and per-project overrides. Resolution order: MCP_MEMORY_CONFIG_PATH env, then <cwd>/.mcp-memory/config.json (project-scope init writes this), then ~/.mcp-memory/config.json. Created by npx mcp-memory-graph init, or write it by hand:

{
  "defaults": {
    "scope": "project",
    "namespace": "auto"
  },
  "projects": [
    {
      "path": "~/Documents/MyApp",
      "namespace": "my-app",
      "watch": ["README.md", "docs/**/*.md"]
    }
  ],
  "consolidation": {
    "similarity_threshold": 0.85,
    "prune_after_days": 30,
    "min_importance_to_keep": 0.1,
    "max_operations": 100,
    "schedule": [
      { "hour": 11, "minute": 30 },
      { "hour": 16, "minute": 0 }
    ]
  },
  "hooks": {
    "extract_on_compact": false,
    "extract_on_session_end": false,
    "track_searches": true,
    "review_on_stop": true
  },
  "extraction": {
    "categories": ["decision", "pattern", "error_fix", "convention"],
    "min_confidence": 0.4
  }
}

Section Key Default Description
defaults scope "project" Default scope for new memories
defaults namespace "auto" Default namespace ("auto" derives from project directory name)
projects[] path Project root directory
projects[] namespace Namespace override for this project
projects[] watch Glob patterns for files to track for changes
consolidation similarity_threshold 0.85 Cosine similarity threshold for deduplication (0.5-1.0)
consolidation prune_after_days 30 Days before pruning low-quality memories
consolidation min_importance_to_keep 0.1 Minimum importance score to survive pruning
consolidation max_operations 100 Max operations per consolidation run
consolidation schedule [{ "hour": 3, "minute": 0 }] One or more { hour, minute } entries (24-hour). Re-run init after changing to regenerate the launchd plist.
hooks extract_on_compact false Mine transcript before context compression (regex-based, off by default)
hooks extract_on_session_end false Extract learnings when session ends (regex-based, off by default)
hooks track_searches true Log search hits and misses to search-log.jsonl
hooks review_on_stop true Spawn headless claude -p at session end to review the transcript and store learnings. Set false to disable without removing the hook.
extraction categories ["decision", "pattern", "error_fix", "convention"] Learning categories to extract
extraction min_confidence 0.4 Minimum confidence for extracted learnings
storage db_path scope-dependent SQLite file location (~/.mcp-memory/memory.db for user scope, <project>/.mcp-memory/memory.db for project scope). MCP_MEMORY_DB_PATH overrides.
vault path unset Obsidian vault root used by vault_sync, memory_export_vault, and rebuild when no explicit path is passed. MCP_VAULT_PATH and --vault <path> override.
vault write_through true Mirror memory writes out to the vault as .md files when a vault is configured. MCP_VAULT_WRITE_THROUGH=0 overrides.
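
The resolution order for the config file can be sketched as follows. The resolve_config_path helper is hypothetical, and it assumes an explicit MCP_MEMORY_CONFIG_PATH wins even when that file does not exist yet; the server may handle that case differently.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_config_path(env: dict, cwd: Path, home: Path) -> Optional[Path]:
    """Return the first config file in the documented resolution order."""
    override = env.get("MCP_MEMORY_CONFIG_PATH")
    if override:
        # Assumption: the env override is returned without an existence check.
        return Path(override)
    for candidate in (
        cwd / ".mcp-memory" / "config.json",   # project scope
        home / ".mcp-memory" / "config.json",  # user scope
    ):
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    home, project = Path(tmp) / "home", Path(tmp) / "project"
    for root in (home, project):
        (root / ".mcp-memory").mkdir(parents=True)
    (home / ".mcp-memory" / "config.json").write_text("{}")
    user_only = resolve_config_path({}, project, home)

    (project / ".mcp-memory" / "config.json").write_text("{}")
    project_wins = resolve_config_path({}, project, home)

    env_wins = resolve_config_path(
        {"MCP_MEMORY_CONFIG_PATH": "/tmp/custom.json"}, project, home
    )
```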

CLI commands

Command Description
npx mcp-memory-graph Start the MCP server on stdio (default)
npx mcp-memory-graph serve Start the HTTP server: MCP transport, REST API, web dashboard
npx mcp-memory-graph init Interactive setup wizard: hooks, config, nightly schedule (user scope). Add --yes/-y to run non-interactively
npx mcp-memory-graph init --scope project Setup for the current project only (creates .mcp.json and .claude/settings.json)
npx mcp-memory-graph uninstall Reverse init: remove hooks and schedule
npx mcp-memory-graph consolidate Run the dream cycle manually
npx mcp-memory-graph export-graph [--out <path>] [--scope <s>] [--namespace <n>] Write a committable, deterministic memory-graph.json for git sharing
npx mcp-memory-graph git-setup Install the .gitattributes entry and memory-union merge driver for conflict-free graph sharing
npx mcp-memory-graph merge-graphs <ours> <theirs> <out> Git union merge driver for memory-graph.json (invoked by git, not by hand)
npx mcp-memory-graph vault-init [--vault <path>] Make the vault a git repo: union merge driver, pull.rebase=false, post-merge and post-checkout rebuild hooks
npx mcp-memory-graph sync Export all valid memories plus the graph sidecar to the vault (.md files)
npx mcp-memory-graph rebuild [--vault <path>] Rebuild the SQLite index from the vault's .md files (collaborators run this after git pull)
npx mcp-memory-graph migrate Upgrade the database to the current schema version
npx mcp-memory-graph backup [--out <path>] WAL-safe online snapshot (retention: MCP_MEMORY_MAX_BACKUPS, default 10)
npx mcp-memory-graph keys create|list|revoke Per-key RBAC: mint, inspect, revoke API keys (namespace set plus access ceiling)

Tools reference

1. memory_store

Store a new memory. The vector embedding is generated automatically.

Parameter Type Required Default Description
content string Yes The text content to store
title string No Short title for the memory
scope enum No global¹ global, project, user, team, department
namespace string No ¹ Sub-scope (e.g., project name)
importance_score number No computed 0-1 manual importance override
agent_id string No MCP_AGENT_ID env Attribution for memory_attribution rollups
on_conflict enum No add add, supersede, skip: write-gate behavior on near-duplicates
document_type string No contract, policy, code, incident, decision, etc.
source string No Where this content came from
author string No Who created it
department string No legal, engineering, hr, sales, finance
tags string[] No Tags for categorization
access_level enum No internal public, internal, confidential, restricted
language string No en ISO 639-1 language code
metadata object No Domain-specific key-value pairs
expires_at string No ISO 8601 expiration date

¹ When omitted, a loaded config file's defaults.scope and defaults.namespace ("auto" = project directory name) apply first; the hardcoded fallback is global with no namespace.

Example prompt:

Store this memory with department=legal and tags=["compliance","gdpr"]:
"All customer data processing agreements must include a GDPR Article 28 addendum effective January 2025."

2. memory_search

Hybrid vector plus keyword search across stored memories.

How it works:

  1. Your query is embedded and compared against all stored vectors (semantic similarity).
  2. Your keywords are matched against memory text via FTS5 (exact matching).
  3. Both result lists merge using Reciprocal Rank Fusion.
  4. Optional temporal decay favors recent memories.
  5. Results get a confidence score and label.
  6. The access is recorded for quality scoring.

Parameter Type Required Default Description
query string Yes Natural language query or keywords
scope enum No Filter by scope
namespace string No Filter by namespace
department string No Filter by department
document_type string No Filter by document type
tags string[] No Filter: must contain ALL specified tags
access_level enum No Filter by access level
language string No Filter by language
limit number No 10 Max results (1-100)
offset number No 0 Pagination offset
search_mode enum No hybrid hybrid, vector, or keyword
temporal_decay object No {type: "exponential", half_life_days: 30} or {type: "linear", max_age_days: 365}
date_from string No Only memories after this date
date_to string No Only memories before this date
min_confidence number No Minimum confidence threshold (0-1)

Example prompts:

Search memories for "contract renewal notice requirements" in the legal department

Search memories for "authentication" with search_mode=keyword

Search memories for "deployment patterns" with temporal_decay={type:"exponential", half_life_days:60}

Each result includes the memory content and metadata, the combined RRF score, a normalized confidence (0-1), a confidence_level label (high at 0.7 and above, medium at 0.4 and above, low below that), and a match_type (hybrid, vector, or keyword).

The default detail_level: "summary" projection returns confidence_level but omits the numeric confidence and the full content, to save tokens. Pass detail_level: "full" when you need them.
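
As a rough illustration of two of the knobs above, the Python sketch below computes a decay weight from a temporal_decay object and maps a numeric confidence to its label. The thresholds come from the text above. The decay formulas are the standard ones implied by the parameter names, and how the server combines the weight with the RRF score is not shown here; both function names are hypothetical.

```python
def decay_multiplier(age_days: float, decay: dict) -> float:
    """Weight in [0, 1] for a memory that is age_days old."""
    if decay["type"] == "exponential":
        # The weight halves every half_life_days.
        return 0.5 ** (age_days / decay["half_life_days"])
    if decay["type"] == "linear":
        # The weight falls in a straight line and reaches zero at max_age_days.
        return max(0.0, 1.0 - age_days / decay["max_age_days"])
    raise ValueError(f"unknown decay type: {decay['type']}")


def confidence_level(confidence: float) -> str:
    """Map a normalized confidence (0-1) to the documented label."""
    if confidence >= 0.7:
        return "high"
    if confidence >= 0.4:
        return "medium"
    return "low"


exponential = {"type": "exponential", "half_life_days": 30}
linear = {"type": "linear", "max_age_days": 365}
month_old_weight = decay_multiplier(30, exponential)
year_old_weight = decay_multiplier(365, linear)
```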

3. memory_get

Retrieve a specific memory by ID. For ingested documents, optionally include all child chunks.

Parameter Type Required Default Description
id string Yes Memory UUID
include_chunks boolean No false Include child chunks for ingested documents

4. memory_update

Update an existing memory. If content changes, the embedding regenerates automatically. The previous version is saved to history.

Parameter Type Required Default Description
id string Yes Memory ID to update
content string No New content (triggers re-embedding)
title string No New title
metadata object No Replacement metadata
tags string[] No Replacement tags
expires_at string/null No New expiry, or null to remove
changed_by string No Who made this change

5. memory_delete

Delete memories by ID or by filter. At least one of id or filter is required.

Parameter Type Required Description
id string No Delete a specific memory
filter.scope enum No Delete all in scope
filter.namespace string No Delete all in namespace
filter.department string No Delete all in department
filter.before_date string No Delete older than date
filter.expired_only boolean No Only delete expired memories

6. memory_list

Browse memories with filtering, pagination, and sorting.

Parameter Type Default Description
scope enum Filter by scope
namespace string Filter by namespace
department string Filter by department
document_type string Filter by type
limit number 20 Max results (1-100)
offset number 0 Pagination offset
sort_by enum created_at created_at, updated_at, or title
sort_order enum desc asc or desc

7. memory_ingest

Ingest a full document: it is chunked by content type, each chunk is embedded, and everything is stored with parent-child relationships. Use this for large documents.

Parameter Type Default Description
content string Full document text (required)
title string Document title
content_type enum text Chunking strategy: text, markdown, code, legal, structured
chunk_size number 512 Target chunk size in characters (~4 chars per token)
chunk_overlap number 50 Overlap between chunks for context
source string Origin file or URL
document_type string Document classification
department string Department
author string Author
tags string[] Tags
metadata object Domain-specific metadata

Chunking by content type:

Type Strategy Splits on
text Paragraph Double newlines (\n\n)
markdown Heading-aware #, ##, ### headings
code Function-aware function, class, const, interface boundaries
legal Sentence Period, exclamation, question marks
structured Paragraph Double newlines (same as text)
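
For intuition, here is a minimal Python sketch of the paragraph strategy with chunk_size and chunk_overlap. It is an approximation, not the server's chunker: chunk_paragraphs is a hypothetical name, overlap is taken as the trailing characters of the previous chunk, and a single paragraph longer than chunk_size is left whole.

```python
def chunk_paragraphs(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> list[str]:
    """Pack paragraphs into chunks of roughly chunk_size characters."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if current and len(candidate) > chunk_size:
            chunks.append(current)
            # Carry the tail of the finished chunk forward for context.
            tail = current[-chunk_overlap:] if chunk_overlap > 0 else ""
            current = f"{tail}\n\n{paragraph}" if tail else paragraph
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "\n\n".join(["a" * 300, "b" * 300, "c" * 100])
chunks = chunk_paragraphs(sample)  # two chunks; the second starts with 50 carried-over chars
```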

8. memory_related

Find memories semantically related to a given one. Uses vector similarity, so it finds connections keyword search misses.

Parameter Type Default Description
id string Memory ID to find related for (required)
limit number 5 Max results (1-50)
min_similarity number Minimum similarity threshold (0-1)

9. memory_versions

View a memory's version history. Every update creates a version record.

Parameter Type Default Description
id string Memory ID (required)
limit number 10 Max versions (1-50)

10. memory_stats

Usage statistics about stored memories.

Parameter Type Description
scope enum Filter stats by scope
namespace string Filter stats by namespace
department string Filter stats by department

Returns totals for memories, documents, and chunks, breakdowns by scope, department, and type, storage size, and the expired count.

11. memory_export

Export current memory content as JSON for portability or migration. This is not a full backup: it serializes only currently live, top-level memories and omits edit history, the knowledge graph, condense-undo originals, ingested child chunks, and soft-forgotten rows. Embeddings are recomputed deterministically on import. For disaster recovery, copy the SQLite file instead (cp ~/.mcp-memory/memory.db ..., see the RUNBOOK).

Parameter Type Default Description
scope enum Filter export
namespace string Filter export
department string Filter export

Max 1000 records per export.

12. memory_import

Import memories from JSON. Each item is embedded and stored.

Parameter Type Default Description
data array Array of memory objects (required)
overwrite boolean false Overwrite existing IDs

13. vault_sync

Scan an Obsidian vault, parse the markdown, embed and store. See Obsidian Vault Integration below.

14. vault_status

Sync status for a vault: files synced, pending, changed, and the last sync time.

15. vault_search

Hybrid search scoped to one vault's memories.

By default this searches the namespace named after the vault's folder name. Memories exported from another namespace keep their original namespace in frontmatter. If a search over a freshly synced vault returns nothing, pass an explicit namespace (and/or scope) override.

16. memory_consolidate

The dream cycle: score, expire, prune, deduplicate, and detect knowledge gaps.

Parameter Type Required Default Description
scope enum No Limit consolidation to a scope
namespace string No Limit consolidation to a namespace
similarity_threshold number No 0.85 Cosine similarity for dedup (0.5-1.0)
prune_expired boolean No true Remove expired memories
prune_low_quality boolean No false Remove memories below min importance
dry_run boolean No false Preview changes without applying
max_operations number No 100 Cap on total operations per run

Five stages run in order: Score (recalculate importance), Expire (enforce expires_at), Prune (drop low-quality when enabled), Dedup (merge near-duplicates), Gaps (surface zero-result searches). Returns a report with counts per stage.

Example prompts:

Run a dream cycle consolidation with dry_run=true to preview what would change

Consolidate memories in namespace=my-project with similarity_threshold=0.9

Run consolidation with prune_low_quality=true to clean up unused memories
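
To make similarity_threshold concrete, here is a small Python sketch of a greedy near-duplicate pass over embeddings. It only reports what it would merge, much like dry_run=true. The greedy first-wins strategy, the function names, and the toy 2-D vectors are assumptions for illustration; the server's Dedup stage may group and merge differently.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicates(
    embeddings: dict[str, list[float]], similarity_threshold: float = 0.85
) -> tuple[list[str], dict[str, str]]:
    """Greedy pass: keep the first memory of each near-duplicate group."""
    kept: list[str] = []
    duplicates: dict[str, str] = {}  # duplicate id -> id of the memory it matches
    for memory_id, vector in embeddings.items():
        match = next(
            (k for k in kept
             if cosine_similarity(vector, embeddings[k]) >= similarity_threshold),
            None,
        )
        if match is None:
            kept.append(memory_id)
        else:
            duplicates[memory_id] = match
    return kept, duplicates


# Toy 2-D vectors; real embeddings have 384 dimensions.
toy = {"m1": [1.0, 0.0], "m2": [0.9, 0.1], "m3": [0.0, 1.0]}
kept, duplicates = find_duplicates(toy)
```

Raising the threshold toward 1.0 merges less; lowering it toward 0.5 merges more aggressively.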

17. memory_extract_learnings

Mine a session transcript for decisions, patterns, error fixes, and conventions using heuristic pattern matching. No external LLM needed.

Parameter Type Required Default Description
transcript string Yes Session transcript text to mine
scope enum No Scope for extracted memories
namespace string No Namespace for extracted memories
department string No Department for extracted memories
tags string[] No Additional tags
source string No Source attribution
categories enum[] No all Filter to decision, pattern, error_fix, convention
auto_store boolean No true Automatically store extracted learnings

Extraction looks for decision language ("we decided", "the fix was"), pattern language ("always use", "never do"), error fixes ("the problem was", "solved by"), and conventions ("our convention is", "standard practice"). Each hit is deduplicated against existing memories and stored with a lower initial confidence.
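
A stripped-down Python sketch of this kind of heuristic matching is shown below. It uses only the trigger phrases quoted above, grouped as the text groups them; the sentence splitting, the first-match-wins rule, and the extract_learnings name are assumptions, and the sketch skips deduplication and confidence scoring.

```python
import re

# Trigger phrases quoted in the description above; the real pattern set may be larger.
PATTERNS = {
    "decision": re.compile(r"\b(we decided|the fix was)\b", re.IGNORECASE),
    "pattern": re.compile(r"\b(always use|never do)\b", re.IGNORECASE),
    "error_fix": re.compile(r"\b(the problem was|solved by)\b", re.IGNORECASE),
    "convention": re.compile(r"\b(our convention is|standard practice)\b", re.IGNORECASE),
}


def extract_learnings(transcript: str, categories=None) -> list[dict]:
    """Return one candidate learning per sentence that matches a trigger phrase."""
    wanted = set(categories) if categories else set(PATTERNS)
    learnings = []
    # Naive sentence split on '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", transcript.strip()):
        for category, pattern in PATTERNS.items():
            if category in wanted and pattern.search(sentence):
                learnings.append({"category": category, "content": sentence})
                break  # first matching category wins
    return learnings


transcript = (
    "We decided to keep sessions in Redis. "
    "The problem was a stale cache key. "
    "Lunch was good."
)
learnings = extract_learnings(transcript)
```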

18-42. Graph, Agent-OS, vault round-trip, and governance tools

Parameters for the remaining tools are validated by Zod schemas in src/schemas/; each registration's full description lives in src/server.ts.

# Tool Purpose
18 memory_tiers MemGPT-style hot / recall / archival tier distribution plus the hot working set
19 memory_export_vault Write memories out to an Obsidian vault as .md files with YAML frontmatter (reverse of vault_sync)
20 memory_canvas Export the graph as a JSON Canvas 1.0 .canvas for Obsidian
21 memory_manifest Lightweight content-free index (titles, types, tags, scores) to discover what exists
22 memory_graph Query the knowledge graph: entities, relationships, linked memories, multi-hop traversal (depth 1-3)
23 memory_extract_entities Store LLM-extracted entities and relationships for a memory
24 memory_condense Apply agent-generated summaries to condense old memories (original preserved)
25 memory_restore Restore a condensed memory to its original content and re-embed
26 memory_query Answer a question with a tight, token-budgeted subgraph instead of flooding context
27 core_memory_get Read the pinned, always-in-context core-memory block for a (scope, namespace)
28 core_memory_append Append to the core-memory block (refused if it would overflow char_limit)
29 core_memory_replace Replace text in the core-memory block (used to update or compact it)
30 memory_reflect Generative-Agents-style reflection: gather material, or store a synthesized insight
31 memory_communities GraphRAG community detection over the entity graph for corpus-level themes
32 memory_template Fetch a structured note scaffold per document type
33 memory_session_note Per-session "daily note" (appends to one memory per session_id)
34 memory_attribution Roll up how many valid memories each agent_id wrote
35 memory_questions "Questions to ask" digest: ambiguous links, under-documented entities, orphans
36 memory_forget GDPR-grade forget: soft-delete (recoverable) by default, or hard erase-after-export
37 memory_history Point-in-time bi-temporal timeline plus edit-version history for one memory
38 memory_unlinked_mentions Entity names mentioned in memory text with no graph edge yet (suggested links)
39 memory_query_structured Exact metadata filter query over top-level memories (no semantic ranking)
40 memory_version_diff Line-level diff between two stored versions of a memory
41 memory_version_restore Roll a memory back to a previous version (snapshots the current one first)
42 memory_verify Verify the signed provenance envelope of memories (ed25519 over content_hash plus origin): per-memory ok/unsigned/content_mismatch/bad_signature/untrusted plus a summary. Opt-in signing via MCP_SIGN_MEMORIES; multi-machine allowlist via MCP_TRUSTED_PUBKEYS / trusted_pubkeys

43-50. Active infrastructure and typed shapes

# Tool Purpose
43 memory_webhook Manage the event bus (gated by MCP_WEBHOOKS): register, list, delete SSRF-validated outbound targets, or dispatch the durable, HMAC-signed delivery queue (retry, circuit breaker, dead letter). Mutations emit created/updated/superseded/deleted/forgotten events
44 memory_insights Advisor digest: unresolved conflicts, stale memories, most-contradicted facts, evidence-less decisions
45 memory_health Store health roll-up: live/retired/stale counts, aging buckets, unresolved conflicts, webhook delivery health
46 memory_revalidate Change propagation: list stale memories, preview a change's blast radius (dry run), or confirm a memory is current
47 memory_session_state Resumable "where was I" session state, save and resume (versioned)
48 memory_expertise Per-user expertise profile: observe a topic, get the profile
49 memory_export_dataset Export learnings and reflections as JSONL training pairs (pairs/chatml/alpaca) for fine-tuning
50 memory_lesson Capture a structured lesson or incident in one call: fills the matching section template (incident → Symptom/Root Cause/Fix/Prevention; lesson → What/Why it matters/How to apply) from your field values and stores it through the normal deduped write path
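
For receivers of memory_webhook deliveries, verifying an HMAC signature generally looks like the Python sketch below. Everything specific here is an assumption: the source states only that deliveries are HMAC-signed, so the hash function (SHA-256), the hex encoding, the payload shape, and the secret are illustrative. Check the project docs for the actual header and format.

```python
import hashlib
import hmac
import json


def sign_payload(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_payload(secret: bytes, body: bytes, signature: str) -> bool:
    # compare_digest runs in constant time, so it does not leak where a mismatch occurs.
    return hmac.compare_digest(sign_payload(secret, body), signature)


secret = b"example-webhook-secret"  # illustrative value only
# Hypothetical payload shape; "created" is one of the documented event names.
body = json.dumps({"event": "created", "memory_id": "example-id"}).encode()
signature = sign_payload(secret, body)
```

Verify against the raw bytes as received, before any JSON parsing, since re-serializing can change the byte sequence.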

Architecture

System overview

Claude Code ──stdio──> MCP Memory Graph
                            │
                    ┌───────┴───────┐
                    │               │
              Transformers.js   SQLite DB
              (embeddings)    (~/.mcp-memory/memory.db)
                                    │
                       ┌────────────┼────────────┐
                       │            │            │
                   memories    memories_fts  memories_vec
                   (data +     (FTS5 index)  (vec0 index)
                    scores)
                       │
              ┌────────┼────────┐
              │        │        │
        memory_    memory_    ingest_
        versions   access_    source_
                   log        tracking

Claude Code Hooks (opt-in)
    │
    ├── SessionStart ──> memory_stats (status check)
    ├── PostToolUse ───> search-log.jsonl (hit/miss tracking)
    ├── PreCompact ────> learning extraction (disabled by default)
    └── Stop ──────────> spawn detached `claude -p` headless review
                              │
                              └─> --allowedTools mcp__memory-server__memory_store
                                  Claude reviews transcript → memory_store calls

Nightly Schedule (opt-in)
    └── 3:00 AM ───────> memory_consolidate (dream cycle)

How hybrid search works

Query: "contract renewal notice"
         │
    ┌────┴────┐
    │         │
 Embed     Tokenize
    │         │
    ▼         ▼
 sqlite-vec  FTS5
 (semantic)  (keyword)
    │         │
    │  rank   │  rank
    │  1: A   │  1: A
    │  2: C   │  2: B
    │  3: B   │  3: D
    │         │
    └────┬────┘
         │
   Reciprocal Rank Fusion
   RRF(d) = Σ 1/(60 + rank)
         │
         ▼
   [A: 0.033, B: 0.032, C: 0.016, D: 0.016]
         │
   Temporal Decay (optional)
         │
   Confidence Scoring
         │
   Access Tracking (record hit)
         │
         ▼
   Final ranked results
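
The fusion step in the diagram can be reproduced with a short Python sketch. It applies the formula shown, RRF(d) = Σ 1/(60 + rank), to the two rankings from the diagram. The reciprocal_rank_fusion name is hypothetical, and the sketch leaves out temporal decay, confidence scoring, and access tracking.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_ranking = ["A", "C", "B"]   # semantic (sqlite-vec) order from the diagram
keyword_ranking = ["A", "B", "D"]  # keyword (FTS5) order from the diagram
fused = reciprocal_rank_fusion([vector_ranking, keyword_ranking])
```

Documents that appear in both lists collect two contributions, which is why they outrank documents found by only one retriever.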

Database schema

The SQLite database is at schema version 18, with automatic forward migration from any earlier version. The core tables:

  • memories: all memory data, TEXT primary key (UUIDs), parent-child support for document chunks, plus access_count, last_accessed_at, importance_score, and confidence_score.
  • memories_fts: FTS5 virtual table for keyword search with BM25 ranking, synced with the memories table.
  • memories_vec: vec0 virtual table for vector search. 384-dimension float32 embeddings with scope and namespace metadata for pre-filtering.
  • memory_versions: version history for every change.
  • memory_access_log: every search, get, and related-memory access, with timestamps and query context.
  • ingest_source_tracking: ingested files, for change detection on re-ingestion.

Later schema versions add the knowledge-graph tables (entities, links, conflicts, communities), webhooks, session state, and the RBAC api_keys table. Every mutation keeps the three search tables (memories, memories_fts, and memories_vec) in sync atomically inside a SQLite transaction; the repository.ts layer enforces this, and nothing else touches the tables directly.
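
The all-or-nothing write across memories, memories_fts, and memories_vec can be illustrated with Python's built-in sqlite3 module. This is a sketch of the transaction pattern only: plain tables stand in for the FTS5 and vec0 virtual tables, and the store_memory helper and column layout are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Plain tables stand in for the FTS5 and vec0 virtual tables.
conn.executescript("""
    CREATE TABLE memories (id TEXT PRIMARY KEY, content TEXT NOT NULL);
    CREATE TABLE memories_fts_standin (id TEXT PRIMARY KEY, content TEXT NOT NULL);
    CREATE TABLE memories_vec_standin (id TEXT PRIMARY KEY, embedding BLOB NOT NULL);
""")


def store_memory(conn, memory_id, content, embedding):
    """Write all three rows or none of them."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("INSERT INTO memories VALUES (?, ?)", (memory_id, content))
        conn.execute("INSERT INTO memories_fts_standin VALUES (?, ?)", (memory_id, content))
        conn.execute("INSERT INTO memories_vec_standin VALUES (?, ?)", (memory_id, embedding))


store_memory(conn, "m1", "first memory", b"\x00\x01")
try:
    # embedding=None violates NOT NULL on the third insert, so the first two roll back.
    store_memory(conn, "m2", "second memory", None)
except sqlite3.IntegrityError:
    pass

counts = [
    conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("memories", "memories_fts_standin", "memories_vec_standin")
]
```

After the failed second write, each table still holds exactly one row, so search can never see a memory that is missing from one of the indexes.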

Project layout

src/
├── index.ts        # Entry point (stdio transport)
├── server.ts       # All 51 tool registrations
├── config/         # Config file loading + validation
├── db/             # Connection, schema, migrations, repository (three-table sync)
├── embeddings/     # Embedding providers (Transformers.js, registry, Ollama)
├── search/         # Hybrid search, reranker, temporal decay, scoring
├── chunking/       # Per-content-type chunking strategies
├── graph/          # Entities, links, PageRank, communities
├── vault/          # Obsidian round-trip, write-through, bookkeeping
├── tools/          # One handler per MCP tool
├── api/            # REST routes + security middleware
├── events/         # Webhook bus (SSRF guard, HMAC, retry)
├── cli/            # init, serve, vault, backup, keys, migrate, ...
├── hooks/          # Claude Code lifecycle hooks
└── schemas/        # Zod schemas for every tool input

Use cases by department

Engineering:

Store memory: "We chose event sourcing over CRUD for the order service because
we need full audit trail and the ability to replay events for debugging.
ADR-042, decided 2026-03-15."
department=engineering, document_type=decision, tags=["architecture","event-sourcing"]

Legal:

Ingest this contract template with content_type=legal, department=legal,
document_type=contract, tags=["template","nda","standard"]

Finance:

Store memory: "Q4 2025 revenue recognition policy change: SaaS contracts
over 12 months now recognized ratably per ASC 606 guidance."
department=finance, document_type=policy, tags=["revenue-recognition","asc-606"]

HR:

Ingest the employee handbook with department=hr, content_type=text,
document_type=policy, tags=["handbook","onboarding"]

Sales:

Store memory: "When prospect objects on price vs CompetitorX, lead with
our 99.9% uptime SLA and dedicated support. This converted 3 deals in Q1."
department=sales, document_type=pattern, tags=["objection-handling","pricing","competitorx"]

Obsidian Vault Integration

Point the server at a vault folder and every markdown file becomes a searchable memory, with frontmatter, tags, and wiki-links extracted. No Obsidian app needed; it reads the files straight from disk.

Tool Description
vault_sync Scan vault, parse files, embed and store. Incremental (mtime-based).
vault_status Sync status: files synced, pending, changed, last sync time.
vault_search Hybrid search scoped to a vault's memories.

What gets extracted:

Obsidian feature Memory field
YAML frontmatter title: title
YAML frontmatter tags: [...] tags (merged with inline)
YAML frontmatter author: author
YAML frontmatter (all fields) metadata.frontmatter
Inline #tags in content tags (merged with frontmatter)
[[wiki-links]] metadata.links array
File path relative to vault source
Vault directory name namespace

Usage examples:

Sync my Obsidian vault at ~/Documents/my-vault

Check vault sync status for ~/Documents/my-vault

Search my vault for "meeting action items about hiring"

Sync vault but only the notes/ and projects/ folders:
  vault_sync with include_patterns=["notes/**", "projects/**"]

Force re-sync everything (ignore modification times):
  vault_sync with force=true

vault_sync parameters:

Parameter Type Default Description
vault_path string Absolute path to vault directory (required)
chunk_size number 1024 Target chunk size for large files
chunk_overlap number 50 Overlap between chunks
force boolean false Re-sync all files regardless of mtime
include_patterns string[] Only sync matching globs (e.g., ["notes/**"])
exclude_patterns string[] Skip matching globs (e.g., ["templates/**"])

How sync works: it scans the vault recursively for .md files (skipping .obsidian/, .trash/, .git/), compares modification times against the last sync, extracts frontmatter, wiki-links, and tags from new or changed files, embeds, and stores. Deleted files have their memories removed. Files larger than the chunk size are split with markdown-aware chunking. A second sync of an unchanged vault takes under a millisecond.
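
The change-detection part of that flow can be sketched in Python as below. It covers only the scan and mtime comparison, not parsing, embedding, or storing. The plan_sync helper and the shape of last_synced (relative path to recorded mtime) are hypothetical.

```python
import os
import tempfile
from pathlib import Path

SKIPPED_DIRS = {".obsidian", ".trash", ".git"}


def plan_sync(vault: Path, last_synced: dict[str, float]) -> dict[str, list[str]]:
    """Compare .md files on disk with the mtimes recorded at the last sync."""
    on_disk: dict[str, float] = {}
    for root, dirs, files in os.walk(vault):
        # Prune skipped directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if d not in SKIPPED_DIRS]
        for name in files:
            if name.endswith(".md"):
                path = Path(root) / name
                on_disk[path.relative_to(vault).as_posix()] = path.stat().st_mtime
    return {
        "new": sorted(p for p in on_disk if p not in last_synced),
        "changed": sorted(
            p for p in on_disk if p in last_synced and on_disk[p] > last_synced[p]
        ),
        "deleted": sorted(p for p in last_synced if p not in on_disk),
    }


with tempfile.TemporaryDirectory() as tmp:
    vault = Path(tmp)
    (vault / "notes").mkdir()
    (vault / ".obsidian").mkdir()
    (vault / "notes" / "a.md").write_text("# A")
    (vault / ".obsidian" / "cache.md").write_text("ignored")
    # Pretend the last sync recorded a file that has since been deleted.
    plan = plan_sync(vault, {"notes/gone.md": 0.0})
```

On an unchanged vault all three lists come back empty, which is why a repeat sync has almost nothing to do.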

Security and privacy

  • No network calls after the one-time model download (cached locally).
  • No telemetry, no analytics, no tracking.
  • Hooks are opt-in. They are only installed when you run npx mcp-memory-graph init.
  • The nightly schedule is opt-in too, and removed by npx mcp-memory-graph uninstall.
  • Everything is one SQLite file: easy to back up, move, or delete.
  • access_level metadata (public, internal, confidential, restricted) for organizational awareness.
  • Data never leaves your machine.

Backup:

# WAL-safe online snapshot with retention
npx mcp-memory-graph backup

# Or a simple file copy
cp ~/.mcp-memory/memory.db ~/.mcp-memory/memory.db.backup

Reset:

# Delete the database to start fresh
rm ~/.mcp-memory/memory.db

Nightly consolidation

When installed via npx mcp-memory-graph init, a nightly job runs all five dream-cycle stages plus access-log rotation (entries older than 90 days are dropped).

On macOS, a launchd plist is created at ~/Library/LaunchAgents/com.mcp-memory.consolidate.plist, scheduled for 3:00 AM. On Linux, init prints a cron suggestion:

# Add to crontab -e
0 3 * * * /usr/local/bin/npx mcp-memory-graph consolidate

Run it manually any time:

npx mcp-memory-graph consolidate

Limitations

  • Scale ceiling. Vector search is an exact scan: 9.1 ms p95 at 10K vectors, about 30 ms at 50K, and latency keeps growing roughly linearly with the vector count. It stays comfortable into the low hundreds of thousands; past that you want a dedicated ANN index, which this server does not have yet.
  • English-optimized. The default MiniLM model is English-only in practice; cross-language matching is weak. A multilingual model can be configured via MCP_MEMORY_MODEL (with a rebuild), but the shipped benchmarks only validate the default.
  • First-call cold start. Three to five seconds on first use while the embedding model loads. Cached after that.
  • Heuristic extraction. memory_extract_learnings uses pattern matching, not an LLM. It catches common phrasings and misses subtle ones. (The Stop hook's claude -p review is the LLM-quality path.)
  • One process. RBAC keys and revocation live in the server process. For horizontal scale you shard tenants across processes or give each tenant their own database file.

Roadmap

What's actually next, in rough order:

  • Multilingual embeddings, opt-in. Ship a multilingual ONNX model option (the embedder registry and the model-identity guard already exist, so a swap is safe and loud).
  • Office document ingestion. PDF, DOCX, XLSX, and friends as an ingest mode, with local extraction only.
  • Vault file watcher. Auto-rebuild on .md changes instead of manual rebuild.
  • as_of content reconstruction. Point-in-time queries currently reconstruct validity (which facts were live); reconstructing the content of edited memories at that instant is the remaining half.
  • ANN index for corpora past a few hundred thousand vectors.
  • Windows test suite port. The server runs on Windows, but the test suite carries POSIX path assumptions; the Windows CI leg is non-blocking until that's done.

Tech stack

Component Package Purpose
MCP SDK @modelcontextprotocol/sdk ^1.29 Model Context Protocol server framework
Embeddings @huggingface/transformers ^3.8 Local ONNX model inference in Node.js
Database better-sqlite3 ^12.10 Synchronous SQLite with native bindings
Vector search sqlite-vec 0.1.10-alpha.4 vec0 virtual table for KNN search
Validation zod ^3.25 Schema validation for tool inputs
TypeScript typescript ^5 Strict mode, ES2022 target
Frontend React 19, Vite, Tailwind CSS v4 Web dashboard SPA
UI components shadcn/ui Accessible component primitives
Fuzzy search fuse.js ^7 Client-side autocomplete suggestions
Graph viz d3-force, d3-zoom, d3-drag Knowledge graph layout

License

Source-available, not open source. Licensed under the PolyForm Noncommercial License 1.0.0: free for any noncommercial purpose (personal projects, hobby, study, research, charitable, educational, public-research, and government use). Commercial use requires a paid license; see COMMERCIAL.md.

If you're unsure whether your use counts as commercial, check the safe harbors in the license or just ask: yonasmougaard@gmail.com.

Keywords

MCP memory server · Model Context Protocol · Claude Code memory · persistent AI memory · LLM long-term memory · AI agent memory · local-first memory · $0/token memory · hybrid vector + keyword search · semantic search · knowledge graph · bi-temporal memory · HippoRAG / Personalized PageRank · cross-encoder reranking · RAG memory · SQLite vector database · sqlite-vec · FTS5 / BM25 · local embeddings (all-MiniLM-L6-v2, Transformers.js) · Obsidian vault sync · JSON Canvas · GDPR forget · signed provenance · self-hosted memory.

Also searched as: a self-hosted, privacy-first alternative to mem0, Zep, Letta, Cognee, and Supermemory · long-term memory for Claude / Cursor / Codex · an Obsidian-backed knowledge base for AI agents · a local knowledge-graph memory that never leaves your machine.