Claude Code & MCP agents
Give your coding agent a memory.
Mnemix is the memory layer for MCP agents: Claude Code, Codex, Cursor, Cline, Gemini CLI, or any other MCP host. Voice is the wedge; this is the umbrella. By the end of this guide, your agent recalls what it learned in past sessions before it acts and records what it learns as it works, so session 2 isn't a cold start.
Get an API key
Mnemix is in curated beta. You get a key in one of two ways: with an invite code (create an account in the invite-gated flow, then generate a key in the dashboard), or as a concierge-provisioned key from your welcome email. Instant self-serve is in flight. Keys look like sk_live_….
export MNEMIX_API_KEY="sk_live_xxxxxxxxxxxxxxxx"
1 · Add the remote MCP server
Mnemix runs a remote MCP server, so there is nothing to install or clone. One command registers it with Claude Code:
claude mcp add --transport http mnemix https://mcp.mnemix.ai/mcp \
  --header "Authorization: Bearer $MNEMIX_API_KEY"
Or drop this into any MCP host config (Cursor, Cline, or your own):
{
  "mcpServers": {
    "mnemix": {
      "type": "http",
      "url": "https://mcp.mnemix.ai/mcp",
      "headers": { "Authorization": "Bearer sk_live_xxxxxxxxxxxxxxxx" }
    }
  }
}
| Tool | When | What it does |
|---|---|---|
| recall_and_enrich | before a voice turn | Returns caller identity, memory, and enrichment for a phone number. |
| calls_end | after a completed call | Writes the completed call transcript and outcome back for future recall. |
| caller_lookup | when you need a known-caller profile | Reads the caller profile for a phone number without triggering enrichment. |
On the frozen public voice surface, the stable identity is the caller's phone_number. Reuse the same E.164 phone number across recall_and_enrich, calls_end, and caller_lookup; keep session_id unique per call.
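Because every tool keys on the same phone number, it helps to normalize the number once and mint a fresh session ID per call. The following is a minimal sketch, not part of the Mnemix API: `normalize_phone` and `new_session_id` are hypothetical helper names, the regex is the usual E.164 shape check (a plus sign, then up to 15 digits with a non-zero first digit), and the `call_` prefix simply mirrors the example IDs in this guide.

```python
import re
import uuid

# E.164 shape: "+" followed by 2 to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"^\+[1-9]\d{1,14}$")


def normalize_phone(raw: str) -> str:
    """Strip common separators, then confirm the result looks like E.164."""
    cleaned = re.sub(r"[\s\-().]", "", raw)
    if not E164_PATTERN.fullmatch(cleaned):
        raise ValueError(f"not an E.164 phone number: {raw!r}")
    return cleaned


def new_session_id() -> str:
    """Return a fresh ID for one call (the 'call_' prefix is an assumption)."""
    return f"call_{uuid.uuid4().hex}"
```

Normalizing at the edge means `recall_and_enrich`, `calls_end`, and `caller_lookup` all see the identical string, so a number typed as `+1 (555) 123-4567` and one stored as `+15551234567` resolve to the same caller.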
2 · The loop — recall before the first turn, write back after hangup
For the frozen public voice surface, the pattern is simple: use recall_and_enrich before your agent speaks, then calls_end once the call is complete.
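The ordering can be sketched in a few lines of Python. This is an illustration under assumptions rather than a Mnemix client: `call_tool` stands in for whatever function your MCP host exposes to invoke a tool by name, and `converse` stands in for your own conversation loop, which is assumed to return the transcript, the duration in seconds, and an outcome value.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical shapes: your MCP client and your conversation loop supply these.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]
Conversation = Callable[[Dict[str, Any]], Tuple[List[Dict[str, Any]], int, str]]


def handle_call(
    call_tool: ToolCaller,
    phone_number: str,
    session_id: str,
    converse: Conversation,
) -> Dict[str, Any]:
    """Recall first, talk second, write back once."""
    # Fetch the packet before the agent says anything.
    packet = call_tool("recall_and_enrich", {
        "phone_number": phone_number,
        "trigger": "answered",
        "session_id": session_id,
    })
    # The conversation sees the recalled packet and reports how the call went.
    transcript, duration_s, outcome = converse(packet)
    # One write-back after hangup, reusing the same phone number and session ID.
    return call_tool("calls_end", {
        "phone_number": phone_number,
        "session_id": session_id,
        "transcript": transcript,
        "duration_s": duration_s,
        "outcome": outcome,
    })
```

Passing `call_tool` in as a parameter keeps the sketch independent of any particular MCP host and makes the ordering easy to check with a stub.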
Rule A — recall before the first turn
Before your voice agent says anything, call recall_and_enrich and read the returned caller, memory, and enrichment packet before you act.
// Tool: recall_and_enrich — before the first voice turn
{
  "phone_number": "+15551234567",
  "trigger": "answered",
  "session_id": "call_demo1"
}
Rule B — end the call with a valid write-back
After hangup, call calls_end once with the transcript, duration_s, and a valid outcome enum so the next call has an updated caller profile and memory summary.
// Tool: calls_end — once after hangup
{
  "phone_number": "+15551234567",
  "session_id": "call_demo1",
  "transcript": [
    { "role": "user", "text": "I'd like a callback tomorrow morning.", "ts_ms": 1719861600000 },
    { "role": "agent", "text": "Done — we'll call tomorrow morning.", "ts_ms": 1719861608000 }
  ],
  "duration_s": 84,
  "outcome": "callback_requested"
}
Keep the write-back honest. Send the real call transcript and the closest matching outcome enum. If you only need a read-only caller profile later, use caller_lookup instead of replaying enrichment.
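One way to catch a malformed write-back before it leaves your process is to assemble the payload through a small builder. The sketch below is a local sanity check only: `build_calls_end` is a hypothetical helper, the required turn keys are inferred from the example above, and the ordering and non-negative duration checks are assumptions rather than documented server rules. The set of valid outcome values is not listed in this guide, so the sketch only checks that one was supplied.

```python
from typing import Any, Dict, List

# Keys each transcript turn carries in the example payload above.
REQUIRED_TURN_KEYS = {"role", "text", "ts_ms"}


def build_calls_end(
    phone_number: str,
    session_id: str,
    transcript: List[Dict[str, Any]],
    duration_s: int,
    outcome: str,
) -> Dict[str, Any]:
    """Assemble a calls_end payload after basic local checks (assumptions)."""
    for index, turn in enumerate(transcript):
        missing = REQUIRED_TURN_KEYS - turn.keys()
        if missing:
            raise ValueError(f"turn {index} is missing {sorted(missing)}")
    timestamps = [turn["ts_ms"] for turn in transcript]
    if timestamps != sorted(timestamps):
        raise ValueError("transcript turns are not in chronological order")
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    if not outcome:
        raise ValueError("outcome is required")
    return {
        "phone_number": phone_number,
        "session_id": session_id,
        "transcript": transcript,
        "duration_s": duration_s,
        "outcome": outcome,
    }
```

The builder never rewrites or pads the transcript; it either passes the real turns through unchanged or raises, which keeps the write-back honest.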
3 · The reader rule (recommended default)
Retrieval only helps if the agent reasons over the packet instead of abstaining. Add this block to your system prompt / CLAUDE.md. It's the reader policy Mnemix tuned against its own eval set — the common failure mode is an agent that has the answer in the packet but abstains because it needs to combine two facts. This rule tells it not to.
## Using Mnemix memory
When you call the public recall surface, you get a packet mixing caller identity, extracted facts, and conversation history.
Reason over ALL of it: combine multiple items, do date/time arithmetic, and connect information stated across
different sessions to derive the answer. If items conflict or one updates another, prefer the MOST RECENT
information. Act on the recalled context directly and concisely. Treat memory as absent ONLY if the answer
genuinely cannot be determined from the packet — do NOT ignore it merely because it requires combining facts
or light inference.
4 · Verify the public voice flow yourself
Don't trust the docs — prove the frozen public surface. Run a recall, close the call with a valid transcript and outcome, then fetch the caller profile back over plain HTTP — no agent wrapper required:
# 1) recall a caller
curl -sS -X POST https://mcp.mnemix.ai/v1/recall_and_enrich -H "Authorization: Bearer $MNEMIX_API_KEY" -H "Content-Type: application/json" -d '{"phone_number":"+15551234567","trigger":"answered","session_id":"call_demo1"}'
# 2) write the completed call back
curl -sS -X POST https://mcp.mnemix.ai/v1/calls/end -H "Authorization: Bearer $MNEMIX_API_KEY" -H "Content-Type: application/json" -d '{"phone_number":"+15551234567","session_id":"call_demo1","transcript":[{"role":"user","text":"I need a callback tomorrow morning.","ts_ms":1719861600000},{"role":"agent","text":"Done — we will call tomorrow morning.","ts_ms":1719861608000}],"duration_s":84,"outcome":"callback_requested"}'
# 3) read the caller profile back
curl -sS "https://mcp.mnemix.ai/v1/caller/%2B15551234567" -H "Authorization: Bearer $MNEMIX_API_KEY"
Is
Persistent caller memory for an MCP agent — recall_and_enrich before the first turn, calls_end after hangup, and caller_lookup when you need a read-only profile.
In flight
Instant self-serve keys; published latency and a reproducible accuracy harness. We don't quote numbers we haven't published.
Isn't
An agent framework, a vector DB you run, or a context-window manager. Mnemix is the memory API — you keep your agent runtime.
Latency is a design target — Mnemix is designed for sub-300ms voice recall — not a published benchmark. Verify the loop on your own account with the check above.
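If you would rather script that check than paste curl commands, the same three requests can be built with Python's standard library. This is a minimal sketch under assumptions: the endpoint paths and payloads are copied from the curl commands in this guide, the helper names (`post_request`, `caller_profile_request`, `send`, `verify_loop`) are hypothetical, and the responses are assumed to be JSON.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Dict

BASE_URL = "https://mcp.mnemix.ai/v1"  # taken from the curl commands in this guide


def post_request(path: str, payload: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Build an authenticated JSON POST without sending it."""
    return urllib.request.Request(
        f"{BASE_URL}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def caller_profile_request(phone_number: str, api_key: str) -> urllib.request.Request:
    """Build the profile GET; the leading '+' must be percent-encoded as %2B."""
    encoded = urllib.parse.quote(phone_number, safe="")
    return urllib.request.Request(
        f"{BASE_URL}/caller/{encoded}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request: urllib.request.Request) -> Any:
    """Send a request and decode the body, assuming the response is JSON."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def verify_loop(api_key: str, phone_number: str, session_id: str) -> Any:
    """Recall, write back a short call, then read the caller profile."""
    send(post_request("recall_and_enrich", {
        "phone_number": phone_number,
        "trigger": "answered",
        "session_id": session_id,
    }, api_key))
    send(post_request("calls/end", {
        "phone_number": phone_number,
        "session_id": session_id,
        "transcript": [
            {"role": "user", "text": "I need a callback tomorrow morning.", "ts_ms": 1719861600000},
            {"role": "agent", "text": "Done. We will call tomorrow morning.", "ts_ms": 1719861608000},
        ],
        "duration_s": 84,
        "outcome": "callback_requested",
    }, api_key))
    return send(caller_profile_request(phone_number, api_key))
```

Call `verify_loop(os.environ["MNEMIX_API_KEY"], "+15551234567", "call_demo1")` (after `import os`) to run all three steps against your own account. Building the requests separately from sending them lets you inspect the URL and headers before anything goes over the network.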