# NerdCabalMCP
A modular MCP server implementing 14 specialized AI agents for research operations, security, design, and organizational governance.
NerdCabalMCP is a Model Context Protocol (MCP) server that provides a co-scientist platform for AI-assisted research, operations, and creative work. Think of it as your personal team of 14 specialized AI experts, each with deep domain knowledge and the ability to collaborate seamlessly.
Traditional AI tools give you one-size-fits-all assistants. NerdCabalMCP gives you granular control over specialized agent personas, so you can route each task to the expert best suited for it.
### llm-rubric-architect

**Role**: Evaluation Framework Designer
**Expertise**: Creates comprehensive rubrics for LLM evaluation, benchmark design, and quality criteria
Use Cases:
Example:
{
"tool": "llm-rubric-architect",
"task": "Create a rubric for evaluating code generation models",
"criteria": ["correctness", "efficiency", "readability", "security"],
"output_format": "markdown"
}
### experimental-designer

**Role**: Research Methodology Specialist
**Expertise**: Hypothesis formulation, experimental design, statistical power analysis
Use Cases:
Example:
{
"tool": "experimental-designer",
"research_question": "Does chain-of-thought improve math reasoning?",
"constraints": {
"budget": 1000,
"timeframe": "2 weeks"
}
}
### forensic-analyst

**Role**: Neural Forensics Specialist
**Expertise**: DSMMD taxonomy (Data, Semantics, Methods, Metadata, Discourse)
Use Cases:
### budget-agent

**Role**: Financial Strategist
**Expertise**: Grant budgets, investor projections, ROI analysis
Use Cases:
Example:
{
"tool": "budget-agent",
"project": "Language Model Training",
"funding_target": 500000,
"timeline_months": 18
}
### comptroller-agent

**Role**: Operations Manager
**Expertise**: Iron Triangle optimization (Speed ↔ Cost ↔ Quality)
Use Cases:
Key Concept: The Iron Triangle
```
      SPEED
      /    \
     /      \
    /   ⚖    \
   /__________\
COST        QUALITY
```
You can optimize two, but not all three simultaneously.
### administrator-agent

**Role**: Organizational Architect
**Expertise**: SOPs, team structures, timezone optimization
Use Cases:
### ciso-agent

**Role**: Chief Information Security Officer
**Expertise**: STRIDE threat modeling, Zero Trust architecture
Use Cases:
STRIDE Framework: Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege
### mlflow-agent

**Role**: Experiment Tracking Specialist
**Expertise**: MLflow queries, trace analysis, run comparisons
Use Cases:
### dataset-builder

**Role**: Training Data Engineer
**Expertise**: SFT, DPO, HuggingFace dataset creation
Use Cases:
Supported Formats:
### visual-inspector

**Role**: Data Quality Analyst
**Expertise**: FiftyOne visualization, mistakenness detection
Use Cases:
### creative-director

**Role**: Design Systems Architect
**Expertise**: Color theory, typography, CSS frameworks, UI/UX
Use Cases:
Supported Styles:
Example:
{
"tool": "creative-director",
"style": "cyberpunk-brutalist-bauhaus",
"colors": ["black", "white", "red"],
"components": ["buttons", "cards", "navigation"]
}
### orchestrator

**Role**: Multi-Agent Coordinator
**Expertise**: ADK patterns (Sequential, Parallel, Loop, Coordinator)
Use Cases:
ADK Patterns:
Sequential: A → B → C
Parallel: A ∥ B ∥ C → Merge
Loop: A → B → [condition] → A
Coordinator: A ↔ C ↔ B
### paper2agent-infrastructure

**Role**: Agent Lifecycle Management
**Expertise**: Creating, deploying, and monitoring agents
Use Cases:
# 1. Clone the repository
git clone https://github.com/Tuesdaythe13th/NerdCabalMCP.git
cd NerdCabalMCP
# 2. Install dependencies
cd mcp-server
npm install
# 3. Build the TypeScript code
npm run build
# 4. Configure Claude Desktop
# Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
# or %APPDATA%/Claude/claude_desktop_config.json (Windows)
{
"mcpServers": {
"nerdcabal": {
"command": "node",
"args": [
"/absolute/path/to/NerdCabalMCP/mcp-server/dist/index.js"
],
"env": {}
}
}
}
# 5. Restart Claude Desktop
# Your 14 agents are now available!
Verify your system meets the requirements:
# Check Node.js version (need 18+)
node --version # Should show v18.x.x or higher
# Check npm
npm --version
# Install pnpm (optional, but faster)
npm install -g pnpm
# Clone with all history
git clone https://github.com/Tuesdaythe13th/NerdCabalMCP.git
cd NerdCabalMCP
# Or clone with shallow history (faster)
git clone --depth 1 https://github.com/Tuesdaythe13th/NerdCabalMCP.git
cd NerdCabalMCP
cd mcp-server
# Using npm
npm install
# Or using pnpm (faster)
pnpm install
Dependencies Installed:
- `@modelcontextprotocol/sdk` (v1.0.4): Core MCP protocol implementation
- `typescript` (v5.7.2): Type system and compiler
- `@types/node` (v22.0.0): Node.js type definitions

# Full build
npm run build
# Development mode with auto-rebuild
npm run watch
# Development server with hot reload
npm run dev
Build Output: Compiled JavaScript files in mcp-server/dist/
# Test the server standalone
node dist/index.js
# You should see:
# MCP server running on stdio
The server configuration is in mcp-server/mcp-config.json:
{
"server": {
"name": "nerdcabal-mcp",
"version": "1.0.0"
},
"tools": [
{
"name": "llm-rubric-architect",
"enabled": true
},
{
"name": "experimental-designer",
"enabled": true
}
// ... all 14 agents
]
}
Location: ~/Library/Application Support/Claude/claude_desktop_config.json
{
"mcpServers": {
"nerdcabal": {
"command": "node",
"args": [
"/Users/yourname/NerdCabalMCP/mcp-server/dist/index.js"
],
"env": {
"LOG_LEVEL": "info"
}
}
}
}
Location: %APPDATA%\Claude\claude_desktop_config.json
{
"mcpServers": {
"nerdcabal": {
"command": "node",
"args": [
"C:\\Users\\yourname\\NerdCabalMCP\\mcp-server\\dist\\index.js"
],
"env": {}
}
}
}
Location: ~/.config/Claude/claude_desktop_config.json
{
"mcpServers": {
"nerdcabal": {
"command": "node",
"args": [
"/home/yourname/NerdCabalMCP/mcp-server/dist/index.js"
],
"env": {}
}
}
}
You can configure behavior with environment variables:
# Logging
LOG_LEVEL=debug|info|warn|error
# Agent-specific settings
MLFLOW_TRACKING_URI=http://localhost:5000
FIFTYONE_DATABASE_URI=mongodb://localhost:27017
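As an illustrative sketch of how the server might consume these variables at startup (the `Config` shape and `loadConfig` name are assumptions, not the server's actual code):

```typescript
// Sketch: read the environment variables above with sensible defaults.
// Shape and defaults are illustrative assumptions.
type LogLevel = 'debug' | 'info' | 'warn' | 'error';

interface Config {
  logLevel: LogLevel;
  mlflowTrackingUri: string;
  fiftyoneDatabaseUri: string;
}

function loadConfig(
  env: Record<string, string | undefined> = process.env as Record<string, string | undefined>
): Config {
  const levels: LogLevel[] = ['debug', 'info', 'warn', 'error'];
  const raw = env.LOG_LEVEL ?? 'info';
  return {
    // Fall back to 'info' if LOG_LEVEL is missing or unrecognized.
    logLevel: levels.includes(raw as LogLevel) ? (raw as LogLevel) : 'info',
    mlflowTrackingUri: env.MLFLOW_TRACKING_URI ?? 'http://localhost:5000',
    fiftyoneDatabaseUri: env.FIFTYONE_DATABASE_URI ?? 'mongodb://localhost:27017',
  };
}
```

Validating against an allow-list keeps a typo like `LOG_LEVEL=verbose` from silently disabling logging.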
1. Configure `claude_desktop_config.json` as shown above
2. Type `@` to see available tools
3. Select `nerdcabal` tools to use agents

from anthropic import Anthropic
import json
client = Anthropic(api_key="your-api-key")
# Use the MCP tool
response = client.messages.create(
model="claude-sonnet-4-5-20250929",
max_tokens=4096,
tools=[
{
"name": "llm-rubric-architect",
"description": "Creates comprehensive evaluation rubrics",
"input_schema": {
"type": "object",
"properties": {
"task": {"type": "string"},
"criteria": {"type": "array", "items": {"type": "string"}}
},
"required": ["task", "criteria"]
}
}
],
messages=[
{
"role": "user",
"content": "Create a rubric for evaluating code quality"
}
]
)
print(response.content)
A cyberpunk brutalist bauhaus interface for easy agent interaction:
# Future command
streamlit run ui/app.py
Deploy as a public or private space for team collaboration.
Every agent follows this pattern:
Input → Agent Processing → Output
Goal: Evaluate a chatbot's performance
Input:
{
"tool": "llm-rubric-architect",
"task": "chatbot-evaluation",
"dimensions": [
"response_quality",
"context_retention",
"safety",
"personality"
],
"scale": "1-5",
"output_format": "markdown"
}
Output:
# Chatbot Evaluation Rubric
## Response Quality (1-5)
- **5**: Accurate, complete, directly addresses query
- **4**: Mostly accurate, minor gaps
- **3**: Partially correct, some misunderstandings
- **2**: Significant errors or irrelevance
- **1**: Completely wrong or nonsensical
## Context Retention (1-5)
...
Goal: Test if prompt engineering improves accuracy
Input:
{
"tool": "experimental-designer",
"hypothesis": "Chain-of-thought prompting increases accuracy on math problems",
"baseline": "direct-answer",
"intervention": "chain-of-thought",
"metric": "accuracy",
"sample_size": 1000
}
Output:
# Experiment Design: Chain-of-Thought vs Direct Answer
## Hypothesis
H0: No difference in accuracy between methods
H1: Chain-of-thought produces higher accuracy
## Methodology
- **Design**: Between-subjects randomized controlled trial
- **Sample**: 1000 math problems (balanced across difficulty)
- **Groups**:
- Control: Direct answer prompts (n=500)
- Treatment: Chain-of-thought prompts (n=500)
## Power Analysis
- Effect size (expected): d=0.3
- Power: 0.80
- Alpha: 0.05
- Required n per group: 352 (achieved)
## Success Criteria
- p < 0.05 on two-tailed t-test
- Effect size d > 0.2 (small practical significance)
Goal: Threat model for an API
Input:
{
"tool": "ciso-agent",
"system": "LLM API",
"components": ["API gateway", "model inference", "user database"],
"framework": "STRIDE"
}
Output:
# STRIDE Threat Model: LLM API
## Component: API Gateway
### Spoofing
- **Threat**: Attacker impersonates legitimate user
- **Mitigation**: JWT with short expiration, API key rotation
- **Risk**: HIGH
### Tampering
- **Threat**: Request modification in transit
- **Mitigation**: HTTPS/TLS 1.3 only, request signing
- **Risk**: MEDIUM
...
Goal: Create a complete research project
Input:
{
"tool": "orchestrator",
"workflow": {
"pattern": "sequential",
"agents": [
{
"name": "experimental-designer",
"input": {"hypothesis": "..."}
},
{
"name": "budget-agent",
"input": {"project": "from_previous", "timeline": 6}
},
{
"name": "administrator",
"input": {"team_size": 3, "timeline": "from_previous"}
}
]
}
}
Output: Coordinates all three agents sequentially, passing context between them.
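That sequential hand-off can be sketched in a few lines of TypeScript. The agent functions below are hypothetical stand-ins, not the real agents; the point is that each step receives the merged context of everything before it, which is how a `"from_previous"` input would get resolved:

```typescript
// Sketch of sequential orchestration with context passing.
// `designer` and `budget` are hypothetical stand-ins for real agents.
type Context = Record<string, unknown>;
type AgentFn = (ctx: Context) => Promise<Context>;

async function runSequential(agents: AgentFn[], initial: Context): Promise<Context> {
  let ctx = initial;
  for (const agent of agents) {
    // Merge each agent's output so later agents can reference it.
    ctx = { ...ctx, ...(await agent(ctx)) };
  }
  return ctx;
}

const designer: AgentFn = async (ctx) => ({ design: `plan for ${String(ctx.hypothesis)}` });
const budget: AgentFn = async (ctx) => ({ budget: ctx.design ? 500000 : 0 });
```

Running `runSequential([designer, budget], { hypothesis: 'CoT helps math' })` yields a context containing both `design` and `budget`.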
```
┌─────────────────────────────────────────────────────┐
│                     MCP Client                      │
│          (Claude Desktop, Custom UI, etc.)          │
└──────────────────────────┬──────────────────────────┘
                           │ MCP Protocol (JSON-RPC)
                           ▼
┌─────────────────────────────────────────────────────┐
│                 NerdCabalMCP Server                 │
│                     (index.ts)                      │
├─────────────────────────────────────────────────────┤
│  Tool Router                                        │
│  ├── llm-rubric-architect                           │
│  ├── experimental-designer                          │
│  ├── budget-agent                                   │
│  ├── comptroller-agent                              │
│  ├── administrator-agent                            │
│  ├── mlflow-agent                                   │
│  ├── dataset-builder                                │
│  ├── ciso-agent                                     │
│  ├── orchestrator                                   │
│  ├── creative-director                              │
│  ├── visual-inspector                               │
│  ├── forensic-analyst                               │
│  └── paper2agent-infrastructure (2 tools)           │
└──────────────────────────┬──────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────┐
│                External Integrations                │
│  ├── MLflow (experiment tracking)                   │
│  ├── FiftyOne (dataset visualization)               │
│  ├── HuggingFace (dataset hosting)                  │
│  ├── GitHub (repository analysis)                   │
│  └── Google Colab (notebook execution)              │
└─────────────────────────────────────────────────────┘
```
Each agent implements the Agent Card specification:
interface AgentCard {
name: string; // e.g., "llm-rubric-architect"
version: string; // Semantic versioning
description: string; // Human-readable purpose
capabilities: string[]; // What the agent can do
input_schema: JSONSchema; // Structured input format
output_schema: JSONSchema; // Structured output format
dependencies: string[]; // Required external services
adk_patterns: ADKPattern[]; // Supported execution patterns
}
ADK Execution Patterns:
1. Sequential: A → B → C
   Use when: Output of A is required input for B
2. Parallel: A ∥ B ∥ C
   Use when: Tasks are independent
3. Loop: A → [condition] → A or B
   Use when: Iterative refinement needed
4. Coordinator: A ↔ C ↔ B
   Use when: Central agent manages communication
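Under the same caveat as any sketch here (this is not the server's actual implementation), the parallel and loop patterns can be expressed as small generic combinators:

```typescript
// Sketch of the Parallel and Loop ADK patterns as generic combinators.
type Context = Record<string, unknown>;
type AgentFn = (ctx: Context) => Promise<Context>;

// Parallel: fan out to independent agents, then merge their outputs.
async function runParallel(agents: AgentFn[], initial: Context): Promise<Context> {
  const results = await Promise.all(agents.map((agent) => agent(initial)));
  return results.reduce((merged, r) => ({ ...merged, ...r }), { ...initial });
}

// Loop: re-run an agent until the condition holds or attempts run out.
async function runLoop(
  agent: AgentFn,
  done: (ctx: Context) => boolean,
  initial: Context,
  maxIterations = 10
): Promise<Context> {
  let ctx = initial;
  for (let i = 0; i < maxIterations && !done(ctx); i++) {
    ctx = { ...ctx, ...(await agent(ctx)) };
  }
  return ctx;
}
```

The `maxIterations` bound matters in practice: a loop whose condition never becomes true would otherwise run forever.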
```
mcp-server/src/
├── index.ts              # Main MCP server (tool routing)
├── types.ts              # TypeScript interfaces
├── utils.ts              # Shared utilities
├── agents/
│   ├── rubric-architect.ts
│   ├── experimental-designer.ts
│   ├── budget-agent.ts
│   ├── comptroller-agent.ts
│   ├── administrator-agent.ts
│   ├── mlflow-agent.ts
│   ├── dataset-builder.ts
│   ├── ciso-agent.ts
│   ├── orchestrator.ts
│   ├── creative-director.ts
│   ├── visual-inspector.ts
│   └── forensic-analyst.ts
└── infrastructure/
    ├── create-agent.ts
    ├── check-agent.ts
    └── launch-mcp.ts
```
sequenceDiagram
participant User
participant MCPClient
participant MCPServer
participant Agent
participant ExternalService
User->>MCPClient: Request task
MCPClient->>MCPServer: MCP tool call (JSON-RPC)
MCPServer->>Agent: Route to appropriate agent
Agent->>Agent: Process with domain logic
Agent->>ExternalService: Optional external call
ExternalService-->>Agent: Return data
Agent-->>MCPServer: Structured output
MCPServer-->>MCPClient: MCP response
MCPClient-->>User: Display result
### llm-rubric-architect

**Purpose**: Generate evaluation rubrics for LLM capabilities
Input Schema:
{
"type": "object",
"properties": {
"task": {
"type": "string",
"description": "The evaluation task"
},
"dimensions": {
"type": "array",
"items": {"type": "string"},
"description": "Aspects to evaluate"
},
"scale": {
"type": "string",
"enum": ["1-3", "1-5", "1-7", "1-10"],
"default": "1-5"
},
"output_format": {
"type": "string",
"enum": ["markdown", "json", "csv"]
}
},
"required": ["task", "dimensions"]
}
Output: Markdown rubric or JSON structure
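To illustrate how this schema constrains a call, here is a hand-rolled validator for just this tool's input. It is a sketch (the `validateRubricInput` name is an assumption); a real server would more likely use a JSON Schema validator library:

```typescript
// Sketch: validate input against the llm-rubric-architect schema above.
interface RubricInput {
  task: string;
  dimensions: string[];
  scale?: '1-3' | '1-5' | '1-7' | '1-10';
  output_format?: 'markdown' | 'json' | 'csv';
}

function validateRubricInput(raw: unknown): RubricInput {
  const input = raw as Partial<RubricInput>;
  // Required fields, per the schema's "required" list.
  if (typeof input?.task !== 'string') throw new Error('task is required');
  if (!Array.isArray(input.dimensions) || input.dimensions.some((d) => typeof d !== 'string')) {
    throw new Error('dimensions must be an array of strings');
  }
  // Enum membership plus the "1-5" default for scale.
  const scales = ['1-3', '1-5', '1-7', '1-10'];
  const scale = input.scale ?? '1-5';
  if (!scales.includes(scale)) throw new Error(`invalid scale: ${scale}`);
  return { task: input.task, dimensions: input.dimensions, scale, output_format: input.output_format };
}
```

Rejecting malformed input before routing keeps agent implementations free of defensive checks.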
### experimental-designer

**Purpose**: Design controlled experiments for AI research
Input Schema:
{
"type": "object",
"properties": {
"hypothesis": {
"type": "string",
"description": "Research hypothesis to test"
},
"baseline": {
"type": "string",
"description": "Control condition"
},
"intervention": {
"type": "string",
"description": "Treatment condition"
},
"metric": {
"type": "string",
"description": "Primary evaluation metric"
},
"sample_size": {
"type": "integer",
"minimum": 30
},
"constraints": {
"type": "object",
"properties": {
"budget": {"type": "number"},
"timeframe": {"type": "string"}
}
}
},
"required": ["hypothesis", "metric"]
}
Output: Experimental design document (Markdown)
### budget-agent

**Purpose**: Financial planning and budget generation
Input Schema:
{
"type": "object",
"properties": {
"project": {
"type": "string",
"description": "Project name/description"
},
"funding_target": {
"type": "number",
"description": "Target funding amount (USD)"
},
"timeline_months": {
"type": "integer",
"minimum": 1
},
"categories": {
"type": "array",
"items": {
"type": "string",
"enum": ["personnel", "compute", "equipment", "travel", "indirect"]
}
},
"format": {
"type": "string",
"enum": ["NIH", "NSF", "investor_pitch", "generic"]
}
},
"required": ["project", "funding_target", "timeline_months"]
}
Output: Detailed budget spreadsheet (JSON/CSV/Markdown)
### orchestrator

**Purpose**: Coordinate multi-agent workflows
Input Schema:
{
"type": "object",
"properties": {
"workflow": {
"type": "object",
"properties": {
"pattern": {
"type": "string",
"enum": ["sequential", "parallel", "loop", "coordinator"]
},
"agents": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": {"type": "string"},
"input": {"type": "object"}
}
}
}
}
}
},
"required": ["workflow"]
}
Output: Workflow execution plan and coordination strategy
### creative-director

**Purpose**: Design system and UI/UX generation
Input Schema:
{
"type": "object",
"properties": {
"style": {
"type": "string",
"enum": ["cyberpunk-brutalist-bauhaus", "material", "tailwind", "custom"]
},
"colors": {
"type": "array",
"items": {"type": "string"},
"description": "Color palette (hex or named colors)"
},
"components": {
"type": "array",
"items": {
"type": "string",
"enum": ["buttons", "cards", "navigation", "forms", "typography"]
}
},
"output_format": {
"type": "string",
"enum": ["css", "tailwind", "styled-components", "figma_tokens"]
}
},
"required": ["style", "components"]
}
Output: Design system specification (CSS/JSON)
### forensic-analyst

**Purpose**: Neural forensics for LLM transcript analysis
Input Schema:
{
"type": "object",
"properties": {
"transcript": {
"type": "string",
"description": "LLM conversation transcript"
},
"taxonomy": {
"type": "string",
"enum": ["DSMMD"],
"default": "DSMMD"
},
"detect": {
"type": "array",
"items": {
"type": "string",
"enum": [
"confabulation",
"context_collapse",
"metadata_leakage",
"semantic_drift",
"method_confusion"
]
}
}
},
"required": ["transcript"]
}
Output: Forensics report with detected issues (Markdown/JSON)
Want to add your own agent? Here's the template:
// mcp-server/src/agents/my-custom-agent.ts
import { AgentCard, AgentInput, AgentOutput } from '../types';
export const myCustomAgent: AgentCard = {
name: 'my-custom-agent',
version: '1.0.0',
description: 'What your agent does',
capabilities: [
'capability-1',
'capability-2'
],
input_schema: {
type: 'object',
properties: {
// Define your input structure
task: { type: 'string' }
},
required: ['task']
},
output_schema: {
type: 'object',
properties: {
result: { type: 'string' }
}
},
dependencies: [],
adk_patterns: ['sequential', 'parallel']
};
export async function executeMyCustomAgent(
input: AgentInput
): Promise<AgentOutput> {
// Your agent logic here
return {
success: true,
data: {
result: 'Agent output'
}
};
}
Then register it in index.ts:
import { myCustomAgent, executeMyCustomAgent } from './agents/my-custom-agent';
server.setRequestHandler(CallToolRequestSchema, async (request) => {
if (request.params.name === 'my-custom-agent') {
const result = await executeMyCustomAgent(request.params.arguments);
return { content: [{ type: 'text', text: JSON.stringify(result) }] };
}
// ... other agents
});
from langchain.agents import Tool
from langchain.llms import Anthropic
import requests
def call_nerdcabal_agent(agent_name: str, input_data: dict) -> dict:
"""Call a NerdCabal MCP agent"""
response = requests.post(
'http://localhost:3000/mcp',
json={
'tool': agent_name,
'input': input_data
}
)
return response.json()
# Create LangChain tool
rubric_tool = Tool(
name="Rubric Architect",
func=lambda task: call_nerdcabal_agent('llm-rubric-architect', {'task': task}),
description="Creates evaluation rubrics"
)
# Use in agent
from langchain.agents import initialize_agent, AgentType
agent = initialize_agent(
tools=[rubric_tool],
llm=Anthropic(model='claude-3-opus-20240229'),
agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION
)
result = agent.run("Create a rubric for evaluating chatbot empathy")
Import the nerdcabal-langflow.json configuration:
{
"nodes": [
{
"type": "MCPTool",
"data": {
"server": "nerdcabal",
"tool": "llm-rubric-architect"
}
}
]
}
Solution:
1. Verify the `claude_desktop_config.json` path is correct
2. Check the MCP logs in `~/Library/Logs/Claude/mcp*.log` (macOS)

Solution:
cd mcp-server
rm -rf node_modules package-lock.json
npm install
npm run build
Solution:
1. Confirm the agent is enabled in `mcp-config.json`
2. Run with debug logging: `LOG_LEVEL=debug node dist/index.js`
Solution:
# Update TypeScript
npm install -D typescript@latest
# Clear build cache
rm -rf dist/
npm run build
Solutions:
MLflow:
# Start MLflow server
mlflow server --host 0.0.0.0 --port 5000
# Set environment variable
export MLFLOW_TRACKING_URI=http://localhost:5000
FiftyOne:
# Start FiftyOne app
fiftyone app launch
# Verify database
fiftyone migrate --info
HuggingFace:
# Login to HuggingFace
huggingface-cli login
# Verify token
huggingface-cli whoami
Model Context Protocol (MCP) is Anthropic's standard for connecting AI models to external tools and data sources.
Key Concepts:
Learn More:
Agent-to-Agent (A2A) protocol enables structured communication between AI agents.
Key Concepts:
The Agent Development Kit (ADK) provides patterns for multi-agent workflows.
Pattern Details:
import streamlit as st
import requests
st.title("🧬 NerdCabal MCP Interface")
agent = st.selectbox("Select Agent", [
"llm-rubric-architect",
"experimental-designer",
"budget-agent",
"creative-director"
])
if agent == "creative-director":
style = st.selectbox("Style", [
"cyberpunk-brutalist-bauhaus",
"material",
"tailwind"
])
colors = st.multiselect("Colors", ["black", "white", "red", "blue"])
if st.button("Generate Design System"):
result = requests.post('http://localhost:3000/mcp', json={
'tool': agent,
'input': {'style': style, 'colors': colors}
})
st.code(result.json()['data'], language='css')
Create app.py in your Space:
import gradio as gr
from anthropic import Anthropic
client = Anthropic()
def call_agent(agent_name, task_description):
response = client.messages.create(
model="claude-sonnet-4-5-20250929",
max_tokens=4096,
tools=[{"name": f"nerdcabal:{agent_name}"}],
messages=[{"role": "user", "content": task_description}]
)
return response.content[0].text
iface = gr.Interface(
fn=call_agent,
inputs=[
gr.Dropdown(["llm-rubric-architect", "experimental-designer"], label="Agent"),
gr.Textbox(label="Task Description")
],
outputs=gr.Markdown(),
title="NerdCabal MCP Agents"
)
iface.launch()
Create .replit file:
run = "npm run dev"
language = "nodejs"
[nix]
channel = "stable-22_11"
[deployment]
deploymentTarget = "cloudrun"
Each agent tracks:
# Enable debug logging
LOG_LEVEL=debug node dist/index.js 2>&1 | tee mcp-server.log
# View agent-specific logs
grep "llm-rubric-architect" mcp-server.log
# Monitor real-time
tail -f mcp-server.log | grep ERROR
# Test server is responding
curl -X POST http://localhost:3000/mcp \
-H "Content-Type: application/json" \
-d '{"method": "health"}'
# Should return: {"status": "ok", "agents": 14}
We welcome contributions! See our contributing guide for:
Quick Contribution:
git checkout -b feature/my-new-agent
# Make your changes
npm run build
npm test
git commit -m "Add: My new agent for X"
git push origin feature/my-new-agent
# Open a pull request
MIT License - see LICENSE file for details
Built with ❤️ by the NerdCabal community
End of MCP Server Guide
Last Updated: January 2026
Version: 1.0.0