Part IV: Computational Modelling

Chapter 9: RAG Architecture for The Traitors Simulation

~5,000 words

Abstract

This chapter presents the technical architecture for simulating The Traitors using a Retrieval-Augmented Generation (RAG) system. I describe the V5 pipeline architecture, expert-based knowledge organization, citation-grounded response generation, and the integration of query classification, graph enrichment, and hierarchical summarization. The system enables AI agents to embody distinct characters with personality, knowledge boundaries, and strategic capabilities: the foundation for emergent gameplay. For an introduction to the AI approach, see the article on AI Architecture.

9.1 Introduction: The RAG Approach to Social Simulation

Simulating The Traitors requires AI agents that can:

  1. Maintain consistent personalities across extended gameplay (see Strategic Archetypes)
  2. Access relevant knowledge while respecting information boundaries
  3. Generate authentic dialogue reflecting character voice
  4. Make strategic decisions based on game state
  5. Manage deception, secrets, and emotional states

Traditional approaches (rule-based systems, finite state machines, simple chatbots) fail to capture the nuanced social dynamics the format requires. RAG-based systems offer a solution by grounding generation in retrieved knowledge while enabling contextual response synthesis.

9.2 System Architecture Overview

9.2.1 Core Components

┌────────────────────────────────────────────────────────────┐
│                    TRAITORS SIMULATION ENGINE              │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐     │
│  │   PLAYER    │    │    GAME     │    │   VIEWER    │     │
│  │   AGENTS    │    │   ENGINE    │    │  INTERFACE  │     │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘     │
│         │                  │                  │            │
│         ▼                  ▼                  ▼            │
│  ┌──────────────────────────────────────────────────────┐  │
│  │                   RAG PIPELINE (V5)                  │  │
│  ├──────────────────────────────────────────────────────┤  │
│  │  Query          Vector        Graph        Prompt    │  │
│  │  Classifier  →  Search   →   Enrichment → Builder    │  │
│  │       │            │             │            │      │  │
│  │       ▼            ▼             ▼            ▼      │  │
│  │  ┌────────┐  ┌──────────┐  ┌─────────┐  ┌─────────┐  │  │
│  │  │ 7 Type │  │ pgvector │  │ Dgraph  │  │ Modular │  │  │
│  │  │ Routes │  │   k-NN   │  │ Client  │  │ Prompts │  │  │
│  │  └────────┘  └──────────┘  └─────────┘  └─────────┘  │  │
│  └──────────────────────────────────────────────────────┘  │
│                            │                               │
│                            ▼                               │
│  ┌──────────────────────────────────────────────────────┐  │
│  │                  KNOWLEDGE STORES                    │  │
│  ├──────────────────────────────────────────────────────┤  │
│  │  PostgreSQL     │    Dgraph      │    RAPTOR         │  │
│  │  (Sections,     │  (Entities,    │  (Hierarchical    │  │
│  │   Embeddings)   │  Relationships)│   Summaries)      │  │
│  └──────────────────────────────────────────────────────┘  │
│                                                            │
└────────────────────────────────────────────────────────────┘
        

9.2.2 Technology Stack

Component        Technology                      Purpose
Language         Go 1.21+                        Core implementation
Database         PostgreSQL 15+ with pgvector    Vector storage, sections, metadata
Knowledge Graph  Dgraph v23.1.0                  Entity relationships
LLM              OpenAI GPT-4o-mini              Response generation
Embeddings       text-embedding-3-small          Vector representations
Communication    MCP (Model Context Protocol)    Agent integration

9.3 The V5 RAG Pipeline

9.3.1 Pipeline Stages

The V5 pipeline processes queries through eight stages:

Query → Classification → Embedding → Vector Search →
        Temporal Filter → Graph Enrichment →
        RAPTOR Context → Prompt Building → LLM Generation
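The stage chain above can be sketched as a small orchestration loop. Stage, Request, and Pipeline are illustrative names, not the production interfaces; each real stage (classifier, retriever, prompt builder, LLM call) would implement the same contract.

```go
package main

// Stage is one step of the pipeline: it transforms an accumulating
// request and may abort the run with an error.
type Stage func(*Request) error

// Request accumulates state as a query moves through the stages:
// classification result, retrieved chunks, assembled prompt, answer.
type Request struct {
	Query     string
	QueryType string
	Chunks    []string
	Prompt    string
	Answer    string
}

// Pipeline runs its stages in order, stopping at the first failure,
// mirroring Classification through to LLM Generation.
type Pipeline struct {
	stages []Stage
}

func (p *Pipeline) Run(query string) (*Request, error) {
	req := &Request{Query: query}
	for _, s := range p.stages {
		if err := s(req); err != nil {
			return nil, err
		}
	}
	return req, nil
}
```

Because every stage shares one signature, stages can be reordered or disabled per query type without changing the driver.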
        

9.3.2 Query Classification

The classifier routes queries to appropriate processing paths:

Classification Types:

Type          Description                   Example Query
factual       Direct information request    "What is [character]'s occupation?"
temporal      Time-bound questions          "What happened in Episode 3?"
relationship  Entity connections            "How does [X] know [Y]?"
synthesis     Requires inference            "Why might [X] suspect [Y]?"
comparison    Evaluating alternatives       "Who is more trustworthy?"
narrative     Story-based                   "Describe the events of the murder"
entity_list   Enumeration                   "Who are the remaining players?"

Implementation:

type QueryClassifier struct {
    llmClient *llm.Client
    patterns  []ClassificationPattern
}

func (c *QueryClassifier) Classify(ctx context.Context, query string) (QueryType, float64) {
    // LLM-based classification with confidence score
    // Falls back to pattern matching for efficiency
}
        

9.3.3 Vector Search with pgvector

Embedding Process:

  1. Query text vectorized using text-embedding-3-small
  2. Similarity search against section embeddings
  3. k-NN retrieval with configurable k

SQL Query Pattern:

SELECT cs.section_id, cs.body_text,
       1 - (se.embedding <=> $1::vector) AS similarity
FROM content_section_definition cs
JOIN section_embeddings se ON cs.section_id = se.section_id
WHERE cs.expert_id = $2
ORDER BY se.embedding <=> $1::vector
LIMIT $3
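pgvector's <=> operator computes cosine distance, so the 1 - ... expression in the SELECT converts it into a similarity score. The same computation in plain Go, as a sanity check (the production path relies on pgvector itself):

```go
package main

import "math"

// cosineDistance mirrors pgvector's <=> operator: 1 minus the cosine
// of the angle between a and b. Identical directions give 0,
// orthogonal vectors give 1.
func cosineDistance(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	n := math.Sqrt(na) * math.Sqrt(nb)
	if n == 0 {
		return 1 // treat zero vectors as maximally distant
	}
	return 1 - dot/n
}

// similarity is the score the SQL query exposes: 1 - distance.
func similarity(a, b []float64) float64 {
	return 1 - cosineDistance(a, b)
}
```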
        

9.3.4 Temporal Filtering

For game-state-aware retrieval:

type QueryOptions struct {
    TemporalCutoff *time.Time  // Only retrieve content before this point
    GamePhase      GamePhase   // Current game phase
    EpisodeNumber  int         // Episode context
}
        

This enables queries that respect player knowledge boundaries; a character should only "know" events that have occurred in their timeline.

9.3.5 Graph Enrichment

Dgraph provides relationship context:

Entity Query:

query GetEntityWithRelationships($id: string) {
    entity(func: eq(entity_id, $id)) {
        name
        type
        properties
        relationships @facets {
            target {
                name
                type
            }
            relationship_type
            strength
        }
    }
}
        

Enrichment Process:

  1. Identify entities mentioned in query
  2. Retrieve entity details and relationships
  3. Inject relationship context into prompt

9.3.6 RAPTOR Hierarchical Summarization

RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval) provides multi-scale context:

Tree Structure:

Level 0: Base sections (942 for main expert)
Level 1: Cluster summaries (~100 nodes)
Level 2: Super-cluster summaries (~10 nodes)
Level 3: Root summary (1 node)
        

Retrieval Logic:

func (r *RAPTORService) GetContextForQuery(ctx context.Context,
    query string, chunkIDs []int, maxDepth int) (*RAPTORContext, error) {

    // Traverse tree from base chunks upward
    // Gather summaries that contextualize retrieved content
    // Return hierarchical context for prompt inclusion
}
        

9.4 Expert-Based Knowledge Organization

9.4.1 The Expert Concept

Each AI agent is backed by an "expert", a knowledge domain with:

  • Defined personality and voice
  • Knowledge boundaries (what they can know)
  • Citation sources
  • Relationship to other experts

Expert Schema:

CREATE TABLE expert_definition (
    expert_id UUID PRIMARY KEY,
    expert_name TEXT NOT NULL,
    description TEXT,
    personality JSONB,
    knowledge_scope TEXT[],
    voice_configuration JSONB,
    created_at TIMESTAMP
);
        

9.4.2 Traitors-Specific Expert Configuration

For game simulation, each player requires an expert configuration based on player archetypes:

{
    "expert_id": "player_1_uuid",
    "expert_name": "Sarah",
    "personality": {
        "archetype": "detective",
        "traits": ["analytical", "cautious", "observant"],
        "speaking_style": "precise_questioning",
        "risk_tolerance": 0.3
    },
    "role": "faithful",
    "knowledge_scope": ["public_events", "personal_observations", "alliance_info"],
    "secret_knowledge": ["own_role"]
}
        

9.4.3 Knowledge Boundaries

Critical for game integrity:

Faithful Knowledge:

  • Own role (Faithful)
  • Public events (murders revealed, banishments)
  • Personal observations (who they've talked to)
  • Alliance information (shared within trusted group)

Traitor Knowledge:

  • Own role (Traitor)
  • Fellow Traitor identities
  • Murder decisions and rationale
  • Public events
  • Recruitment plans (see Secret Traitor analysis)

Forbidden Knowledge:

  • Other players' private thoughts (confessionals, as explored in Audience Psychology)
  • Future events
  • Production information
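One way to enforce these boundaries is a role-to-tag visibility map consulted before any retrieved content reaches a prompt. The tag and role names below are illustrative, not the production vocabulary; the key property is that forbidden tags appear in no role's set.

```go
package main

// KnowledgeTag labels a piece of content with its visibility class.
type KnowledgeTag string

const (
	PublicEvents    KnowledgeTag = "public_events"
	TraitorIdentity KnowledgeTag = "traitor_identities"
	MurderRationale KnowledgeTag = "murder_rationale"
	Confessionals   KnowledgeTag = "confessionals" // forbidden to every role
)

// allowed maps roles to the tags they may retrieve; confessionals,
// future events, and production information are absent from every set.
var allowed = map[string]map[KnowledgeTag]bool{
	"faithful": {PublicEvents: true},
	"traitor":  {PublicEvents: true, TraitorIdentity: true, MurderRationale: true},
}

// CanKnow gates retrieval: content tagged outside the role's set is
// filtered before it ever reaches the prompt builder.
func CanKnow(role string, tag KnowledgeTag) bool {
	return allowed[role][tag]
}
```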

9.4.4 Multi-Expert Interaction

When players converse, the system:

  1. Maintains separate expert contexts
  2. Routes queries to appropriate expert
  3. Ensures no cross-contamination of private knowledge
  4. Generates contextually appropriate responses
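Steps 1-3 can be sketched as a router that keeps one isolated context per expert and never falls back to a shared store; Router and its methods are hypothetical names for illustration.

```go
package main

import "errors"

// ExpertContext holds one agent's private retrieval state; contexts
// are never shared between experts.
type ExpertContext struct {
	ExpertID string
	Private  []string // this expert's observed or retrieved knowledge
}

// Router maintains one isolated context per expert and rejects
// queries for unknown experts rather than degrading to shared state.
type Router struct {
	contexts map[string]*ExpertContext
}

func NewRouter() *Router {
	return &Router{contexts: map[string]*ExpertContext{}}
}

func (r *Router) Register(id string) {
	r.contexts[id] = &ExpertContext{ExpertID: id}
}

// Observe records a piece of knowledge for exactly one expert.
func (r *Router) Observe(id, fact string) error {
	c, ok := r.contexts[id]
	if !ok {
		return errors.New("unknown expert: " + id)
	}
	c.Private = append(c.Private, fact)
	return nil
}

// Context returns a copy, so callers cannot mutate another expert's
// private state through the returned slice.
func (r *Router) Context(id string) []string {
	c, ok := r.contexts[id]
	if !ok {
		return nil
	}
	return append([]string(nil), c.Private...)
}
```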

9.5 The PromptBuilder System

9.5.1 Modular Architecture

The PromptBuilder assembles prompts from components based on context:

type PromptBuilder struct {
    modes    PromptModes
    identity IdentityConfig
    voice    VoiceConfig
    context  RetrievedContext
}

type PromptModes struct {
    StrictKnowledgeBoundary bool          // Only speak from retrieved context
    AllowSynthesis          bool          // Can speculate beyond context
    CiteSynthesis           CitationStyle
    IncludeCitations        bool
    HasLowConfidence        bool          // Retrieved context is weak
}
        

9.5.2 Component Assembly

Standard Assembly Order:

  1. Identity (always) - "You are [Name], a [role]"
  2. Voice (if configured) - Personality, expertise, language variant
  3. Knowledge Boundary (if strict mode) - What you can/cannot know
  4. Synthesis Permission (if allowed and not strict) - How to handle gaps
  5. Context (always) - Numbered chunks from retrieval
  6. Citation Instructions (if enabled) - How to cite sources
  7. Query (always) - The actual question/prompt

9.5.3 Mode-Based Behavior

Strict Knowledge Boundary Mode:

  • Expert speaks ONLY from retrieved context
  • No speculation or synthesis
  • "I don't know" for gaps
  • Maximum citation compliance

Synthesis Allowed Mode:

  • Can speculate from character perspective
  • Must mark synthesized content
  • Maintains voice while extending

Game Context Mode (Traitors-specific):

  • Integrates game state (who's alive, current phase)
  • Respects role-based knowledge
  • Enables strategic reasoning

9.5.4 Prompt Efficiency

V5 prompts are approximately 80% smaller than those produced by the legacy (V4) approach:

Old Approach (V4):

You are an AI assistant playing the role of...
[500 words of instructions]
[100 words of personality]
[200 words of constraints]
[Retrieved context]
[Query]
        

New Approach (V5):

You are Sarah, analytical and cautious.
Your knowledge: [Retrieved context with citations]
Question: [Query]
        

9.6 Citation System

9.6.1 Citation Types

Type         Description                          Use Case
canonical    Direct quote from source             Stating known facts
synthesized  Generated from character knowledge   Opinions, inferences
background   Timeless character facts             Personality, history

9.6.2 Citation Generation

Citations are extracted post-generation:

type Citation struct {
    SourceID     string
    SourceName   string
    ChunkID      int
    Quote        string
    Confidence   float64
    MatchType    MatchType  // direct, paraphrased, synthesized
    Position     TextRange
}

func (p *CitationProcessor) ExtractCitations(
    response string,
    context *RetrievedContext,
) ([]Citation, error) {
    // Match response segments to source chunks
    // Score confidence based on overlap
    // Return structured citations
}
        

9.6.3 Citation Workflow

  1. LLM generates response with inline citation markers
  2. Post-processor extracts citation positions
  3. Validator confirms citations match source material
  4. Formatter creates final output with proper attribution
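Step 3's validation can be approximated with a bag-of-words overlap score between a response sentence and a candidate source chunk. This is a deliberately crude sketch; a real matcher would likely use span alignment rather than word sets.

```go
package main

import "strings"

// wordSet tokenizes to lowercase words, stripping basic punctuation,
// for a crude overlap measure.
func wordSet(s string) map[string]bool {
	set := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(s)) {
		set[strings.Trim(w, ".,!?\"'")] = true
	}
	return set
}

// OverlapScore returns the fraction of the sentence's words that also
// appear in the chunk. A score of 1.0 suggests a direct quote
// (canonical); low scores suggest synthesized content.
func OverlapScore(sentence, chunk string) float64 {
	sw, cw := wordSet(sentence), wordSet(chunk)
	if len(sw) == 0 {
		return 0
	}
	hits := 0
	for w := range sw {
		if cw[w] {
			hits++
		}
	}
	return float64(hits) / float64(len(sw))
}
```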

9.7 Sections vs. Chunks

9.7.1 Design Decision

The system uses sections (not chunks) for RAG retrieval:

Sections:

  • Preserve semantic boundaries (~1,500 words average)
  • Natural content divisions (chapters, scenes, dialogues)
  • Context-rich units that stand alone

Chunks (traditional approach):

  • Fixed-size segments (512-1024 tokens)
  • May break mid-sentence or mid-thought
  • Require overlap for coherence

9.7.2 Rationale

For Traitors simulation, section-based retrieval provides:

  • Complete conversations (not fragments)
  • Full strategic context (not partial)
  • Character voice preservation
  • Natural citation boundaries

9.7.3 Chunk Usage

Chunks are still used for:

  • Fine-grained citation attribution
  • Detailed similarity scoring
  • RAPTOR tree building (clustering)

9.8 Game State Integration

9.8.1 State Management

The system maintains comprehensive game state:

type GameState struct {
    Episode      int
    Phase        GamePhase
    PrizePool    int

    Players      []PlayerState
    Traitors     []PlayerID  // Known to system, not all players

    Murders      []MurderEvent
    Banishments  []BanishmentEvent
    Votes        []VoteRecord

    Alliances    []Alliance
    Conversations []ConversationRecord
}

type PlayerState struct {
    ID           PlayerID
    Name         string
    Role         Role
    Status       PlayerStatus  // alive, murdered, banished
    EmotionState EmotionState
    Knowledge    KnowledgeState
    Relationships map[PlayerID]RelationshipState
}
        

9.8.2 Query Context Enhancement

Before RAG query, game state provides context:

func (e *GameEngine) EnhanceQuery(
    playerID PlayerID,
    baseQuery string,
) *EnhancedQuery {

    player := e.State.GetPlayer(playerID)

    return &EnhancedQuery{
        Query:           baseQuery,
        AskerRole:       player.Role,
        KnownPlayers:    player.Knowledge.KnownPlayers,
        CurrentPhase:    e.State.Phase,
        RecentEvents:    e.getRecentEventsFor(playerID),
        AllowedKnowledge: e.getKnowledgeBoundary(playerID),
    }
}
        

9.8.3 Response Validation

Generated responses are validated against game state:

func (v *ResponseValidator) Validate(
    response string,
    playerID PlayerID,
    gameState *GameState,
) *ValidationResult {

    // Check for forbidden knowledge leaks
    // Verify mentioned players exist
    // Confirm events referenced have occurred
    // Ensure role-appropriate content
}
        

9.9 Integration Points

9.9.1 Expert Adapter Layer

Translates between game concepts and RAG queries:

type ExpertAdapter struct {
    ragService *RAGService
    gameEngine *GameEngine
}

func (a *ExpertAdapter) GetPlayerResponse(
    ctx context.Context,
    playerID PlayerID,
    situation Situation,
) (*PlayerResponse, error) {

    // Translate situation to query
    // Apply player-specific filters
    // Call RAG pipeline
    // Post-process for game context
}
        

9.9.2 Conversation Handler

Manages multi-turn interactions:

type ConversationHandler struct {
    adapter     *ExpertAdapter
    history     *ConversationHistory
    turnManager *TurnManager
}

func (h *ConversationHandler) HandleTurn(
    ctx context.Context,
    speaker PlayerID,
    listeners []PlayerID,
    content string,
) (*TurnResult, error) {

    // Generate speaker's contribution
    // Update all listeners' knowledge
    // Record for history
    // Trigger follow-up if needed
}
        

9.9.3 Event Processing

Game events trigger RAG queries:

func (e *GameEngine) ProcessMurderReveal(murder MurderEvent) {
    // Update game state
    e.State.RecordMurder(murder)

    // Generate reactions from each living player
    for _, player := range e.State.LivingPlayers() {
        reaction := e.adapter.GetReaction(player, murder)
        e.recordReaction(player, reaction)
    }

    // Update relationship states
    e.updateRelationshipsAfterMurder(murder)
}
        

9.10 Performance Considerations

9.10.1 Caching Strategy

Query Cache:

  • Cache frequent queries (player descriptions, common situations)
  • TTL-based expiration
  • Game-state-aware invalidation

Embedding Cache:

  • Pre-computed embeddings for all sections
  • Query embedding cached per session
  • Reduces API calls significantly

9.10.2 Parallel Processing

func (e *GameEngine) GenerateAllReactions(event Event) {
    var wg sync.WaitGroup
    reactions := make(chan PlayerReaction, len(e.State.LivingPlayers()))

    for _, player := range e.State.LivingPlayers() {
        wg.Add(1)
        go func(p PlayerID) {
            defer wg.Done()
            reaction := e.adapter.GetReaction(p, event)
            reactions <- reaction
        }(player)
    }

    wg.Wait()
    close(reactions)

    // Collect and process reactions
}
        

9.10.3 Token Budget Management

type TokenBudget struct {
    MaxTotal      int
    ContextLimit  int
    ResponseLimit int
}

func (b *PromptBuilder) BuildWithBudget(budget TokenBudget) (string, error) {
    // Prioritize essential components
    // Truncate context if needed
    // Reserve response space
}
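Context truncation under a budget can be sketched with a rough four-characters-per-token heuristic (an assumption for illustration, not the tokenizer's exact count), keeping whole sections rather than cutting mid-chunk, consistent with the section-based retrieval design:

```go
package main

// approxTokens is a rough heuristic: about 4 characters per token for
// English text. This is an estimate, not a real tokenizer count.
func approxTokens(s string) int { return (len(s) + 3) / 4 }

// FitContext keeps whole chunks, in priority order, until the context
// budget is exhausted; a chunk that would exceed the budget is dropped
// along with everything after it.
func FitContext(chunks []string, contextLimit int) []string {
	var kept []string
	used := 0
	for _, c := range chunks {
		t := approxTokens(c)
		if used+t > contextLimit {
			break
		}
		kept = append(kept, c)
		used += t
	}
	return kept
}
```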
        

9.11 Conclusion: Foundation for Social Simulation

The RAG architecture provides the foundation for authentic Traitors simulation:

  1. Knowledge Grounding: Responses rooted in retrieved content
  2. Personality Consistency: Expert configuration maintains character voice
  3. Information Boundaries: Role-appropriate knowledge access
  4. Game Integration: State-aware query and response processing
  5. Scalability: Parallel processing and caching for performance

Chapter 10 extends this foundation with the emotional and deception modelling that transforms factual retrieval into believable social behaviour. For advanced memory management techniques, see the Cognitive Memory Architecture chapter.