Experimental

Knowledge Graph

Hybrid Vector + Graph Semantic Search

Ingest documents, extract typed entities and relationships, and retrieve with precision — combining vector similarity with SQL Server native graph traversal, all behind mandatory scope-based access control.

What It Does

The Knowledge Graph is a production-grade RAG (Retrieval-Augmented Generation) system built on FabrCore. It combines two retrieval strategies that are usually separate — vector-based semantic search and structured graph traversal — into a single hybrid search surface.

Ingest

Documents are chunked, embedded, and processed through LLM-driven entity extraction to build a typed knowledge graph.

Structure

Entities, relationships, domains, and categories form a navigable graph with provenance tracking back to source documents.

Retrieve

Four search modes — entity, chunk, relationship, and hybrid — return ranked results with scope enforcement and domain provenance.

This is not general-purpose search. It is designed for focused, high-accuracy retrieval over domain-specific data where precision matters more than breadth — compliance documents, engineering specs, policy libraries, operational procedures.

The Graph Structure

A SQL Server native graph with three node types, two edge types, and vector embeddings on every searchable surface.

Component | Type | Purpose
KnowledgeEntity | Graph Node | Concept nodes: people, organizations, processes, policies, equipment, technologies, events, and more. Each carries a vector embedding, scope key, and entity type.
KnowledgeChunk | Table | Content fragments with their own embeddings. One or more per entity. Enables fine-grained retrieval below the entity level.
KnowledgeRelationship | Graph Edge | Typed, weighted, directed connections between entities. 12 relationship types including RELATED_TO, PART_OF, CAUSES, DEPENDS_ON, USES, PRODUCES, and more.
KnowledgeDomain | Graph Node | Top-level knowledge domains with priority weights. Entities are classified into domains via categories.
KnowledgeCategory | Graph Node | Mid-level groupings within domains. Each category carries its own embedding for category-level search. Connected to domains and entities via BelongsTo edges.
SourceDocument | Table | Provenance tracking. Every ingested document is registered with status tracking (Pending → Processing → Completed/Failed) and linked to extracted entities.

Every search result includes its Domain > Category provenance path, so the consuming agent always knows where the information came from and how it was classified.
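
To make the shape of these components concrete, here is a minimal Python sketch of the taxonomy and the document status lifecycle. It is an illustration only: the class names, field names, and the `provenance_path` and `advance` helpers are assumptions made for this example, not the FabrCore schema or API.

```python
from dataclasses import dataclass
from enum import Enum


class DocumentStatus(Enum):
    PENDING = "Pending"
    PROCESSING = "Processing"
    COMPLETED = "Completed"
    FAILED = "Failed"


# Allowed moves, following Pending -> Processing -> Completed/Failed.
TRANSITIONS = {
    DocumentStatus.PENDING: {DocumentStatus.PROCESSING},
    DocumentStatus.PROCESSING: {DocumentStatus.COMPLETED, DocumentStatus.FAILED},
    DocumentStatus.COMPLETED: set(),
    DocumentStatus.FAILED: set(),
}


@dataclass
class Domain:
    name: str
    priority: float  # priority weight (hypothetical field name)


@dataclass
class Category:
    name: str
    domain: Domain  # stands in for a BelongsTo edge to the domain


@dataclass
class Entity:
    name: str
    entity_type: str
    scope_key: str
    category: Category  # stands in for a BelongsTo edge to the category


def provenance_path(entity: Entity) -> str:
    """Build the 'Domain > Category' path attached to a search result."""
    return f"{entity.category.domain.name} > {entity.category.name}"


def advance(current: DocumentStatus, target: DocumentStatus) -> DocumentStatus:
    """Move a source document to a new status, rejecting invalid jumps."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

For example, an entity in a hypothetical "Lockout Procedures" category under a "Safety" domain would carry the path "Safety > Lockout Procedures".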

Ingestion Pipeline

Documents flow through a multi-stage pipeline that transforms raw text into a structured, searchable knowledge graph.

  1. Document registration — source document recorded with status tracking for monitoring and error recovery.
  2. Content chunking — text split on paragraph and sentence boundaries with configurable overlap for context preservation at chunk edges.
  3. Embedding generation — each chunk and entity embedded as a 1536-dimension vector for semantic search.
  4. LLM entity extraction — chunks processed in batches to extract typed entities (people, organizations, concepts, equipment, policies, etc.) and inter-entity relationships.
  5. Taxonomy classification — extracted entities automatically assigned to Domain > Category hierarchy.
  6. Provenance linking — EXTRACTED_FROM relationships connect every entity back to its source document.

The pipeline handles batching automatically — large documents are split into manageable batches to stay within LLM context limits while maintaining extraction quality.
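
The chunking step (step 2) can be sketched roughly as follows. This is a simplified illustration, not the pipeline's actual implementation: the function names, the `max_chars` and `overlap_sentences` parameters, and the regex-based sentence splitting are all assumptions chosen for the example.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split on blank-line paragraph breaks, then on sentence-ending punctuation."""
    sentences = []
    for paragraph in re.split(r"\n\s*\n", text):
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                sentences.append(sentence)
    return sentences


def chunk_text(text: str, max_chars: int = 200, overlap_sentences: int = 1) -> list[str]:
    """Pack whole sentences into chunks, repeating trailing sentences as overlap."""
    chunks = []
    current: list[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the last sentences forward so context survives the chunk edge.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because sentences are never split, a chunk can slightly exceed `max_chars` when the carried-over overlap plus the next sentence is longer than the limit. A real pipeline would likely budget by tokens rather than characters.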

Four Search Modes

Different questions need different retrieval strategies. The Knowledge Graph supports four.

Entity Search

Vector similarity on knowledge entities. Returns ranked concept nodes with descriptions, types, and domain provenance. Best for finding specific concepts or facts.

Chunk Search

Vector similarity on content fragments. Fine-grained retrieval below the entity level, useful for finding specific passages or details within larger documents.

Relationship Traversal

Starting from a named entity, traverse typed edges using SQL MATCH queries. Multi-hop support (1–3 levels deep) with context scoring based on domain priority and relationship weights.

Hybrid Search

Two-phase combination: vector search discovers initial matches, then graph expansion traverses relationships from each result. Returns entities, chunks, and relationships in a single consolidated response.
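
The two phases can be sketched as below. This is a hedged illustration rather than the product's implementation: `hybrid_search`, the two search callables, the one-hop expansion over outgoing edges, and the keys of the returned dictionary are all assumptions made for the example.

```python
from typing import Callable

Hit = tuple[str, float]       # (entity name or chunk text, similarity)
Edge = tuple[str, str, str]   # (source, relationship type, target)


def hybrid_search(query: str,
                  search_entities: Callable[[str], list[Hit]],
                  search_chunks: Callable[[str], list[Hit]],
                  edges: list[Edge]) -> dict:
    """Run vector search first, then expand one hop from each entity hit."""
    # Phase 1: vector search discovers the initial matches.
    entity_hits = search_entities(query)
    chunk_hits = search_chunks(query)

    # Phase 2: graph expansion follows outgoing edges from each entity hit.
    seeds = {name for name, _score in entity_hits}
    relationships = [edge for edge in edges if edge[0] in seeds]
    expanded = sorted({target for _s, _r, target in relationships} - seeds)

    # One consolidated response containing all three result kinds.
    return {
        "entities": entity_hits,
        "expanded_entities": expanded,
        "chunks": chunk_hits,
        "relationships": relationships,
    }
```

Passing the vector searches in as callables keeps the sketch independent of any particular embedding model or database.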

Scope-Based Access Control

Access control is not optional. Every entity carries a scope key, every query requires a list of allowed scopes, and every relationship edge is validated at both endpoints.

  • Mandatory on every query. Search requests that omit scopes are rejected. The LLM never sees scope parameters — they come from agent configuration or authentication.
  • Ingestion-time assignment. Scope is set when a document is ingested. A document cannot be re-ingested into a second scope.
  • Edge validation. Relationship traversal validates scope at both endpoints. No cross-scope edge walking, even through intermediate nodes.
  • Multi-tenant without complexity. One physical database, multiple logical access boundaries. Scope is a WHERE filter, not a separate schema. Scales from single-domain to enterprise multi-tenancy without data duplication.

This design means sensitive data never leaks through graph traversal edges, even in complex multi-hop queries that cross domain boundaries.
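
The rules above can be illustrated with a small in-memory sketch. In the product, scope is described as a WHERE filter in the database, so this Python version is only an analogy. The function names and data shapes are assumptions for the example.

```python
Edge = tuple[str, str, str]  # (source, relationship type, target)


def require_scopes(allowed_scopes) -> frozenset:
    """Reject any request that omits scopes."""
    if not allowed_scopes:
        raise ValueError("search requests must include at least one allowed scope")
    return frozenset(allowed_scopes)


def visible_edges(entity_scopes: dict[str, str], edges: list[Edge],
                  allowed_scopes) -> list[Edge]:
    """Keep only edges whose source AND target are both in an allowed scope."""
    allowed = require_scopes(allowed_scopes)
    return [
        (source, rel_type, target)
        for source, rel_type, target in edges
        if entity_scopes.get(source) in allowed and entity_scopes.get(target) in allowed
    ]


def reachable(entity_scopes: dict[str, str], edges: list[Edge], start: str,
              allowed_scopes, max_depth: int = 3) -> set[str]:
    """Entities reachable from start without ever touching an out-of-scope node."""
    allowed = require_scopes(allowed_scopes)
    if entity_scopes.get(start) not in allowed:
        return set()
    usable = visible_edges(entity_scopes, edges, allowed)
    frontier, seen = {start}, {start}
    for _hop in range(max_depth):
        frontier = {target for source, _rel, target in usable if source in frontier} - seen
        seen |= frontier
    return seen - {start}
```

Because edges are filtered before the walk starts, an in-scope entity that is only connected through an out-of-scope intermediate node stays unreachable.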

Key Capabilities

LLM Entity Extraction

Automatic extraction of 10+ entity types and 12 relationship types from ingested documents. Entities are typed, described, and linked — not just raw text chunks.

Domain Intent Classification

Query-time classifier maps user questions to the best-matching knowledge domain using LLM understanding of domain descriptions, not just keyword matching.

SQL-Native Graph Traversal

Relationship queries use SQL Server’s native MATCH clause — traversal happens in the database engine, not in application code. Fast, indexed, and scope-enforced.

VECTOR(1536) Embeddings

SQL Server 2025 native vector type on entities, chunks, and categories. Cosine distance ranking computed in SQL for both entity-level and chunk-level search.
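
For readers unfamiliar with cosine distance ranking, here is a minimal pure-Python sketch of the calculation. It uses toy 3-dimensional vectors instead of 1536 dimensions and runs in application code purely for illustration, whereas the product computes the ranking in SQL. The function names are assumptions for this example.

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity: 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(query: list[float], items: dict[str, list[float]],
                     top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k item names with the smallest cosine distance to the query."""
    scored = [(name, cosine_distance(query, vector)) for name, vector in items.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]
```

The same ranking applies whether the embedded items are entities or chunks; only the set of vectors being compared changes.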

Taxonomy Without Lock-In

Domain > Category hierarchy is a shared taxonomy. Entities are auto-assigned during extraction. Because classification is stored as BelongsTo edges, the taxonomy can be reorganized without data migration.

Provenance Transparency

Every result includes its Domain > Category path and source document link. Agents always know where information came from and how it was classified.

Interested in the Knowledge Graph?

The Knowledge Graph is experimental and actively evolving. If you need focused, high-accuracy semantic search over domain-specific data, we’d love to talk about how it fits your use case.