Enterprise RAG Pipeline

A complete Retrieval-Augmented Generation pipeline built into the automation engine. Chunk documents, generate embeddings, store vectors, retrieve by semantic similarity, and generate grounded AI answers — all as schedulable, auditable, governed job tools.

No LangChain. No Python scripts. No glue code. Just configure and run.

The Problem with DIY RAG

Every RAG tutorial starts the same way: install LangChain, install ChromaDB, write a Python script, run it manually, hope it works. When it breaks at 3am, nobody knows. When the intern who wrote it leaves, nobody can fix it. When the auditor asks who accessed what data, nobody can answer.

Scripts Break Silently

A Python RAG script that fails at 3am sends no alert and logs no error; nobody notices until Monday. InTouch detects the failure instantly and alerts across every configured messaging channel (up to 8 on Department/Enterprise).

Credentials in .env Files

Pinecone API keys, OpenAI keys — sitting in plaintext .env files on someone's laptop. InTouch stores them in an AES-256 encrypted vault with RBAC access control.

No Governance

Who re-indexed the knowledge base? When? With what settings? A script has no audit trail. InTouch logs every execution with timestamps, parameters, token counts, and outcomes.

Pipeline Architecture

Three native tools that chain together in any multi-step job. Output from one step pipes directly to the next via published properties.

1. Document Chunker

Split documents into right-sized pieces for embedding. Embedding models have token limits (~8K) — a 50-page document won't fit. The chunker creates focused pieces so vector search returns precise matches, not entire documents.

6 strategies: tokens (with overlap), sentences, paragraphs, lines, pages, custom regex. Configurable chunk size, overlap, metadata, and max chunks.

Formats: text, markdown, HTML, CSV, JSON.
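
As a rough illustration of how overlap works, here is a minimal word-based sketch in plain Java. It is not the tool's actual implementation, which counts model tokens and supports the other strategies listed above:

  import java.util.ArrayList;
  import java.util.Arrays;
  import java.util.List;

  // Minimal sketch: fixed-size chunks with overlap, counted in words for brevity.
  public class ChunkSketch {
      static List<String> chunk(String text, int chunkSize, int overlap) {
          String[] words = text.split("\\s+");
          List<String> chunks = new ArrayList<>();
          for (int start = 0; start < words.length; start += chunkSize - overlap) {
              int end = Math.min(start + chunkSize, words.length);
              chunks.add(String.join(" ", Arrays.copyOfRange(words, start, end)));
              if (end == words.length) break;  // last chunk reached
          }
          return chunks;
      }

      public static void main(String[] args) {
          String doc = "one two three four five six seven eight nine ten";
          // 4-word chunks with a 1-word overlap keep context across chunk boundaries
          chunk(doc, 4, 1).forEach(System.out::println);
      }
  }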

2. Embeddings

Convert text into dense numerical vectors that capture semantic meaning. Texts with similar meaning produce vectors close together in vector space — "cancel my subscription" matches "stop my monthly plan" even though no words overlap.

Models: text-embedding-3-small (fast), text-embedding-3-large (accurate, cross-language). Compatible with any OpenAI-compatible API.

Alternative: Hugging Face feature-extraction for free embeddings.
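
For the curious, the underlying request to an OpenAI-compatible embeddings endpoint looks roughly like the sketch below. Inside InTouch the tool makes this call for you and reads the key from the encrypted vault; the environment variable here is purely illustrative.

  import java.net.URI;
  import java.net.http.HttpClient;
  import java.net.http.HttpRequest;
  import java.net.http.HttpResponse;

  // Sketch of a single embeddings request; batching, retries, and key handling
  // are what the built-in tool adds on top.
  public class EmbeddingSketch {
      public static void main(String[] args) throws Exception {
          String apiKey = System.getenv("OPENAI_API_KEY");  // vault-managed inside InTouch
          String body = "{\"model\": \"text-embedding-3-small\", \"input\": \"cancel my subscription\"}";

          HttpRequest request = HttpRequest.newBuilder()
                  .uri(URI.create("https://api.openai.com/v1/embeddings"))
                  .header("Authorization", "Bearer " + apiKey)
                  .header("Content-Type", "application/json")
                  .POST(HttpRequest.BodyPublishers.ofString(body))
                  .build();

          HttpResponse<String> response = HttpClient.newHttpClient()
                  .send(request, HttpResponse.BodyHandlers.ofString());
          // The response's data[0].embedding field holds a 1536-dimension vector
          System.out.println(response.body());
      }
  }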

3. Vector Store

Store, query, and manage vectors across 7 providers. Upsert embeddings with metadata, query by semantic similarity with top-K retrieval and metadata filtering, delete by ID or filter.

Operations: upsert, query, delete, list_indexes.

The local provider uses JSON files + cosine similarity. Zero dependencies. The entire RAG pipeline runs on a single machine with nothing installed beyond Java.
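
The similarity math behind that local provider is small enough to sketch; the provider itself adds JSON persistence, metadata filtering, and top-K ranking on top of this:

  // Cosine similarity: the score used to rank stored vectors against a query vector.
  public class CosineSketch {
      static double cosine(double[] a, double[] b) {
          double dot = 0, normA = 0, normB = 0;
          for (int i = 0; i < a.length; i++) {
              dot += a[i] * b[i];
              normA += a[i] * a[i];
              normB += b[i] * b[i];
          }
          return dot / (Math.sqrt(normA) * Math.sqrt(normB));
      }

      public static void main(String[] args) {
          double[] query = {0.2, 0.8, 0.1};
          double[] docA  = {0.1, 0.9, 0.0};  // similar direction, high score
          double[] docB  = {0.9, 0.1, 0.4};  // different direction, low score
          System.out.printf("docA: %.3f%n", cosine(query, docA));
          System.out.printf("docB: %.3f%n", cosine(query, docB));
      }
  }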

How It Chains

Ingestion: Document Chunker → Embeddings → Vector Store (upsert). Attach it to a trigger file and new documents are indexed automatically.
Retrieval: Embeddings (query) → Vector Store (query) → Any LLM (Claude, GPT, Gemini, Mistral, Groq, DeepSeek, xAI, Ollama) generates an answer grounded in your actual documents.
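
Conceptually, the retrieval chain ends by assembling a grounded prompt from the top-K chunks before the LLM step runs. The sketch below uses made-up snippets just to show the shape of that prompt:

  import java.util.List;

  // Sketch of grounded-prompt assembly: chunks returned by the vector store query
  // are placed in front of the question, so the LLM answers from retrieved text.
  public class GroundedPromptSketch {
      public static void main(String[] args) {
          String question = "How do I cancel my subscription?";
          // In a real run these come from the Vector Store query step (top-K = 2 here)
          List<String> topChunks = List.of(
              "Subscriptions can be cancelled from Billing > Plan > Cancel.",
              "Cancellations take effect at the end of the current billing period.");

          StringBuilder prompt = new StringBuilder("Answer using only the context below.\n\nContext:\n");
          for (String chunk : topChunks) {
              prompt.append("- ").append(chunk).append('\n');
          }
          prompt.append("\nQuestion: ").append(question);

          System.out.println(prompt);  // this string is what the LLM step receives
      }
  }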

7 Vector Store Providers

From zero-dependency local storage to enterprise-scale managed services. Same tool properties, same governance — change the provider by changing one field on the connection.

Local (File-Based)

JSON files + cosine similarity. Zero dependencies. Perfect for Personal edition, testing, and small datasets. No Docker, no external database.

Pinecone

Managed cloud vector DB. Production at scale with serverless infrastructure. API key authentication.

Chroma

Open-source, self-hosted. On-premise deployments and privacy-sensitive environments. Optional auth.

Qdrant

High-performance vector search engine. Large-scale similarity search with advanced filtering.

Weaviate

Multi-modal with GraphQL API. Complex queries, hybrid search, built-in vectorization.

pgvector

PostgreSQL extension. Teams already using PostgreSQL get vector search without new infrastructure.

Milvus

Distributed vector database. Massive-scale deployments with horizontal scaling.

What This Unlocks

AI Assistant Grounded in Your Docs

Load handbooks, runbooks, policies, and SOPs into the vector store. Every question the AI assistant receives — across all 8 messaging channels — gets answered with citations from your actual documents instead of hallucinations. A Slack question about PTO policy retrieves the exact HR paragraph.

Job Failure Diagnosis

Embed historical job error messages with metadata (job name, date, resolution). When a job fails, embed the error, query for similar past failures, and automatically notify subscribers with the diagnosis: "This looks like the credential expiration from March 3 — resolution was to rotate the API key."

Intelligent File Routing

Files land in a watched folder via trigger file. Instead of routing everything to a fixed job, chunk and embed the incoming file and query it against known document types. The system classifies the file semantically — sales CSV triggers the sales ETL, inventory CSV triggers the inventory ETL. Same folder, intelligent routing.

Compliance Audit Search

Embed audit trail summaries into a vector store. An auditor asking "show me all credential changes in Q1" gets semantic results — finding entries logged as "updated API key," "rotated token," and "changed password." Different words, same meaning. Keyword search would miss two of three.

Automation Impact Analysis

Embed descriptions of all jobs, schedules, and their purposes. Query "what jobs touch the CRM database?" to get all relevant automation — even jobs named differently or using different connection types. Essential for change management.

Multi-Language Knowledge Base

Models like text-embedding-3-large encode meaning across languages. A Spanish query retrieves English documents if the meaning matches. One vector index serves multilingual teams without translation overhead.

InTouch RAG vs. Script-Based RAG

Python + LangChain

  • Install LangChain, ChromaDB, write Python
  • External scheduler for orchestration
  • API keys in .env files
  • Custom logging or none
  • No access control
  • Rewrite code to change providers
  • No audit trail

InTouch RAG Pipeline

  • Configure tools in UI or YAML — no code
  • Built-in schedules, triggers, events
  • AES-256 encrypted credential vault
  • Full execution logs + multi-channel alerting (up to 8 channels)
  • RBAC per job, project, publisher
  • Change one field to swap providers
  • Timestamped execution history for every operation

Enterprise Governance on Every Step

Every tool in the RAG pipeline inherits InTouch's full governance framework — the same framework that governs SQL queries, FTP transfers, and AWS tool runs.

Scheduling

Automated re-indexing daily, hourly, or on any of 7 schedule types. Trigger file for event-driven ingestion.

RBAC

Control who can index documents, who can query, who can delete vectors. Project-level isolation.

Audit Trail

Every chunk, embedding, upsert, and query logged with timestamps, token counts, and outcomes.

Credentials

API keys for OpenAI, Pinecone, Qdrant — all in AES-256 encrypted vault. Never in .env files.

Alerting

Ingestion fails? Alert across every configured messaging channel instantly (up to 8). Query latency too high? Threshold alerts.

Multi-Provider

Swap embedding model or LLM without changing the pipeline. Test with OpenAI, deploy with Ollama.

Cost Tracking

Token usage published as properties per execution. Track embedding and LLM costs per job.

Free to Start

Personal edition includes the entire RAG pipeline with the local vector store. Zero cost to get started.

Build Your Knowledge Base Today

The Personal edition includes the complete RAG pipeline with the local vector store. No external dependencies, no API keys required for the vector store, no cost. Upgrade to cloud providers when you're ready to scale.

Explore All AI Features
Compare Editions