Field note · 6 min read · Agentify

Data Access Control for RAG Systems — Row-Level Security and Beyond

Your RAG system just answered an intern's question using the board's confidential M&A documents.

Not a hypothetical. I've seen it happen.

When you centralise enterprise knowledge into a vector database and let an LLM retrieve from it, you've effectively created a system that bypasses every access control your organisation spent years building. OWASP flags this as Sensitive Information Disclosure — one of the top 10 security risks in LLM applications.

The thing is, this is a solved problem. There are well-established patterns for it. Most teams just don't implement them because they're racing to ship the demo and figure "we'll add security later."

Let's walk through the patterns.


Why RAG breaks your existing access controls

In a traditional app, access control is baked into the UI layer. A sales rep sees their accounts. An HR manager sees their team. The database enforces permissions, the app respects them. Everyone's happy.

RAG breaks this in three ways:

Ingestion strips permissions. When you chunk a PDF, embed it, and store it in Pinecone, the original permissions from SharePoint or Google Drive don't travel with the vectors. They're gone.

Semantic search ignores boundaries. Cosine similarity returns the most relevant chunks — regardless of department, team, or classification level.

The LLM connects dots across permission walls. Even if no single chunk is confidential, an LLM can combine fragments to infer something no individual user should know.

As Cerbos puts it, the challenge is giving the LLM enough context without violating authorisation policies: companies need to ensure their AI agents can't expose sensitive data to unauthorised users.

So how do you fix it?


Pattern 1: Metadata filtering (the 80% solution)

The simplest approach, and honestly sufficient for most teams getting started.

When you ingest documents, attach metadata — department, classification level, role, owner. At query time, filter the vector search to only return chunks the current user can see.

AWS demonstrates this with Bedrock Knowledge Bases: you tag documents with access attributes before ingestion, and the system only returns results matching the user's role during retrieval.
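On the ingestion side, the tagging step might look something like this. It's a minimal sketch: the field names (`department`, `access_level`, `allowed_roles`) are illustrative, and the record shape should match whatever your embedding pipeline and vector store expect.

```python
import hashlib

def build_chunk_record(chunk_text, department, classification, allowed_roles):
    """Attach access metadata to a chunk before embedding and upserting.

    Sorting allowed_roles keeps the record deterministic, which helps
    when diffing re-ingested content against what is already indexed.
    """
    return {
        # Content-derived ID so re-ingesting the same chunk is idempotent
        "id": hashlib.sha256(chunk_text.encode("utf-8")).hexdigest()[:16],
        "text": chunk_text,
        "metadata": {
            "department": department,
            "access_level": classification,
            "allowed_roles": sorted(allowed_roles),
        },
    }

record = build_chunk_record(
    "Q3 pipeline review notes...",
    department="sales",
    classification="internal",
    allowed_roles={"sales_rep", "sales_manager"},
)
```

The metadata attached here is exactly what the query-time filter matches against, so the two sides have to agree on field names.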

Here's what it looks like in practice:

def query_with_access_control(user_query, user_roles, user_department):
    # A chunk is visible if it is public, belongs to the user's
    # department, or lists one of the user's roles. The filter syntax
    # shown is Pinecone/Mongo-style; adapt it to your vector store.
    metadata_filter = {
        "$or": [
            {"access_level": "public"},
            {"department": user_department},
            {"allowed_roles": {"$in": user_roles}}
        ]
    }
    # embed() and vector_db are stand-ins for your embedding model
    # and vector store client.
    results = vector_db.query(
        query_embedding=embed(user_query),
        filter=metadata_filter,
        top_k=10
    )
    return results

Works great when: You have clear department/role boundaries and a manageable number of access groups.

Breaks down when: You have thousands of users with individual document permissions, or permissions change frequently and metadata goes stale.


Pattern 2: Row-level security with pgvector

If you're running PostgreSQL with pgvector (and honestly, for most Indian mid-market companies, this is the right starting point), you can use Postgres's built-in Row-Level Security.

Supabase's docs show this clearly: because pgvector sits on Postgres, you can restrict which documents come back from a vector similarity search based on who's asking. RLS policies evaluate the current user against a permissions table.

-- Policies have no effect until RLS is enabled on the table
ALTER TABLE document_sections ENABLE ROW LEVEL SECURITY;

CREATE POLICY "Users can view own document sections"
ON document_sections FOR SELECT
USING (
    document_id IN (
        SELECT document_id FROM document_permissions
        WHERE user_id = auth.uid()
    )
);

The beauty here is that permissions are enforced at the database engine level. Your application code literally cannot bypass them. Permission changes are instant — update the permissions table and the next query respects them, no re-indexing needed.

The downside: this only works with pgvector. If you're on Pinecone, Weaviate, or Qdrant, you'll need a different approach.

Milvus has a similar pattern using bitmap-indexed array columns where you store permission data directly alongside the vectors.


Pattern 3: External authorisation service

For enterprise deployments where permission logic gets hairy — inherited permissions, team hierarchies, cross-department sharing — you want a dedicated authorisation service.

Pinecone's guide demonstrates this using SpiceDB (the open-source auth database inspired by Google's Zanzibar). There are two approaches:

Pre-filter: Before searching, ask the auth service "what can this user see?" and pass that list as a filter to the vector search.

Post-filter: Search without filters, then check each result against the auth service before passing it to the LLM.

The choice depends on your hit rate. Pinecone says: if most documents are accessible to most users, post-filter is fine. If you have a massive corpus and each user can only see a fraction, pre-filter is more efficient.
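The two shapes can be sketched like this. `lookup_resources` and `check` are hypothetical stand-ins for your auth client's API (a SpiceDB or Cerbos client would have its own method names), and `vector_db.query` for your vector store:

```python
def pre_filter_search(vector_db, auth, user_id, query_embedding, top_k=10):
    # Ask the auth service up front for everything the user can see,
    # then constrain the vector search to that set.
    visible_ids = auth.lookup_resources(user_id)  # hypothetical API
    return vector_db.query(
        query_embedding,
        filter={"document_id": {"$in": list(visible_ids)}},
        top_k=top_k,
    )

def post_filter_search(vector_db, auth, user_id, query_embedding, top_k=10):
    # Search unfiltered, over-fetch, then drop anything the user is
    # not allowed to see before it reaches the LLM.
    candidates = vector_db.query(query_embedding, filter=None, top_k=top_k * 3)
    return [
        c for c in candidates
        if auth.check(user_id, "view", c["document_id"])  # hypothetical API
    ][:top_k]
```

Note the over-fetch in the post-filter variant: if you ask for exactly `top_k` and half get filtered out, the LLM sees a thin context.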

Auth0's implementation with Okta FGA follows the same idea — model permissions as relationships (user → viewer → document) and evaluate them at retrieval time.

The big win here is separation of concerns. Your RAG pipeline doesn't need to know how permissions work. It just asks the auth service "can this user see this document?" and gets a yes or no.


Pattern 4: Per-chunk ACL hashes

This is the paranoid option, and I mean that as a compliment.

The idea: embed access control directly into each chunk so permissions travel with the content, even after reindexing.

Petronella's enterprise RAG playbook recommends this approach for compliance-heavy deployments: use metadata tags for department, geo, sensitivity, and legal hold. For multi-tenant setups, embed tenant IDs in both metadata and index namespaces. Where feasible, include per-chunk ACL hashes so entitlements travel with content even after reindexing.

You compute a hash of each chunk's permission set, store it as metadata, and filter on it at query time. When permissions change in the source system, you recompute the hash — if it's different, re-index that chunk.
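A minimal sketch of that hash computation. The principal naming (`user:`, `group:` prefixes) is illustrative; the important part is canonicalising the set before hashing:

```python
import hashlib

def acl_hash(principals):
    """Stable hash of a chunk's permission set.

    Sorting first means the same set of principals always yields the
    same hash, regardless of the order the source system returns them.
    """
    canonical = "|".join(sorted(principals))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Same set, different order -> same hash, so a permissions sync only
# re-indexes chunks whose entitlements actually changed.
old = acl_hash({"user:priya", "group:finance"})
new = acl_hash({"group:finance", "user:priya"})
needs_reindex = old != new  # False
```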

More work to maintain, but it eliminates the class of bugs where your auth service says one thing and your vector DB metadata says another.


Don't forget: PII in the response

Access control at retrieval is necessary but not sufficient. Two risks remain even with perfect permissions:

The LLM leaks PII. Your agent retrieves an authorised document that contains someone's phone number. The user didn't ask for it, but the LLM helpfully includes it in the response anyway. AWS's Bedrock implementation handles this with guardrails that detect and mask PII in the generated output.

Inference across chunks. A user can't see the salary document, but they can see a hiring approval ("approved for Level 7") and a compensation structure document (Level 7 = ₹X). The LLM connects the dots. This is harder to solve — you need to think about which chunk combinations are sensitive, not just individual chunks.

What to do: Run PII detection (Microsoft Presidio or custom models) on both retrieved context and generated responses. For Indian data, make sure you're catching Aadhaar numbers, PAN, phone numbers, and UPI IDs.
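As a rough illustration of what the detector is looking for, here are regex-level patterns for the Indian identifiers mentioned above. These are deliberately naive: a production system should use a dedicated PII engine like Presidio with checksum validation (Aadhaar uses a Verhoeff check digit that a regex can't verify), not bare regexes.

```python
import re

# Illustrative patterns only, not production-grade detection
PII_PATTERNS = {
    "aadhaar": re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b"),   # 12 digits, often 4-4-4
    "pan": re.compile(r"\b[A-Z]{5}\d{4}[A-Z]\b"),          # e.g. ABCDE1234F
    "phone_in": re.compile(r"\b[6-9]\d{9}\b"),             # Indian mobile numbers
}

def redact_pii(text):
    """Replace each match with a typed placeholder like [PAN]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text
```

Run this (or the real detector) on both the retrieved context before it reaches the LLM and the generated response before it reaches the user.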


What I'd recommend for most Indian companies

If you're getting started:

  1. Start with metadata filtering. Covers 80% of use cases with 20% of the complexity. Tag chunks with department, classification, and role.

  2. Add PII detection on input and output. Catch Aadhaar, PAN, phone numbers, and names before they reach the user.

  3. Plan for external auth. As your RAG system scales to more data sources and user types, you'll outgrow metadata filtering. SpiceDB or Cerbos are good options.

  4. Log everything. Every query, every retrieved chunk, every response. You'll need this for DPDP compliance, and it's invaluable for debugging.
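For step 4, a simple append-only JSON-lines log is enough to start. This is a sketch, and the field set is an assumption about what a DPDP access review would want; note it logs chunk IDs rather than chunk text, so the audit log doesn't itself become a sensitive data store:

```python
import json
import time
import uuid

def audit_log(log_file, user_id, query, retrieved_chunks, response):
    """Append one JSON line per RAG interaction."""
    entry = {
        "event_id": str(uuid.uuid4()),
        "ts": time.time(),
        "user_id": user_id,
        "query": query,
        # IDs only -- resolve against the vector store during an audit
        "chunk_ids": [c["id"] for c in retrieved_chunks],
        "response_chars": len(response),
    }
    log_file.write(json.dumps(entry) + "\n")
    return entry
```

Whether to log the raw query text is itself a privacy decision; queries can contain PII too, so consider running the same redaction over them first.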

Don't ship a RAG system to production without access control. The demo doesn't need it. Production does.