How to Map State-Specific FOIA Exemptions to Python Dictionaries

This task sits inside FOIA Request Taxonomy Design: once a request is classified, the system must look up which statutory exemptions apply, and that lookup is only as trustworthy as the dictionary behind it. Here you build that dictionary as a validated, versioned compliance artifact rather than a hand-edited literal.

Scenario & Compliance Stakes

A records team automating responses across several states cannot hardcode exemptions as a flat {"code": "exempt"} literal. A single record segment routinely triggers overlapping carve-outs — personnel privacy, an open law-enforcement investigation, critical-infrastructure security, and inter-agency deliberative process — and each state words and numbers those exemptions differently. When that nuance is flattened into one lookup table, jurisdictional drift creeps in: a citation typo, a repealed subsection still marked active, or a legislative amendment that silently changes scope. Each of those defects propagates straight into the Security Boundary Configuration layer and the redaction engine, where it becomes either an unlawful disclosure or an over-withholding that fails a public-interest balancing test.

The statutory stakes are concrete. State open-records acts impose short, hard response windows, some as tight as or tighter than the federal 20-business-day clock under 5 U.S.C. § 552(a)(6)(A)(i), so an automated decision must resolve the correct exemption deterministically and on the first pass. A wrong answer is not a cosmetic bug: releasing material an investigative-records exemption protects can compromise an active case, while withholding a record no active statute covers is an appealable denial. Treating the exemption map as versioned, validated configuration is what makes every disclosure decision defensible on audit.

Prerequisites

Python 3.11+: for modern typing and datetime.date comparisons; enum.StrEnum is also available as an alternative to the str, Enum mixin used below.
pydantic v2.5+: declarative validation, frozen models, and model_validator(mode="after") for cross-field checks.
A source manifest in JSON or YAML: one file per jurisdiction, holding the raw exemption entries (statute reference, scope, dates, weight, boundary). The loader below reads JSON, so a YAML manifest must be parsed to a dictionary first. Keep these files under version control alongside the code.
Write access to a structured audit log sink: stdout JSON in development, forwarded to your SIEM or append-only store in production.
A controlled vocabulary for exemption_category: reuse the category labels established in your request taxonomy so categories line up with how requests are classified, not invented per state.
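
One way to pin that vocabulary is a small enum that manifest category keys are checked against at load time. This is a minimal sketch: the ExemptionCategory members and the unknown_categories helper are hypothetical names, not part of the module below, and the labels from your own request taxonomy would replace them.

```python
from enum import Enum


class ExemptionCategory(str, Enum):
    # Hypothetical labels; substitute the ones your request taxonomy defines.
    PERSONNEL_PRIVACY = "personnel_privacy"
    LAW_ENFORCEMENT = "law_enforcement"
    CRITICAL_INFRASTRUCTURE = "critical_infrastructure"
    DELIBERATIVE_PROCESS = "deliberative_process"


def unknown_categories(manifest_categories):
    """Return, sorted, any category keys that are not in the controlled vocabulary."""
    allowed = {c.value for c in ExemptionCategory}
    return sorted(set(manifest_categories) - allowed)


# A per-state invention such as "tx_police_files" is flagged instead of accepted.
print(unknown_categories(["personnel_privacy", "tx_police_files"]))
```

Rejecting a manifest whenever this list is non-empty keeps category names aligned across jurisdictions.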

Architecture Overview

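State exemption statutes are multidimensional, so the dictionary is nested three levels deep (jurisdiction_code -> exemption_category -> exemption_rule), and every terminal rule carries the immutable metadata an automated decision needs. In the schema below, each jurisdiction is one validated manifest, and the rules inside a category are keyed by a rule id.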

Each terminal rule must encode its statutory citation and legislative version, the applicability conditions tied to record type and originating agency, redaction thresholds and partial-disclosure allowances, a security-boundary classification, and the temporal validity window. Enforcing strict typing at ingestion means a malformed entry is rejected before it can ever reach the request-evaluation queue.
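
Before any validation is layered on, the shape described above can be sketched as a plain nested dictionary. The rule id, category label, dates, and weight here are illustrative placeholders rather than a verified statutory map, and lookup_rule is a hypothetical helper.

```python
# jurisdiction_code -> exemption_category -> rule_id -> rule metadata.
# All values are illustrative placeholders, not a verified statutory map.
EXEMPTION_MAP = {
    "TX": {
        "law_enforcement": {
            "le-open-investigation": {
                "statute_ref": "§ 552.108",
                "scope": "open investigation",
                "security_boundary": "exempt",
                "precedence_weight": 9,
                "applicable_agencies": [],
                "effective_date": "2019-01-01",
                "sunset_date": None,
                "partial_disclosure_allowed": False,
            },
        },
    },
}


def lookup_rule(jurisdiction, category, rule_id):
    """Walk the nested keys; return None rather than raising on a missing key."""
    return EXEMPTION_MAP.get(jurisdiction, {}).get(category, {}).get(rule_id)


rule = lookup_rule("TX", "law_enforcement", "le-open-investigation")
print(rule["statute_ref"], rule["security_boundary"])
```

A literal like this is only a picture of the structure; nothing in it stops a typo or a missing date, which is the gap the typed schema closes.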

Implementation

The module below loads a jurisdiction manifest, validates it against a strict schema, and resolves overlapping exemptions deterministically. It uses pydantic v2 for declarative validation and structured logging for audit traceability; the numbered comments mark the compliance-critical lines.

python

import json
import logging
from datetime import date
from enum import Enum
from typing import Dict, List, Optional
from pydantic import BaseModel, Field, field_validator, model_validator, ConfigDict

# Structured audit logging: every exemption decision must be reconstructable on appeal.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)s | %(name)s | %(message)s",
)
logger = logging.getLogger("exemption_mapper")


class SecurityBoundary(str, Enum):
    PUBLIC = "public"
    RESTRICTED = "restricted"
    CONFIDENTIAL = "confidential"
    EXEMPT = "exempt"


class ExemptionRule(BaseModel):
    # (1) frozen + extra="forbid": block runtime mutation and silent schema drift.
    model_config = ConfigDict(frozen=True, extra="forbid")

    statute_ref: str
    scope: str
    security_boundary: SecurityBoundary
    retention_override_months: Optional[int] = Field(default=None, ge=0)
    precedence_weight: int = Field(ge=1, le=10)        # (2) higher weight = stronger exemption
    applicable_agencies: List[str] = Field(default_factory=list)
    effective_date: date
    sunset_date: Optional[date] = None
    partial_disclosure_allowed: bool = False

    @field_validator("statute_ref")
    @classmethod
    def validate_citation_format(cls, v: str) -> str:
        # (3) Reject unparseable citations so record scanning never silently misses a match.
        if not v.startswith(("§", "Sec.", "Art.", "Code")):
            raise ValueError("Statute citation must begin with §, Sec., Art., or Code")
        return v

    @model_validator(mode="after")
    def validate_temporal_window(self):
        # (4) A sunset on/before the effective date would create a permanently-inactive rule.
        if self.sunset_date and self.sunset_date <= self.effective_date:
            raise ValueError("sunset_date must be strictly after effective_date")
        return self


class JurisdictionExemptions(BaseModel):
    model_config = ConfigDict(frozen=True, extra="forbid")
    jurisdiction_code: str
    schema_version: str                                 # (5) pin to legislative amendment tracking
    last_amended: date
    exemptions: Dict[str, Dict[str, ExemptionRule]]     # category -> rule_id -> rule


def load_exemption_manifest(filepath: str) -> JurisdictionExemptions:
    """Ingest a manifest, validate against the strict schema, return an immutable config."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            raw_data = json.load(f)
        validated = JurisdictionExemptions.model_validate(raw_data)
        logger.info(
            "manifest_loaded jurisdiction=%s schema_version=%s",
            validated.jurisdiction_code, validated.schema_version,
        )
        return validated
    except Exception as e:
        # (6) Fail closed: a bad manifest must halt ingestion, never default to "disclose".
        logger.error("manifest_validation_failed error=%s", str(e))
        raise RuntimeError("Compliance-critical schema validation failed. Halting ingestion.") from e


def resolve_exemption_conflict(
    candidate_rules: List[ExemptionRule],
    evaluation_date: Optional[date] = None,
) -> Optional[ExemptionRule]:
    """Deterministically pick the winning exemption when several apply to one segment."""
    if not candidate_rules:
        return None

    today = evaluation_date or date.today()
    # (7) Temporal filter FIRST: exclude repealed or not-yet-effective statutes.
    active_rules = [
        r for r in candidate_rules
        if r.effective_date <= today and (r.sunset_date is None or r.sunset_date > today)
    ]
    if not active_rules:
        logger.warning("no_active_exemption evaluation_date=%s", today)
        return None

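    # (8) Tie-break order: EXEMPT > CONFIDENTIAL > RESTRICTED > PUBLIC.
    boundary_order = {
        SecurityBoundary.EXEMPT: 4,
        SecurityBoundary.CONFIDENTIAL: 3,
        SecurityBoundary.RESTRICTED: 2,
        SecurityBoundary.PUBLIC: 1,
    }
    # statute_ref is the last key so a weight-and-boundary tie resolves by citation,
    # not by the order in which candidate_rules happened to arrive.
    winner = max(
        active_rules,
        key=lambda r: (
            r.precedence_weight,
            boundary_order.get(r.security_boundary, 0),
            r.statute_ref,
        ),
    )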
    logger.info(
        "exemption_resolved statute=%s weight=%s boundary=%s partial=%s",
        winner.statute_ref, winner.precedence_weight,
        winner.security_boundary.value, winner.partial_disclosure_allowed,
    )
    return winner

The resolver runs the temporal filter before precedence sorting. Reversing that order is an easy mistake, and it lets a repealed statute win on weight alone.

When partial_disclosure_allowed is True on the winning rule, the engine triggers redaction rather than full withholding, and any retention_override_months feeds straight into Records Retention Scheduling so an exempt record is not destroyed while still under obligation.
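
The hand-off described above might look like the following. It is a sketch only: Disposition, plan_disposition, and the returned dictionary keys are hypothetical, and the winning rule is reduced to a plain mapping holding just the two fields this step reads. Treating "no winner" as a release is an assumption drawn from the point that withholding without an active statute is an appealable denial.

```python
from enum import Enum
from typing import Optional


class Disposition(str, Enum):
    RELEASE = "release"
    REDACT = "redact"
    WITHHOLD = "withhold"


def plan_disposition(winner: Optional[dict]) -> dict:
    """Translate the resolver's result into an action for downstream layers."""
    if winner is None:
        # No active exemption applies, so nothing justifies withholding.
        return {"action": Disposition.RELEASE, "retention_override_months": None}
    action = (
        Disposition.REDACT
        if winner.get("partial_disclosure_allowed", False)
        else Disposition.WITHHOLD
    )
    # Pass the override through untouched; the retention scheduler owns the date math.
    return {
        "action": action,
        "retention_override_months": winner.get("retention_override_months"),
    }


plan = plan_disposition({"partial_disclosure_allowed": True, "retention_override_months": 36})
print(plan["action"].value, plan["retention_override_months"])  # redact 36
```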

Expected Output & Verification

Loading a valid Texas manifest and resolving two overlapping rules produces a clean, greppable audit trail (timestamps will differ):

text

2026-06-27 09:14:02,118 | INFO | exemption_mapper | manifest_loaded jurisdiction=TX schema_version=2026.1
2026-06-27 09:14:02,121 | INFO | exemption_mapper | exemption_resolved statute=§ 552.108 weight=9 boundary=exempt partial=False

Assert the contract in a unit test so a precedence inversion fails CI rather than production:

python

def test_higher_weight_wins_within_active_window():
    privacy = ExemptionRule(
        statute_ref="§ 552.101", scope="personnel", security_boundary="confidential",
        precedence_weight=6, effective_date=date(2020, 1, 1),
    )
    law_enforcement = ExemptionRule(
        statute_ref="§ 552.108", scope="open investigation", security_boundary="exempt",
        precedence_weight=9, effective_date=date(2019, 1, 1),
    )
    winner = resolve_exemption_conflict([privacy, law_enforcement], date(2026, 6, 27))
    assert winner.statute_ref == "§ 552.108"   # stronger weight + EXEMPT boundary

A repealed statute (a sunset_date in the past) must drop out before sorting; feed one in and assert resolve_exemption_conflict returns the next-strongest active rule, never the lapsed one. For full coverage, replay a corpus of historical request payloads through the resolver in a dry-run mode and diff expected against actual outcomes before any production deploy.
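
The lapsed-statute check above could be pinned down with a test like this one. To stay dependency-free, the sketch uses a small Rule dataclass and a resolve function that mirror the filter-then-sort logic of resolve_exemption_conflict, with the boundary tie-break left out. Both names, and the § 552.999 citation, are stand-ins invented for illustration; a real suite would call the module's own model and resolver.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Rule:
    statute_ref: str
    precedence_weight: int
    effective_date: date
    sunset_date: Optional[date] = None


def resolve(rules: List[Rule], today: date) -> Optional[Rule]:
    # Temporal filter first, then precedence, as in the resolver above.
    active = [
        r for r in rules
        if r.effective_date <= today and (r.sunset_date is None or r.sunset_date > today)
    ]
    return max(active, key=lambda r: r.precedence_weight) if active else None


def test_lapsed_rule_never_wins():
    lapsed = Rule("§ 552.999", 10, date(2015, 1, 1), sunset_date=date(2024, 1, 1))
    current = Rule("§ 552.108", 9, date(2019, 1, 1))
    winner = resolve([lapsed, current], date(2026, 6, 27))
    assert winner is current  # the heavier but repealed rule is filtered out
    # With only the lapsed rule supplied, nothing is active.
    assert resolve([lapsed], date(2026, 6, 27)) is None


test_lapsed_rule_never_wins()
```

Giving the lapsed rule the highest weight is deliberate: the test only passes if the date filter runs before the sort.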

Common Pitfalls

Sorting before the temporal filter. If you sort by precedence_weight first and only then check dates, a repealed-but-high-weight statute can win and withhold a record that is now public. Always filter on effective_date/sunset_date before the precedence sort — the order in the resolver above is load-bearing.
Citation-format false negatives. The same statute appears as § 552.108, Sec. 552.108, and Code § 552.108 across drafting styles. Without normalization at ingestion, record scanning misses matches and a protected record leaks. Normalize citations through a regex preprocessor or lookup table before validation, and keep the statute_ref prefix check as a backstop.
Mutable manifests. Editing a loaded dictionary at runtime — or accepting unknown keys — lets schema drift slip in unreviewed. frozen=True plus extra="forbid" force every change through a new schema_version and a formal approval, which is exactly what keeps the audit trail intact when a legislature amends an exemption mid-year.
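
The citation-normalization step mentioned above could be sketched with a single regular expression. The canonical form "§ <number>" and the normalize_citation name are assumptions made for this example, and the pattern covers only numeric section citations; article-style references such as "Art." would need their own rule or a lookup table.

```python
import re

# Optional drafting prefix ("Code", "Sec.", "Section"), optional "§", then the number
# with any parenthesized subsections, e.g. 552.108(b)(1).
_CITATION = re.compile(
    r"^\s*(?:Code\s+)?(?:Sec(?:tion|\.)?\s*)?§?\s*"
    r"(?P<num>\d+(?:\.\d+)*(?:\([0-9A-Za-z]+\))*)\s*$"
)


def normalize_citation(raw: str) -> str:
    """Collapse common drafting variants to the canonical form '§ <number>'."""
    match = _CITATION.match(raw)
    if match is None:
        # Fail closed: an unrecognized citation is an error, not a pass-through.
        raise ValueError(f"Unrecognized citation format: {raw!r}")
    return f"§ {match.group('num')}"


for variant in ("§ 552.108", "Sec. 552.108", "Code § 552.108"):
    print(normalize_citation(variant))  # each prints: § 552.108
```

Running this before model validation means the stored statute_ref always starts with "§", so the prefix check in the validator acts purely as a backstop.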

FAQ

Why a three-tier nested dictionary instead of one flat lookup table?

Because a single record segment can trigger several exemptions at once, and a flat table cannot represent that without losing statutory lineage. Nesting jurisdiction_code -> exemption_category -> exemption_rule keeps each rule’s citation, dates, and boundary attached to it, so the resolver can compare all candidates for a segment and pick a winner deterministically. A flat {code: "exempt"} map forces a last-write-wins collision the moment two carve-outs overlap, which is precisely the case that produces an unlawful disclosure or denial.
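
The collision is easy to reproduce. In this small illustration the segment id "seg-17" is made up, and the two citations are reused from the examples above purely as labels.

```python
from collections import defaultdict

# Two exemptions detected on the same record segment.
hits = [("seg-17", "§ 552.101"), ("seg-17", "§ 552.108")]

flat = {}
for segment, statute in hits:
    flat[segment] = statute  # the second write silently replaces the first

candidates = defaultdict(list)
for segment, statute in hits:
    candidates[segment].append(statute)  # every overlapping rule is retained

print(flat["seg-17"])        # § 552.108
print(candidates["seg-17"])  # ['§ 552.101', '§ 552.108']
```

Only the second structure leaves the resolver something to compare; the flat map has already discarded one of the candidates before any precedence logic runs.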

How do I handle a legislative amendment without breaking historical decisions?

Treat the manifest as immutable and versioned. Increment schema_version, add the amended rule with its new effective_date, and set a sunset_date on the superseded rule rather than deleting it. Because the resolver filters by date at evaluation time, a request adjudicated under last year’s law still resolves against the rule that was active then, while new requests pick up the amended rule. Keeping repealed rules in place is what makes a past decision reproducible on appeal.
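
The amendment workflow above can be sketched on plain dictionaries. Everything named here is hypothetical (supersede_rule, active_rule_ids, the rule ids, and the version strings), and the manifest is trimmed to the fields the example needs; the real artifact would be re-validated through the schema after the change.

```python
import copy
from datetime import date


def supersede_rule(manifest, category, old_id, new_id, new_rule, new_version):
    """Return a NEW manifest: old rule kept with a sunset, amended rule added."""
    amended = copy.deepcopy(manifest)  # never mutate the loaded manifest
    rules = amended["exemptions"][category]
    # The old rule stops applying on the day the amended rule takes effect.
    rules[old_id]["sunset_date"] = new_rule["effective_date"]
    rules[new_id] = new_rule
    amended["schema_version"] = new_version
    return amended


def active_rule_ids(manifest, category, on):
    """Rule ids in force on a given date (the sunset date itself is excluded)."""
    out = []
    for rule_id, r in manifest["exemptions"][category].items():
        starts = date.fromisoformat(r["effective_date"])
        ends = date.fromisoformat(r["sunset_date"]) if r["sunset_date"] else None
        if starts <= on and (ends is None or ends > on):
            out.append(rule_id)
    return out


v1 = {
    "schema_version": "2025.1",
    "exemptions": {"law_enforcement": {
        "le-v1": {"effective_date": "2019-01-01", "sunset_date": None},
    }},
}
v2 = supersede_rule(
    v1, "law_enforcement", "le-v1", "le-v2",
    {"effective_date": "2026-01-01", "sunset_date": None}, "2026.1",
)
print(active_rule_ids(v2, "law_enforcement", date(2025, 6, 1)))  # ['le-v1']
print(active_rule_ids(v2, "law_enforcement", date(2026, 6, 1)))  # ['le-v2']
```

Setting the old rule's sunset equal to the new rule's effective date leaves no day on which neither or both versions apply, given the exclusive sunset comparison used by the resolver.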

What should the loader do when a manifest fails validation?

Fail closed and halt ingestion. The load_exemption_manifest function logs the error and raises rather than returning a partial or empty config, because the only safe default for a compliance system is to refuse to operate on an untrusted exemption map, never to fall back to “disclose”. Strict schema validation is what rejects a manifest that drops a required field at load rather than mid-decision; pinning schema_version ties each accepted manifest to a reviewed revision.

Related

FOIA Request Taxonomy Design: the parent classification model these exemption maps plug into.
Security Boundary Configuration: where a resolved exemption’s boundary is enforced at every routing hop.
State Law Compliance Frameworks: jurisdiction-specific statutory rules that feed these manifests.
Records Retention Scheduling: consumes retention_override_months so exempt records survive their obligation window.
Core Architecture & Compliance Mapping: the architecture layer this dictionary belongs to.
