The Invisible Wall Blocking Your FEFO (First Expired, First Out) Strategy
FEFO (First Expired, First Out) only works when your systems recognize one product as one product. In CPG networks, the same SKU can arrive from plants, co-packers, and distributors under different item codes—creating phantom inventory, missed rotation, and expired stock. The result is predictable: retailer rejections, chargebacks, higher freight, and write-offs. Stabilize product identity with governed data and high-volume entity resolution.

Gandhinath Swaminathan
Feb 10 · 5 min read


Orchestration of Identity: Turning Algorithms into a Well-tuned Arrangement
Your data thinks Coca-Cola Zero Sugar 12oz and Coke Zero 12 Pack are different products. Healthcare systems can't tell if two patient records refer to the same person. Banks miss money laundering patterns hidden in ownership networks. The algorithms exist—BM25, HNSW, SPLADE, Graph Transformers—but knowing when to use them is the hard part. This framework shows you how to sequence matching algorithms into production entity resolution systems tailored to your domain's risk prof

Gandhinath Swaminathan
Jan 26 · 8 min read


Why Probabilistic Record Linkage Still Matters
Probabilistic record linkage still matters because identity data is messy and match decisions carry real financial and compliance risk. This article explains the intuition behind Fellegi–Sunter and Bayesian record linkage, shows how they control false merges and splits across noisy customer and product records, and points to modern tools and books that help you put these ideas into practice.

Gandhinath Swaminathan
Jan 22 · 5 min read


Heterogeneous Knowledge Graphs: Multi-Hop Reasoning Beyond Pairwise Matching
Pairwise matching treats each comparison as a one-off. A persistent knowledge graph turns product mentions, manufacturers, model numbers, attributes, and price bins into typed nodes and relations. Matching becomes neighborhood comparison: multi-hop paths (convergent evidence) can beat any single similarity score.

Gandhinath Swaminathan
Jan 22 · 7 min read


From Inverted Index to Attention Graph: Turning SPLADE Tokens Into ER Decisions
False entity merges don’t just dirty data. They distort inventory, pricing, and forecasts, and with them every model and report built on top. Learned sparse retrieval improves recall, but it can still treat records like unordered tokens. This post adds token-to-token attention as a structural check so near-duplicates pass and lookalikes fail, with a trail you can audit.

Gandhinath Swaminathan
Jan 21 · 3 min read


When “Almost” Isn’t Good Enough: Why Top Engineers Still Rely On BM25
BM25 looks old on paper, but it still decides which records are worth comparing when identifiers can’t afford to be “almost” right. This post walks through the TF‑IDF roots of BM25, how k1 and b shape the scoring curve, and why Lucene, Elasticsearch, and OpenSearch still rely on it. You’ll see how term statistics, not embeddings, keep product codes, SKUs, and customer records anchored during entity resolution.

Gandhinath Swaminathan
Jan 8 · 5 min read