Hi, in real-life I work a lot with json & fts search, here's a feature I've always wished I had, but never tackle it. Until yesterday that is.
SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body tsmatch "performance")'); This patch introduces a tsmatch boolean operator to the JSONPath engine. By integrating FTS natively into path expressions, this operator allows for high-precision filtering of nested JSONB structures— solving issues with structural ambiguity and query complexity. Currently, users must choose between two suboptimal paths for FTS-ing nested JSON: - Imprecise Global Indexing jsonb_to_tsvector aggregates text into a flat vector. This ignores JSON boundaries, leading to false positives when the same key (e.g., "body") appears in different contexts (e.g., a "Product Description" vs. a "Customer Review"). - Complex SQL Workarounds Achieving 100% precision requires unnesting the document via jsonb_array_elements and LATERAL joins. This leads to verbose SQL and high memory overhead from generating intermediate heap tuples. One of the most significant advantages of tsmatch is its ability to participate in multi-condition predicates within the same JSON object - something jsonb_to_tsvector cannot do. SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body tsmatch "performance")'); In a flat vector, the association between "Alice" and "performance" is lost. tsmatch preserves this link by evaluating the FTS predicate in-place during path traversal. While the SQL/JSON standard (ISO/IEC 9075-2) does not explicitly define an FTS operator, tsmatch is architecturally modeled after the standard-defined like_regex. The implementation follows the like_regex precedent: it is a non-indexable predicate that relies on GIN path-matching for pruning and heap re-checks for precision. Caching is scoped to the JsonPathExecContext, ensuring 'compile-once' efficiency per execution without violating the stability requirements of prepared statements. This initial implementation uses plainto_tsquery. However, the grammar is designed to support a "mode" flag (similar to like_regex flags) in future iterations to toggle between to_tsquery, websearch_to_tsquery, and phraseto_tsquery.
v1-0001-Add-tsmatch-JSONPath-operator-for-granular-Full-T.patch
Description: Binary data
tsmatch_bench.out
Description: Binary data
tsmatch_bench.sql
Description: Binary data
