[
https://issues.apache.org/jira/browse/SOLR-18227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated SOLR-18227:
----------------------------------
Labels: Solr pull-request-available (was: Solr)
> Support named queries in Solr
> -----------------------------
>
> Key: SOLR-18227
> URL: https://issues.apache.org/jira/browse/SOLR-18227
> Project: Solr
> Issue Type: Improvement
> Components: query parsers
> Affects Versions: 10.1, 9.10.1
> Reporter: Dmitry Tikhonov
> Priority: Major
> Labels: Solr, pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Solr has no built-in way to answer "which query clauses actually matched
> this document?" for a given result set. This is a common need in relevance
> debugging, A/B testing pipelines, and rules-based boosting:
> you want to know not just that document 42 scored 3.7, but that it matched
> the "brand_exact" clause and the "recency_boost" clause, while not matching
> the "in_stock" clause.
> Lucene has provided the NamedMatches API since Lucene 8 precisely for this
> purpose, but Solr has never exposed it.
>
> *Proposed Solution*
> 1. *_name local param on query parsers*: add a _name local param to a
> focused set of widely-used query parsers. When present, the parser wraps its
> result with NamedMatches.wrapQuery(name, query) so the name
> travels with the query through scoring and can be recovered post-search:
>
>
> {code:java}
> q={!bool _name=all_books
> should='{!term _name=fantasy f=cat}fantasy'
> should='{!term _name=scifi f=cat}scifi'}
> {code}
> Supported parsers: term, terms, bool, lucene, prefix, dismax, edismax,
> fuzzy.
> 2. *MatchedQueriesComponent*: a new SearchComponent, activated by
> matched_queries=true (alias mq=true), that performs a lightweight second pass
> over the top-N hits using Weight.matches(). It reports which named clauses
> matched each hit, plus an aggregate summary per clause:
>
>
> {code:json}
> "matched_queries_per_hit": {
> "1": ["all_books", "fantasy"],
> "5": ["all_books", "scifi"]
> },
> "matched_queries_summary": {
> "all_books": { "count": 7, "docIds": ["1","2","3","4","5","6","7"] },
> "fantasy": { "count": 4, "docIds": ["1","2","3","4"] },
> "scifi": { "count": 3, "docIds": ["5","6","7"] }
> }
> {code}
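
For readers wondering how the summary relates to the per-hit section, here is a minimal sketch that derives one from the other. It uses only the JDK; the class and method names (MatchedQueriesSummary, ClauseSummary, summarize) are made up for illustration and are not taken from the patch, which may well build both structures in a single pass.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MatchedQueriesSummary {

    /** Hypothetical holder for one named clause: the doc ids it matched. */
    public static final class ClauseSummary {
        public final List<String> docIds = new ArrayList<>();

        public int count() {
            return docIds.size();
        }
    }

    /**
     * Inverts "doc id -> matched clause names" into
     * "clause name -> doc ids", preserving first-seen order.
     */
    public static Map<String, ClauseSummary> summarize(Map<String, List<String>> perHit) {
        Map<String, ClauseSummary> summary = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> hit : perHit.entrySet()) {
            for (String clauseName : hit.getValue()) {
                summary.computeIfAbsent(clauseName, k -> new ClauseSummary())
                       .docIds.add(hit.getKey());
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        // Same shape as the two hits shown in matched_queries_per_hit.
        Map<String, List<String>> perHit = new LinkedHashMap<>();
        perHit.put("1", List.of("all_books", "fantasy"));
        perHit.put("5", List.of("all_books", "scifi"));

        summarize(perHit).forEach((name, s) ->
            System.out.println(name + " count=" + s.count() + " docIds=" + s.docIds));
    }
}
```

With only the two hits listed above as input, all_books gets a count of 2 while fantasy and scifi each get 1; the larger counts in the quoted response come from the full set of seven hits.
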
> *Implementation Notes*
> - The second pass uses *Weight.matches(LeafReaderContext, docId)* — the
> same API used by highlighters. It performs per-document posting-list seeks
> over the top-N result set only, not a full re-scan of the index.
> - *ScoreMode.COMPLETE_NO_SCORES* is used for the matches weight, allowing
> Lucene to skip score computation entirely.
>
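
One detail worth spelling out for the second pass: Weight.matches(LeafReaderContext, int) expects a segment-relative doc id, whereas the top-N hits carry index-wide doc ids. Lucene's ReaderUtil.subIndex and LeafReaderContext.docBase cover this in real code. The JDK-only sketch below illustrates just the arithmetic; LeafDocResolver and the docBases array are hypothetical stand-ins, and it assumes the doc bases are strictly increasing and start at 0.

```java
import java.util.Arrays;

public class LeafDocResolver {

    /**
     * Returns the index of the segment that contains globalDoc.
     * docBases holds the first index-wide doc id of each segment,
     * sorted ascending (assumption: strictly increasing, first is 0).
     */
    public static int leafIndex(int[] docBases, int globalDoc) {
        int pos = Arrays.binarySearch(docBases, globalDoc);
        // Exact hit: globalDoc is the first document of that segment.
        // Miss: binarySearch returns -(insertionPoint) - 1, and the
        // owning segment is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    /** Converts an index-wide doc id to a segment-relative doc id. */
    public static int leafDoc(int[] docBases, int globalDoc) {
        return globalDoc - docBases[leafIndex(docBases, globalDoc)];
    }

    public static void main(String[] args) {
        int[] docBases = {0, 100, 250}; // three segments, made-up sizes
        int globalDoc = 137;
        System.out.println("leaf=" + leafIndex(docBases, globalDoc)
            + " leafDoc=" + leafDoc(docBases, globalDoc));
    }
}
```

Sorting the top-N hits by doc id before this step lets each segment be visited once, which keeps the per-document seeks sequential.
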
> - *localParams null-safety*: all parsers check localParams != null before
> reading _name, so the feature is inert when a parser is used as a defType
> default (where localParams is null).
> - *MatchedQueriesComponent* must be registered in solrconfig.xml and added
> to a request handler's component chain. It is a no-op unless
> matched_queries=true is present on the request.
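
The null-safety note can be sketched as a small guard. This is an illustration under stated assumptions, not the patch's code: a plain Map stands in for Solr's local params, the wrapper function stands in for NamedMatches.wrapQuery(name, query), and treating an empty _name the same as a missing one is my assumption rather than something the issue specifies.

```java
import java.util.Map;
import java.util.function.BiFunction;

public class NamedQueryGuard {

    static final String NAME_PARAM = "_name";

    /**
     * Returns the query unchanged unless a non-empty _name local param
     * is present, in which case the wrapper is applied.
     *
     * localParams may be null (parser used as the defType default).
     */
    public static <Q> Q maybeWrap(Map<String, String> localParams,
                                  Q query,
                                  BiFunction<String, Q, Q> wrapper) {
        if (localParams == null) {
            return query; // no local params at all: feature is inert
        }
        String name = localParams.get(NAME_PARAM);
        if (name == null || name.isEmpty()) {
            return query; // assumption: blank names are ignored
        }
        return wrapper.apply(name, query);
    }

    public static void main(String[] args) {
        // Strings stand in for Query objects to keep the sketch JDK-only.
        BiFunction<String, String, String> wrap =
            (name, q) -> "named(" + name + ", " + q + ")";

        System.out.println(maybeWrap(null, "cat:fantasy", wrap));
        System.out.println(maybeWrap(Map.of("f", "cat"), "cat:fantasy", wrap));
        System.out.println(maybeWrap(Map.of("_name", "fantasy"), "cat:fantasy", wrap));
    }
}
```

Centralizing the check in one helper would keep the eight listed parsers from each repeating the same null test.
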