[ 
https://issues.apache.org/jira/browse/SOLR-18227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SOLR-18227:
----------------------------------
    Labels: Solr pull-request-available  (was: Solr)

> Support named queries in Solr
> -----------------------------
>
>                 Key: SOLR-18227
>                 URL: https://issues.apache.org/jira/browse/SOLR-18227
>             Project: Solr
>          Issue Type: Improvement
>          Components: query parsers
>    Affects Versions: 10.1, 9.10.1
>            Reporter: Dmitry Tikhonov
>            Priority: Major
>              Labels: Solr, pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
>   Solr has no built-in way to answer "which query clauses actually matched 
> this document?" for a given result set. This is a common need in relevance 
> debugging, A/B testing pipelines, and rules-based boosting: 
>   you want to know not just that document 42 scored 3.7, but that it matched 
> the "brand_exact" clause and the "recency_boost" clause, while not matching 
> the "in_stock" clause.
>   Lucene has provided the NamedMatches API since Lucene 8 precisely for this 
> purpose, but Solr has never exposed it.
>
>   *Proposed Solution*
>   1. *_name* *local-param on query parsers*: add a _name local param to a 
> focused set of widely used query parsers. When present, the parser wraps its 
> result with NamedMatches.wrapQuery(name, query), so the name travels with 
> the query through scoring and can be recovered after the search:
>
> {code:java}
> q={!bool _name=all_books 
> should='{!term _name=fantasy f=cat}fantasy'
> should='{!term _name=scifi   f=cat}scifi'}       
> {code}
>      Supported parsers: term, terms, bool, lucene, prefix, dismax, edismax, 
> fuzzy.
>   2. *MatchedQueriesComponent*: a new SearchComponent, activated by 
> matched_queries=true (alias mq=true), that performs a lightweight second pass 
> over the top-N hits using Weight.matches(). It reports which named clauses 
> matched, both per document and as an aggregate summary:
>
> {code:java}
> "matched_queries_per_hit": {
>   "1": ["all_books", "fantasy"],
>   "5": ["all_books", "scifi"]
> },
> "matched_queries_summary": {
>   "all_books": { "count": 7, "docIds": ["1","2","3","4","5","6","7"] },
>   "fantasy":   { "count": 4, "docIds": ["1","2","3","4"] },
>   "scifi":     { "count": 3, "docIds": ["5","6","7"] }
> }
> {code}
>   *Implementation Notes*
>   - The second pass uses *Weight.matches(LeafReaderContext, docId)*, the 
> same API used by highlighters. It performs per-document posting-list seeks 
> over the top-N result set only, not a full re-scan of the index.
>   - *ScoreMode.COMPLETE_NO_SCORES* is used for the matches weight, allowing 
> Lucene to skip score computation entirely.
>   - Null safety for localParams: every parser checks localParams != null 
> before reading _name, so the feature is inert when a parser is used as a 
> defType default (where localParams is null).
>   - *MatchedQueriesComponent* must be registered in solrconfig.xml and added 
> to a request handler's component chain. It is a no-op unless 
> matched_queries=true is present on the request.
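
The relationship between the two response sections described above can be illustrated with a small sketch. It is not taken from the patch: the class name MatchedQueriesSummary and the method summarize are hypothetical, and the input is only the two sample hits shown in the per-hit example. It assumes the summary is the inversion of the per-hit map (clause name to the document ids that matched it), with the count being the size of each id list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the proposed patch.
public class MatchedQueriesSummary {

    // Inverts docId -> [clause names] into clause name -> [docIds].
    // Insertion order is preserved so the output follows the order of the hits.
    static Map<String, List<String>> summarize(Map<String, List<String>> perHit) {
        Map<String, List<String>> summary = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> hit : perHit.entrySet()) {
            for (String name : hit.getValue()) {
                summary.computeIfAbsent(name, k -> new ArrayList<>()).add(hit.getKey());
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        // Sample data mirroring the "matched_queries_per_hit" example.
        Map<String, List<String>> perHit = new LinkedHashMap<>();
        perHit.put("1", List.of("all_books", "fantasy"));
        perHit.put("5", List.of("all_books", "scifi"));

        // Print each named clause with its count and the ids that matched it.
        summarize(perHit).forEach((name, docIds) ->
            System.out.println(name + ": count=" + docIds.size() + ", docIds=" + docIds));
    }
}
```

A clause that matched no hit in the top-N window simply never appears in the resulting map, which is consistent with the summary listing only clauses that matched at least one document.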



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
