dsmiley commented on code in PR #4277:
URL: https://github.com/apache/solr/pull/4277#discussion_r3093610784


##########
changelog/unreleased/SOLR-18195-collapse-results-combined-query.yml:
##########
@@ -0,0 +1,9 @@
+# See https://github.com/apache/solr/blob/main/dev-docs/changelog.adoc
+title: Support for Collapse Results in Combined Query Component

Review Comment:
   If we're searching for this, we'll likely search for 
"CombinedQueryComponent" as that's its name -- a class.  Maybe also add (RRF) 
because that's the magical incantation to jog the memory of anyone reading this 
   ```suggestion
   title: Support for using {!collapse} with CombinedQueryComponent (RRF)
   ```



##########
solr/core/src/java/org/apache/solr/handler/component/CombinedQueryComponent.java:
##########
@@ -296,6 +317,10 @@ protected void mergeIds(ResponseBuilder rb, ShardRequest 
sreq) {
     long approximateTotalHits = 0;
     Map<String, List<ShardDoc>> shardDocMap = new HashMap<>();
     String[] queriesToCombineKeys = 
rb.req.getParams().getParams(CombinerParams.COMBINER_QUERY);
+    // Build per-shard set of doc IDs retained after collapse in simpleCombine.

Review Comment:
   That helps a lot!
   It may be strange to suggest this but I think simply _removing_ the comment 
words "retained after collapse in simpleCombine" would be slightly clearer, as 
that references stuff happening at a shard level which confused me a little as 
we're processing at the coordinator here.
   
   What I find confusing from your explanation is that a shard returns *both* 
`response_per_query` and the standard `response` key.  I will suggest a comment 
to put somewhere (not precisely at the line here).  Not sure if it's actually 
correct or useful but tell me:
   "This component receives *both* "response" and "response_per_query".  Only 
the former is deduplicated, and thus most (all?) processing of the latter must 
exclude docs not present in the former."
   I wonder if we'd be better off with only "response" and then an additional 
key to provide info on which queries matched.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to