kaivalnp commented on PR #12590:
URL: https://github.com/apache/lucene/pull/12590#issuecomment-1738933241

   > My main concern is that we are adding yet another extension point for 
"partial control" when we already have that with the rewrite or something even 
more complex with the collector
   
   While we technically can do so, copy-pasting 
[`#rewrite`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L64-L93)
 seems very repetitive to me because we would also have to copy-paste 
[`#sequentialSearch`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L95-L102),
 
[`#parallelSearch`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L104-L112),
 
[`#searchLeaf`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L114-L122),
 
[`#getLeafResults`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L124-L154),
 
[`#createBitSet`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L156-L172),
 [`
 
#createRewrittenQuery`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L219-L234)
 as well as 
[`DocAndScoreQuery`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L297-L452)
 (almost the entire file, all of which are `private`) to do so..
   
   Doing this via the collector has an overhead of finding out the index-level 
`topK` from multiple segment-level results *twice* (which is done 
[here](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L88)
 anyways)
   
   > It would be very easy to do the wrong thing by allowing sub-classes to 
override this method for yet another avenue of customization
   
   We also seem to have 
[`#exactSearch`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L180-L181)
 as `protected` only for checking its execution from tests (that could have the 
same issue).. We could maybe add some javadocs to 
[`#createRewrittenQuery`](https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L219-L234),
 so that users know when to override it / do so more carefully?
   
   > It seems to me we have enough extension points no?
   
   I feel that giving access to the final `topK` results to implementers is a 
good extension point to have, allowing them to post-process the final results 
or even rewrite into some custom `Query` *if needed* (in most cases, just 
delegate to the parent function)
   
   Also open to other suggestions (like maybe a specific `postProcess` function 
that receives the `topK` and returns the results after reading / modifying, so 
that the API is more specific)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to