krickert commented on PR #15676:
URL: https://github.com/apache/lucene/pull/15676#issuecomment-3930001585

   You’re right, I mixed objectives. I’ll focus on recall next, specifically 
recall vs `efSearch` across three scenarios:
   
   - single-shard baseline  
   - multi-shard independent  
   - multi-shard collaborative on the same shard graphs
   
   I’ll treat work/latency as secondary and keep them out of the main 
conclusion for now.
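To pin down what "recall vs `efSearch`" means in the three scenarios above, here is a minimal Python sketch of the metric. It is independent of Lucene: `recall_at_k` and `mean_recall` are hypothetical helper names, and the ID lists are assumed to come from whatever approximate search and brute-force ground truth the benchmark produces.

```python
from typing import Sequence


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int], k: int) -> float:
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(
    approx_results: Sequence[Sequence[int]],
    exact_results: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average recall@k over a batch of queries (one ID list per query)."""
    if len(approx_results) != len(exact_results):
        raise ValueError("need one ground-truth list per query")
    if not approx_results:
        return 0.0
    total = sum(recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results))
    return total / len(approx_results)
```

Running `mean_recall` once per `efSearch` value and per scenario would give three directly comparable curves, provided all three use the same ground truth.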
   
   Next I’ll test whether recall can be improved by adding shard-aware 
index-time context instead of relying on search-time coordination alone. I’ll 
prototype a lightweight global routing layer and cross-shard neighborhood 
metadata so shard traversal starts with better global priors.
   
   I think the core issue is that each shard currently builds and searches its 
own local ANN neighborhood frontier. A single shard can look strong, but once 
we merge across many shard-local frontiers, recall drops much more sharply than 
I expected, and that’s exactly why I think index-time global awareness can 
help. I'm reading through some papers in the meantime, and I'll test a few 
more scenarios.
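The merge step described in the paragraph above can be sketched as follows. This is an illustrative Python model, not Lucene's actual merge code: `merge_shard_hits` is a hypothetical name, and hits are assumed to be `(score, doc_id)` pairs with globally unique IDs where a higher score is better.

```python
import heapq
from typing import List, Tuple

Hit = Tuple[float, int]  # (score, global doc id); higher score is better


def merge_shard_hits(per_shard_hits: List[List[Hit]], k: int) -> List[Hit]:
    """Merge per-shard candidate lists into a single global top-k by score.

    The merge can only rank what the shards return: a true neighbor that its
    home shard's traversal missed is absent from the merged list as well.
    """
    all_hits = (hit for shard in per_shard_hits for hit in shard)
    return heapq.nlargest(k, all_hits)
```

Because the merge cannot recover candidates a shard never surfaced, any improvement in merged recall has to come from what each shard returns, which is the part that index-time context would target.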
   
   Thanks for being patient, by the way. I really want to push hard for 
making high-K search the norm.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

