iprithv commented on code in PR #16145:
URL: https://github.com/apache/lucene/pull/16145#discussion_r3320693345


##########
lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java:
##########


Review Comment:
   clone() doesn't copy isRandom, so cloned inputs lose the random access hint. 
Since IndexInput is commonly cloned (e.g., OffHeapFloatVectorValues.copy() does 
slice.clone() for per-thread HNSW vector access), this can disable the intended 
optimization for RANDOM mode. 
   
   Earlier it wouldn't have mattered, but as you introduce isRandom I think it 
needs to be included in clone() as well.



##########
lucene/CHANGES.txt:
##########
@@ -141,6 +141,12 @@ Improvements
 
 Optimizations
 ---------------------
+* GITHUB#16145: Always fire prefetch when ReadAdvice is RANDOM in 
MemorySegmentIndexInput. The power-of-two throttle in
+  MemorySegmentIndexInput assumes "consecutive cache hits → file is warm", 
which does not hold for
+  random-access patterns where each request visits unpredictable pages. A 
negative sentinel in the
+  shared prefetch counter signals RANDOM mode with zero overhead on the 
non-RANDOM hot path.
+  (Michael Marshall)

Review Comment:
   this changelog entry doesn’t match the actual implementation. it talks about 
a “negative sentinel” approach, but the code uses a boolean isRandom flag 
instead. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to