iprithv commented on code in PR #16145: URL: https://github.com/apache/lucene/pull/16145#discussion_r3320693345
########## lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java: ########## Review Comment: clone() doesn't copy isRandom, so cloned inputs lose the random access hint. Since IndexInput is commonly cloned (e.g., OffHeapFloatVectorValues.copy() does slice.clone() for per-thread HNSW vector access), this can disable the intended optimization for RANDOM mode. Earlier it wouldn't have mattered, but as you introduce isRandom I think it needs to be included in clone() as well. ########## lucene/CHANGES.txt: ########## @@ -141,6 +141,12 @@ Improvements Optimizations --------------------- +* GITHUB#16145: Always fire prefetch when ReadAdvice is RANDOM in MemorySegmentIndexInput. The power-of-two throttle in + MemorySegmentIndexInput assumes "consecutive cache hits → file is warm", which does not hold for + random-access patterns where each request visits unpredictable pages. A negative sentinel in the + shared prefetch counter signals RANDOM mode with zero overhead on the non-RANDOM hot path. + (Michael Marshall) Review Comment: this changelog entry doesn’t match the actual implementation. it talks about a “negative sentinel” approach, but the code uses a boolean isRandom flag instead. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
