Re: [PR] Support madvise for MmapMemory [pinot]

via GitHub Wed, 31 Jul 2024 11:06:04 -0700


dinoocch commented on PR #13721:
URL: https://github.com/apache/pinot/pull/13721#issuecomment-2261087393


   > But we can't change the default without a wide spectrum of consequences 
and I'd discourage that. Though it's obviously good to have this feature and 
make it configurable.
   
   Good point. Let's keep the current behavior for now to avoid surprises.
   
   > But they dropped this in Lucene 9 (AFAICT) and they are now using 
MemorySegment.
   
   Seems like they wrote some blog posts regarding this:
   
   https://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html
   
https://www.elastic.co/search-labs/blog/lucene-and-java-moving-forward-together
   
   The new version of Lucene uses the Panama APIs, which offer a lot of 
interesting potential once support for Java < 21 is dropped -- 
https://github.com/apache/lucene/blob/main/lucene/core/src/java21/org/apache/lucene/store/PosixNativeAccess.java
   
   > Logically, a reasonably high read ahead should be quite useful in most 
cases.
   
   From my limited understanding, read-ahead is extremely useful on storage 
that benefits from large read operations (for example NFS) or, more practically, 
on managed disks like those in 
[Azure](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types), 
where the maximum throughput can only be reached by batching I/O 
operations into a single read.
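   
   To make the batching argument concrete, here is a rough sketch of how an 
IOPS cap and a throughput cap interact. It is a simplified model, not 
anything measured: the class name, the method, and both cap values are 
hypothetical and are not taken from the Azure documentation.

```java
public class ReadBatching {
    /**
     * Simplified model: a disk with both an IOPS cap and a throughput cap
     * delivers min(iopsCap * requestSize, throughputCap).
     * Ignores bursting, queue depth, and latency.
     */
    static double effectiveMBps(long requestBytes, long iopsCap, double throughputCapMBps) {
        double requestMB = requestBytes / (1024.0 * 1024.0);
        double limitedByIops = iopsCap * requestMB;
        return Math.min(limitedByIops, throughputCapMBps);
    }

    public static void main(String[] args) {
        // Hypothetical caps, chosen only for illustration.
        long iopsCap = 5_000;
        double throughputCapMBps = 200.0;

        // Page-sized reads vs. reads batched by read-ahead.
        long[] requestSizes = {4L << 10, 16L << 10, 128L << 10, 1L << 20};
        for (long size : requestSizes) {
            System.out.printf("%8d-byte reads -> %.2f MB/s%n",
                    size, effectiveMBps(size, iopsCap, throughputCapMBps));
        }
    }
}
```

   Under these assumed caps, page-sized reads are limited by IOPS long before 
the throughput cap is reached, which is the case where read-ahead helps.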
   
   In an ideal world there is a lot we could potentially do (though the page 
cache and mmap currently impose a lot of limitations on us). Some examples:
   
   * Smartly madvise buffers based on their size -- "medium"-sized indexes 
that use binary search might benefit from NORMAL, while very large or very small 
such indexes would likely prefer RANDOM (I would guess)
   * I think WILLNEED could be useful for starting async reads of pages, 
preemptively reducing the chance of page faults
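   
   As a rough sketch of the size-based idea in the first bullet (not what the 
PR implements): the class, the enum, and both thresholds below are 
hypothetical and would need benchmarking. Only the numeric codes are real; 
they match the Linux MADV_* constants.

```java
public class MadvisePolicy {
    /** Advice values; the numeric codes match the Linux MADV_* constants. */
    enum Advice {
        NORMAL(0), RANDOM(1), SEQUENTIAL(2), WILLNEED(3);

        final int code;

        Advice(int code) {
            this.code = code;
        }
    }

    // Hypothetical cut-offs for what counts as a "medium" index.
    static final long SMALL_BYTES = 1L << 20; // 1 MiB
    static final long LARGE_BYTES = 1L << 30; // 1 GiB

    /** Guess an advice for a binary-searched index of the given size. */
    static Advice adviceFor(long sizeBytes) {
        if (sizeBytes < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        // Very small or very large buffers: skip read-ahead.
        if (sizeBytes < SMALL_BYTES || sizeBytes > LARGE_BYTES) {
            return Advice.RANDOM;
        }
        // Medium buffers: keep the kernel's default read-ahead.
        return Advice.NORMAL;
    }

    public static void main(String[] args) {
        long[] sizes = {64L << 10, 256L << 20, 8L << 30};
        for (long size : sizes) {
            System.out.println(size + " bytes -> " + adviceFor(size));
        }
    }
}
```

   The selected `code` would then be passed to whatever native madvise binding 
is in use; that call is left out because it depends on the binding.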
   
   Honestly, I am a bit more interested in whether we would benefit more from 
direct I/O and managing the cache internally, though...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

