[GitHub] [lucene] jainankitk commented on issue #12317: Option for disabling term dictionary compression

via GitHub Mon, 22 May 2023 11:54:07 -0700


jainankitk commented on issue #12317:
URL: https://github.com/apache/lucene/issues/12317#issuecomment-1557736895


   @gsmiller - Thank you for reviewing and for your comments.
   
   > it looks like you're primarily looking at an indexing-related performance 
issue and concerned with the memory usage during writing. Is that correct?
   
   I am looking at an issue with higher GC activity in recent versions (8.10+) 
compared to the previous version (7.x). It is not specific to indexing.
   
   
   > When you disabled the patch, did you notice query-time performance changes?
   
   I did not notice any degradation in performance. The index is small, so it 
fits in memory with or without compression.
   
   > Compression isn't only useful for saving disk space; it's useful for 
keeping index pages hot in the OS cache and getting better data locality, which 
translates to better query-time performance.
   
   I am not sure I understand this completely. As I understand it, a file is 
just an array of bytes, and the Lucene reader works with it directly. If we 
compress those bytes before storing them, the offsets into that array change 
and the reader can no longer use the array directly. So even if the compressed 
bytes stay hot in the OS cache, some intermediate logic has to decode 
(decompress) them, and the decompressed bytes have to be stored somewhere, 
whether in a byte buffer on the heap or in native memory. Although we decode 
only the blocks that the Lucene reader needs, we could have read those same 
blocks directly into native memory from an uncompressed file.
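
To make the comparison concrete, here is a minimal sketch in plain Java. It is
not Lucene code: `BlockReadSketch` and its methods are hypothetical names, and
`java.util.zip` DEFLATE stands in for whatever codec the term dictionary
actually uses. It only illustrates the extra buffer that the compressed path
needs.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only; none of these names exist in Lucene.
class BlockReadSketch {

    // Uncompressed case: return a view over the file bytes. No bytes are copied.
    static ByteBuffer readUncompressed(ByteBuffer file, int offset, int length) {
        ByteBuffer view = file.duplicate();
        view.limit(offset + length);
        view.position(offset);
        return view.slice();
    }

    // Compressed case: the block has to be decoded into a separate buffer
    // (allocated on the heap here) before the reader can use it.
    static byte[] readCompressed(byte[] compressedBlock, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressedBlock);
            byte[] decoded = new byte[originalLength];
            int n = 0;
            while (n < originalLength && !inflater.finished()) {
                int r = inflater.inflate(decoded, n, originalLength - n);
                if (r == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated or unsupported input
                }
                n += r;
            }
            if (n != originalLength) {
                throw new IllegalStateException("block decoded to " + n + " bytes");
            }
            return decoded;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block", e);
        } finally {
            inflater.end();
        }
    }

    // Helper that produces a compressed block for the demonstration.
    static byte[] compress(byte[] block) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(block);
            deflater.finish();
            byte[] out = new byte[block.length + 64];
            int n = 0;
            while (!deflater.finished()) {
                if (n == out.length) {
                    out = Arrays.copyOf(out, out.length * 2);
                }
                n += deflater.deflate(out, n, out.length - n);
            }
            return Arrays.copyOf(out, n);
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] terms = "apple apricot avocado banana blueberry"
                .getBytes(StandardCharsets.UTF_8);

        // Direct read: a zero-copy slice of the "file".
        ByteBuffer view = readUncompressed(ByteBuffer.wrap(terms), 6, 7);
        System.out.println("uncompressed view bytes: " + view.remaining());

        // Compressed read: a second array of the original size is allocated.
        byte[] decoded = readCompressed(compress(terms), terms.length);
        System.out.println("decoded copy bytes: " + decoded.length);
    }
}
```

The sketch shows only the allocation difference. It does not measure GC
pressure or page-cache behavior.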
   
   
   @jpountz Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

