jainankitk opened a new issue, #12317:
URL: https://github.com/apache/lucene/issues/12317

   ### Description
   
   While working on a customer issue, I noticed that memory allocations for the 
recently added [term dictionary 
compression](https://github.com/apache/lucene-solr/commit/33a7af9cbfb9f668b4aee433906ee93d55e1e709)
 are significant. After disabling the compression with a patch, I observed a 
noticeable reduction in memory allocation.
   
   Generally the cost of storage is significantly lower than that of memory/CPU, 
so compression can be worthwhile once a segment/index is being archived. But 
during live data ingestion, when segments merge frequently, the cost of 
compression/decompression is paid more than once.
   
   Wondering about a couple of things here:
   
   * Should we expose an option to disable term dictionary compression?
   * Does it make sense to initialize the HighCompressionHashTable lazily in 
TermsWriter, since some code paths (non-compression) never end up using it?
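
   To illustrate the second point, here is a minimal sketch of the 
lazy-initialization idea. `HighCompressionHashTable` and `TermsWriter` are the 
Lucene classes mentioned above, but this is a self-contained stand-in: the 
class, field, and method names below are illustrative, not the actual Lucene 
API.

   ```java
   // Sketch: defer allocating an expensive helper until a code path needs it.
   class TermsWriterSketch {
       // Stand-in for LZ4.HighCompressionHashTable: pays a large allocation
       // cost up front in its constructor.
       static final class HighCompressionHashTable {
           final int[] table = new int[1 << 15];
       }

       private HighCompressionHashTable hashTable; // null until first use

       // Allocate the table only when a compression path actually asks for it.
       private HighCompressionHashTable getHashTable() {
           if (hashTable == null) {
               hashTable = new HighCompressionHashTable();
           }
           return hashTable;
       }

       boolean isAllocated() {
           return hashTable != null;
       }

       void writeCompressed(byte[] terms) {
           HighCompressionHashTable t = getHashTable();
           // ... compression would use t here ...
       }

       void writeUncompressed(byte[] terms) {
           // Non-compression path never touches the hash table,
           // so it never triggers the allocation.
       }
   }
   ```

   With this shape, a writer that only ever takes the non-compression path 
never pays for the hash table at all.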
   
   For context, the customer workload is running on an instance with 32G of 
memory, 16G of which is allocated to the heap. Attaching the memory allocation 
profile below:
   
   ![Screenshot 2023-05-19 at 5 27 53 
PM](https://github.com/apache/lucene/assets/8193480/bfa964b4-76d4-4903-89a7-7164f993ea3e)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
