jainankitk opened a new issue, #12317: URL: https://github.com/apache/lucene/issues/12317
### Description

While working on a customer issue, I noticed that memory allocations for the recently added [term dictionary compression](https://github.com/apache/lucene-solr/commit/33a7af9cbfb9f668b4aee433906ee93d55e1e709) are significant. After disabling the compression with a patch, I observed some reduction in memory allocation. Generally, the cost of storage is significantly lower than that of memory/CPU, so compression can be useful once a segment/index is being archived. But during live data ingestion, when segments merge frequently, the cost of compression/decompression is paid more than once. I am wondering about a couple of things here:

* Should we expose an option to disable term dictionary compression?
* Does it make sense to initialize the HighCompressionHashTable lazily in TermsWriter? Some code paths (non-compression) don't end up using it.

For context, the customer workload is running on an instance with 32G of memory, of which 16G is allocated for the heap. Attaching the memory allocation profile below:
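The lazy-initialization idea in the second bullet could look roughly like this. This is a minimal sketch, not Lucene's actual BlockTree terms writer code; the class names `CompressionTable` and `TermsWriterSketch` are illustrative stand-ins. The point is that an expensive up-front allocation (like `LZ4.HighCompressionHashTable`'s internal arrays) is deferred until a block actually goes through the compression path:

```java
// Sketch of lazy initialization: the expensive hash table is only
// allocated the first time a code path actually compresses a block.
// CompressionTable is a stand-in for LZ4.HighCompressionHashTable.
public class TermsWriterSketch {

    // Expensive to allocate, so we want to avoid it on non-compressed paths.
    static final class CompressionTable {
        final int[] table = new int[1 << 15]; // sizable up-front allocation
    }

    private CompressionTable compressionTable; // null until first use

    private CompressionTable getCompressionTable() {
        if (compressionTable == null) {
            compressionTable = new CompressionTable(); // pay the cost only here
        }
        return compressionTable;
    }

    boolean tableAllocated() {
        return compressionTable != null;
    }

    void writeBlock(byte[] data, boolean compress) {
        if (compress) {
            CompressionTable t = getCompressionTable(); // lazily allocated
            // ... compress data using t ...
        }
        // ... write the (possibly uncompressed) block ...
    }

    public static void main(String[] args) {
        TermsWriterSketch w = new TermsWriterSketch();
        w.writeBlock(new byte[16], false);
        if (w.tableAllocated()) throw new AssertionError("allocated too early");
        w.writeBlock(new byte[16], true);
        if (!w.tableAllocated()) throw new AssertionError("not allocated on compress");
        System.out.println("ok");
    }
}
```

With this shape, an index that never takes the compression path (for example, if a disable option were added) would never pay for the table at all.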
