richardstartin commented on issue #12078: URL: https://github.com/apache/pinot/issues/12078#issuecomment-1836789089
I would recommend against using String.intern; see an authoritative source [here](https://shipilev.net/jvm/anatomy-quarks/10-string-intern/), which recommends manual interning instead. Depending on which GC you’re running (G1, Shenandoah, Z), you may be able to get the GC to deduplicate the backing data with -XX:+UseStringDeduplication. Assuming that data in dictionaries gets quite old, you can tune it with -XX:StringDeduplicationAgeThreshold=n, where n defaults to 3 collections. You can check it’s working properly with -XX:+PrintStringDeduplicationStatistics (a JDK 8 flag; on JDK 9 and later the statistics are reported through unified logging, -Xlog, instead). This solution has the benefit of requiring no code changes whose efficacy and ramifications depend on the data.
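
For reference, the manual interning mentioned above could be sketched roughly as follows. This is a minimal illustration, assuming a plain `ConcurrentHashMap` as the pool; the class name `ManualInterner` and its methods are hypothetical and are not part of Pinot or taken from the linked article verbatim.

```java
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical helper (not a Pinot class): a minimal manual interner
 * that keeps one canonical instance per distinct string value.
 */
final class ManualInterner {
    // Maps each distinct value to the first instance seen with that value.
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();

    /** Returns the canonical instance equal to s, registering s if it is new. */
    String intern(String s) {
        if (s == null) {
            return null; // ConcurrentHashMap does not accept null keys
        }
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    /** Number of distinct values currently held by the pool. */
    int size() {
        return pool.size();
    }
}
```

Unlike String.intern, the pool here is an ordinary heap object: it can be sized, inspected, and discarded together with whatever owns it. The trade-off is that every entry stays strongly reachable for as long as the interner itself does.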
