Misha Dmitriev created HIVE-16166:
-------------------------------------
Summary: HS2 may still waste up to 15% of memory on duplicate strings
Key: HIVE-16166
URL: https://issues.apache.org/jira/browse/HIVE-16166
Project: Hive
Issue Type: Improvement
Reporter: Misha Dmitriev
Assignee: Misha Dmitriev

A heap dump obtained from one of our users shows that 15% of memory is still
wasted on duplicate strings, despite the optimizations that I made recently.
This time the problematic strings come from different sources. See the attached
excerpt from the JXRay (www.jxray.com) analysis.
Adding String.intern() calls in the appropriate places reduces the overhead of
duplicate strings with this workload to ~6%. The remaining duplicates come
mostly from JDK internal and MapReduce data structures, and thus are more
difficult to fix.
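
To illustrate the kind of change described above, here is a minimal, self-contained sketch of how String.intern() collapses equal strings into one shared instance. It is not the actual patch: the class InternSketch, its methods, and the "col_" names are hypothetical and only stand in for whatever Hive code creates the repeated strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; none of these names come from the Hive code base.
public class InternSketch {

    // Builds n strings drawn from only 4 distinct values. The concatenation
    // happens at run time, so each call yields a fresh String object unless
    // the result is interned.
    static List<String> buildNames(int n, boolean intern) {
        List<String> names = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            String name = "col_" + (i % 4);
            names.add(intern ? name.intern() : name);
        }
        return names;
    }

    // Counts distinct String objects by reference identity, not by equals().
    static int distinctInstances(List<String> strings) {
        Set<String> seen =
            Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(strings);
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("without intern: "
            + distinctInstances(buildNames(1000, false)));
        System.out.println("with intern:    "
            + distinctInstances(buildNames(1000, true)));
    }
}
```

In this sketch the un-interned list holds one String object per element, while the interned list shares a single object per distinct value. Whether interning pays off in a real code path depends on how often the values repeat and how long they stay reachable.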