[jira] [Comment Edited] (HIVE-16166) HS2 may still waste up to 15% of memory on duplicate strings

Misha Dmitriev (JIRA) Thu, 16 Mar 2017 11:55:13 -0700

    [ https://issues.apache.org/jira/browse/HIVE-16166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15928635#comment-15928635 ]


Misha Dmitriev edited comment on HIVE-16166 at 3/16/17 6:53 PM:
----------------------------------------------------------------

[~spena] I ran all these tests locally, and they passed for me. So this looks 
like flakiness in the Hive build, which is not uncommon.


was (Author: mi...@cloudera.com):
@spena I ran all these tests locally, and they passed for me. So this looks 
like flakiness in the Hive build, which is not uncommon.

> HS2 may still waste up to 15% of memory on duplicate strings
> ------------------------------------------------------------
>
>                 Key: HIVE-16166
>                 URL: https://issues.apache.org/jira/browse/HIVE-16166
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: ch_2_excerpt.txt, HIVE-16166.01.patch
>
>
> A heap dump obtained from one of our users shows that 15% of memory is still 
> wasted on duplicate strings, despite the optimizations I made recently. This 
> time the problematic strings come from different sources. See the attached 
> excerpt from the JXRay (www.jxray.com) analysis.
> Adding String.intern() calls in the appropriate places reduces the overhead 
> of duplicate strings for this workload to ~6%. The remaining duplicates come 
> mostly from JDK-internal and MapReduce data structures, and are therefore 
> harder to fix.
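
For readers unfamiliar with the technique, the following is a minimal sketch of how String.intern() removes duplicate string objects. It is not taken from HIVE-16166.01.patch: the class InternSketch, the holder class ColumnInfo, and its typeName field are hypothetical names chosen for illustration. The only assumption is that the duplicated values are equal strings that arrive as separate objects, for example from parsing or deserialization.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class InternSketch {

    // Hypothetical holder for a frequently repeated string value.
    public static final class ColumnInfo {
        final String typeName;

        ColumnInfo(String typeName, boolean intern) {
            // intern() returns the canonical instance from the JVM string pool,
            // so equal strings end up sharing a single object.
            this.typeName = (intern && typeName != null) ? typeName.intern() : typeName;
        }
    }

    // Builds n holders whose string values are equal but separately allocated.
    public static List<ColumnInfo> build(int n, boolean intern) {
        List<ColumnInfo> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            // new String(...) forces a fresh object, mimicking strings that
            // come from parsing or deserialization rather than literals.
            out.add(new ColumnInfo(new String("string"), intern));
        }
        return out;
    }

    // Counts distinct String objects by identity (==), not by equals().
    public static int distinctInstances(List<ColumnInfo> infos) {
        Map<String, Boolean> seen = new IdentityHashMap<>();
        for (ColumnInfo c : infos) {
            seen.put(c.typeName, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(build(1000, false))); // prints 1000
        System.out.println(distinctInstances(build(1000, true)));  // prints 1
    }
}
```

Interning at the point where the string is stored in a long-lived object means the duplicate copies become garbage as soon as they are created. The trade-off is the cost of a string pool lookup on every call, so it pays off mainly for values that repeat often.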



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
