[jira] [Commented] (HIVE-20153) Count and Sum UDF consume more memory in Hive 2+

Sahil Takiar (JIRA) Thu, 12 Jul 2018 09:45:41 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-20153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16541927#comment-16541927
 ]


Sahil Takiar commented on HIVE-20153:
-------------------------------------

CC: [~aihuaxu]

> Count and Sum UDF consume more memory in Hive 2+
> ------------------------------------------------
>
>                 Key: HIVE-20153
>                 URL: https://issues.apache.org/jira/browse/HIVE-20153
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>    Affects Versions: 2.3.2
>            Reporter: Szehon Ho
>            Priority: Major
>         Attachments: Screen Shot 2018-07-12 at 6.41.28 PM.png
>
>
> While testing Hive 2, we noticed that queries with many count() and 
> sum() aggregations run out of memory on the Hadoop side much faster than in 
> Hive 1.  In many queries, we have to double the memory.
>  
> Taking a heap dump, we see that one of the main culprits is the field 
> 'uniqueObjects' in GenericUDAFSum and GenericUDAFCount, which was added to 
> support window functions.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
