[ 
https://issues.apache.org/jira/browse/HIVE-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-1758:
------------------------------

    Attachment: HIVE-1758.1.patch

1. Use an array instead of ArrayList<Object>.
2. Abstract KeyWrapper slightly so that we can write specialized code for 
different kinds of keys.
3. Use ListKeyWrapper and TestKeyWrapper.

The motivation for this task is to reduce the memory footprint. There is a minor 
impact on CPU time.
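
For readers without the patch at hand, here is a rough sketch of the idea, not the code in HIVE-1758.1.patch. The names ArrayKeyWrapper, SingleKeyWrapper and countGroups are hypothetical and were picked only for illustration. The sketch assumes the abstract base lets each key shape choose its own storage: a plain Object[] in the general case, and a bare field when there is exactly one key.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class KeyWrapperSketch {

    /** Hypothetical base type; subclasses choose the most compact layout for their keys. */
    public abstract static class KeyWrapper {
        /** Returns a copy that is safe to store in the hash map. */
        abstract KeyWrapper copyKey();
    }

    /** General case: keys live in a plain Object[] instead of an ArrayList<Object>. */
    public static final class ArrayKeyWrapper extends KeyWrapper {
        private final Object[] keys;
        private final int hashcode; // cached, like the hashcode field in the description

        public ArrayKeyWrapper(Object... keys) {
            this.keys = keys;
            this.hashcode = Arrays.hashCode(keys);
        }

        @Override public int hashCode() { return hashcode; }

        @Override public boolean equals(Object o) {
            return o instanceof ArrayKeyWrapper
                && Arrays.equals(keys, ((ArrayKeyWrapper) o).keys);
        }

        // Shallow clone for brevity; real code would deep-copy each key object.
        @Override KeyWrapper copyKey() { return new ArrayKeyWrapper(keys.clone()); }
    }

    /** Special case (assumed): a single key stored directly, with no container. */
    public static final class SingleKeyWrapper extends KeyWrapper {
        private final Object key;

        public SingleKeyWrapper(Object key) { this.key = key; }

        @Override public int hashCode() { return Objects.hashCode(key); }

        @Override public boolean equals(Object o) {
            return o instanceof SingleKeyWrapper
                && Objects.equals(key, ((SingleKeyWrapper) o).key);
        }

        @Override KeyWrapper copyKey() { return new SingleKeyWrapper(key); }
    }

    /** Counts rows per group, copying a key only the first time it is seen. */
    public static Map<KeyWrapper, Integer> countGroups(KeyWrapper[] rows) {
        Map<KeyWrapper, Integer> counts = new HashMap<>();
        for (KeyWrapper row : rows) {
            Integer seen = counts.get(row);
            if (seen == null) {
                counts.put(row.copyKey(), 1);
            } else {
                // HashMap keeps the key object it already holds; only the value changes.
                counts.put(row, seen + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        KeyWrapper[] rows = {
            new ArrayKeyWrapper("us", 2010),
            new ArrayKeyWrapper("us", 2010),
            new ArrayKeyWrapper("uk", 2010),
        };
        System.out.println(countGroups(rows).size()); // prints 2
    }
}
```

The saving in this sketch comes from dropping the ArrayList object that would otherwise sit between each wrapper and its backing array. The single-key variant drops the array as well.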

> optimize group by hash map memory
> ---------------------------------
>
>                 Key: HIVE-1758
>                 URL: https://issues.apache.org/jira/browse/HIVE-1758
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Siying Dong
>         Attachments: HIVE-1758.1.patch
>
>
> The hash map used by map-side Group By consumes a lot of memory, which 
> reduces its effectiveness.
> We can use some of the optimizations from map-join to reduce the memory 
> footprint:
>   class KeyWrapper {
>     int hashcode;
>     ArrayList<Object> keys;
>     // Decide whether this is already in the hashmap (keys in the hashmap are
>     // deep-copied versions, and we need to use 'currentKeyObjectInspector').
>     boolean copy = false;
>   }
> 1. Change keys to an array.
> 2. Optimize the case where the number of keys is small (1, 2, etc.).
> Let us start by profiling it and take it from there.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
