[ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:
------------------------------------
    Attachment: after.png
                before.png
        Status: Patch Available  (was: Open)

The default-capacity ArrayList was clearly a waste of memory. I ran some 
measurements with 20,000 partitions to see how much we gain.

The attached images show the memory reserved by the ArrayLists and the 
pathToAliases HashMap.

According to VisualVM, an ArrayList created with the default capacity reserved 
136 bytes, while one created with an initial capacity of 1 reserved only 64 
bytes. This simple change reduced the per-list memory cost to less than 50% of 
the original (64 / 136 is about 47%).
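
For illustration, a minimal sketch of what such a change could look like. It is 
not the attached patch: the class name PathToAliasesSketch and the aliasesFor 
helper are hypothetical, String keys stand in for Hive's Path type, and the 
StringInternUtils call from the original method is left out so the example is 
self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the plan class that owns pathToAliases.
class PathToAliasesSketch {
    // Simplified map: String keys replace Hive's Path type (assumption).
    private final Map<String, ArrayList<String>> pathToAliases = new HashMap<>();

    public void addPathToAlias(String path, String newAlias) {
        ArrayList<String> aliases = pathToAliases.get(path);
        if (aliases == null) {
            // Initial capacity of 1 instead of the default: the backing array
            // holds one slot until a second alias forces the list to grow.
            aliases = new ArrayList<>(1);
            pathToAliases.put(path, aliases);
        }
        aliases.add(newAlias.intern());
    }

    // Hypothetical accessor, present only so the sketch can be exercised.
    public ArrayList<String> aliasesFor(String path) {
        return pathToAliases.get(path);
    }
}
```

The list still grows on demand, so a path with several aliases keeps working; 
only the common single-alias case avoids the larger backing array.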

The overall memory footprint of pathToAliases (with 20,000 partitions) dropped 
from 4.2 MB to 2.7 MB.
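
As a rough cross-check of these numbers, the per-list figures can be scaled up 
to the partition count. The sketch below assumes one ArrayList per partition 
path, which is an assumption and not something the measurements state; the 
class and method names are hypothetical.

```java
// Hypothetical helper for a back-of-the-envelope savings estimate.
class MemorySavingsEstimate {
    // Bytes saved if every list shrinks from bytesBefore to bytesAfter.
    static long estimatedSavingsBytes(long lists, long bytesBefore, long bytesAfter) {
        return lists * (bytesBefore - bytesAfter);
    }
}
```

Calling estimatedSavingsBytes(20_000, 136, 64) gives 1,440,000 bytes, roughly 
1.4 MB, which is in line with the measured drop of about 1.5 MB.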

> Plan: fix wasted memory in plans with large partition counts
> ------------------------------------------------------------
>
>                 Key: HIVE-20509
>                 URL: https://issues.apache.org/jira/browse/HIVE-20509
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Gopal V
>            Assignee: Barnabas Maidics
>            Priority: Minor
>              Labels: newbie
>         Attachments: HIVE-20509.patch, after.png, before.png
>
>
> {code}
>   public void addPathToAlias(Path path, String newAlias){
>     ArrayList<String> aliases = pathToAliases.get(path);
>     if (aliases == null) {
>       aliases = new ArrayList<>();
>       StringInternUtils.internUriStringsInPath(path);
>       pathToAliases.put(path, aliases);
>     }
>     aliases.add(newAlias.intern());
>   }
> {code}
> ArrayList::DEFAULT_CAPACITY is 10, so this wastes 500 bytes of memory due to 
> the {{new ArrayList<>();}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
