[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-10-16 Thread Peter Vary (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-20509:
--
       Resolution: Fixed
    Fix Version/s: 4.0.0
           Status: Resolved  (was: Patch Available)

Pushed to master.
Thanks for the patch [~b.maidics] and for the review [~gopalv]!

> Plan: fix wasted memory in plans with large partition counts
> 
>
> Key: HIVE-20509
> URL: https://issues.apache.org/jira/browse/HIVE-20509
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Gopal V
> Assignee: Barnabas Maidics
> Priority: Minor
> Labels: newbie
> Fix For: 4.0.0
>
> Attachments: HIVE-20509.2.patch, HIVE-20509.patch, after.png, before.png
>
>
> {code}
>   public void addPathToAlias(Path path, String newAlias){
>     ArrayList<String> aliases = pathToAliases.get(path);
>     if (aliases == null) {
>       aliases = new ArrayList<>();
>       StringInternUtils.internUriStringsInPath(path);
>       pathToAliases.put(path, aliases);
>     }
>     aliases.add(newAlias.intern());
>   }
> {code}
> ArrayList::DEFAULT_CAPACITY is 10, so this wastes 500 bytes of memory due to
> the {{new ArrayList<>();}}.
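
One way to avoid the over-allocation described above is to pass an explicit initial capacity to the ArrayList constructor. The following is a minimal, self-contained sketch of that idea, not the committed patch: the class name PathAliasIndex and the accessor aliasesFor are hypothetical, a plain String stands in for Hadoop's Path, and the StringInternUtils call is omitted so the example compiles on its own. The initial capacity of 1 assumes that most paths map to a single alias.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the class that owns pathToAliases.
class PathAliasIndex {
    // String keys are an assumption made to keep the sketch self-contained;
    // the code quoted in the issue uses Path keys.
    private final Map<String, ArrayList<String>> pathToAliases = new LinkedHashMap<>();

    public void addPathToAlias(String path, String newAlias) {
        ArrayList<String> aliases = pathToAliases.get(path);
        if (aliases == null) {
            // Initial capacity 1 instead of the default 10: the backing
            // array holds one reference until a second alias is added.
            aliases = new ArrayList<>(1);
            pathToAliases.put(path, aliases);
        }
        aliases.add(newAlias.intern());
    }

    // Hypothetical accessor, used only to inspect the result.
    public ArrayList<String> aliasesFor(String path) {
        return pathToAliases.get(path);
    }

    public static void main(String[] args) {
        PathAliasIndex index = new PathAliasIndex();
        index.addPathToAlias("/warehouse/t/p=1", "t");
        index.addPathToAlias("/warehouse/t/p=1", "t2"); // list grows on demand
        System.out.println(index.aliasesFor("/warehouse/t/p=1"));
    }
}
```

The list still grows automatically when a path has more than one alias, so the smaller initial capacity trades a rare reallocation for less unused space per entry.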



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-11 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Attachment: HIVE-20509.2.patch
Status: Patch Available  (was: Open)

Resubmitting the patch because of a possibly flaky test.






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-11 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Status: Open  (was: Patch Available)






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-10 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Attachment: (was: HIVE-20509.patch)






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-10 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Attachment: HIVE-20509.patch
Status: Patch Available  (was: Open)






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-10 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Status: Open  (was: Patch Available)






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-10 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Attachment: after.png
before.png
Status: Patch Available  (was: Open)

It was clearly a waste of memory. I did some measurements with 20,000
partitions to see how much we gain.

In the attached images you can see the memory reserved by the ArrayLists and
the pathToAliases HashMap.

According to VisualVM, an ArrayList with the default capacity reserved 136
bytes, while one with an initial capacity of 1 reserved only 64 bytes. This
simple change reduced the per-list memory cost to less than 50% of the
original.

The overall memory change of pathToAliases (with 20,000 partitions): 4.2 MB ->
2.7 MB.
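
The per-list figures above can be cross-checked with a little arithmetic. The sketch below is a back-of-the-envelope model, not a measurement: it assumes a 64-bit JVM with 8-byte references, a 16-byte object header, a 24-byte array header, and 8-byte alignment. The class name ArrayListFootprint and its constants are hypothetical. Under those assumptions the model reproduces the 136-byte and 64-byte figures and estimates the saving across 20,000 lists.

```java
// Hypothetical estimator; every layout constant below is an assumption.
class ArrayListFootprint {
    static final int OBJECT_HEADER = 16; // assumed object header size
    static final int ARRAY_HEADER = 24;  // assumed array header incl. length, padded
    static final int REFERENCE = 8;      // assumed uncompressed reference size
    static final int INT = 4;

    // Round up to the next multiple of 8 bytes.
    static long align8(long n) {
        return (n + 7) / 8 * 8;
    }

    // ArrayList object (modCount, size, elementData reference) plus its
    // backing Object[] of the given capacity.
    static long arrayListBytes(int capacity) {
        long listObject = align8(OBJECT_HEADER + INT + INT + REFERENCE);
        long backingArray = align8(ARRAY_HEADER + (long) capacity * REFERENCE);
        return listObject + backingArray;
    }

    public static void main(String[] args) {
        long before = arrayListBytes(10); // default capacity once an element is added
        long after = arrayListBytes(1);   // explicit initial capacity of 1
        long partitions = 20_000;
        long saved = (before - after) * partitions;
        System.out.println("per list: " + before + " -> " + after + " bytes");
        System.out.println("saved over " + partitions + " lists: " + saved + " bytes");
    }
}
```

Under these assumptions the saving is 72 bytes per list, or 1,440,000 bytes for 20,000 lists, which is in line with the reported drop from 4.2 MB to 2.7 MB.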






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-10 Thread Barnabas Maidics (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barnabas Maidics updated HIVE-20509:

Attachment: HIVE-20509.patch






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-05 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20509:
---
Labels: newbie  (was: )






[jira] [Updated] (HIVE-20509) Plan: fix wasted memory in plans with large partition counts

2018-09-05 Thread Gopal V (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-20509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-20509:
---
Priority: Minor  (was: Major)



