[
https://issues.apache.org/jira/browse/HIVE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906340#comment-13906340
]
Sergey Shelukhin commented on HIVE-6418:
----------------------------------------
From applying this patch on my test, I get the following savings. HashMap
with 1M entries; key is double, row is double and string (strings are short);
one row per container (savings with multiple rows per key will be lower).
Lazy part is disabled.
Before (sorted by shallow size):
|Class|Objects|Shallow Size (bytes)|Retained Size (bytes)|
|java.lang.Object[]|3000000|80000000|232384000|
|org.apache.hadoop.hive.serde2.io.DoubleWritable|3000000|72000000|72000000|
|byte[]|1000000|32384000|32384000|
|java.util.HashMap$Entry|1000000|32000000|328384000|
|java.util.ArrayList|1000000|24000000|208384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer|1000000|24000000|232384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer$NoCopyingArrayList|1000000|24000000|160384000|
|org.apache.hadoop.io.Text|1000000|24000000|56384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey|1000000|16000000|64000000|
|java.util.HashMap$Entry[]|1|8388624|336772624|
|java.util.HashMap|1|48|336772672|
After:
|Class|Objects|Shallow Size (bytes)|Retained Size (bytes)|
|org.apache.hadoop.hive.serde2.io.DoubleWritable|3000000|72000000|72000000|
|java.lang.Object[]|2000000|56000000|184384000|
|byte[]|1000000|32384000|32384000|
|java.util.HashMap$Entry|1000000|32000000|264384000|
|org.apache.hadoop.hive.ql.exec.persistence.LazyFlatRowContainer|1000000|32000000|168384000|
|org.apache.hadoop.io.Text|1000000|24000000|56384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey|1000000|16000000|64000000|
|java.util.HashMap$Entry[]|1|8388624|272772624|
|java.util.HashMap|1|48|272772672|
Savings of ~19% (retained size of the HashMap: 336772672 before, 272772672 after).
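The savings figure can be re-derived from the retained size of the java.util.HashMap row in each table. The sketch below is only a check of that arithmetic; the class and method names are made up for illustration and are not part of the patch, and the sizes are assumed to be in bytes.

```java
public class RetainedSizeSavings {
    // Retained size of the java.util.HashMap row in the "Before" and "After"
    // tables (assumed to be bytes), and the entry count stated in the comment.
    static final long BEFORE_BYTES = 336_772_672L;
    static final long AFTER_BYTES = 272_772_672L;
    static final long ENTRIES = 1_000_000L;

    // Absolute reduction in retained size for the whole map.
    static long savedBytes() {
        return BEFORE_BYTES - AFTER_BYTES;
    }

    // Reduction relative to the "Before" retained size, in percent.
    static double savedPercent() {
        return 100.0 * savedBytes() / BEFORE_BYTES;
    }

    // Average retained bytes per map entry for a given total.
    static double bytesPerEntry(long totalBytes) {
        return (double) totalBytes / ENTRIES;
    }

    public static void main(String[] args) {
        System.out.println("saved bytes:      " + savedBytes());
        System.out.println("saved percent:    " + savedPercent());
        System.out.println("per entry before: " + bytesPerEntry(BEFORE_BYTES));
        System.out.println("per entry after:  " + bytesPerEntry(AFTER_BYTES));
    }
}
```

The difference works out to 64 bytes per entry. It matches the shallow sizes in the tables: per entry, ArrayList (24) + MapJoinEagerRowContainer (24) + NoCopyingArrayList (24) + Object[] (80) = 152 before, versus LazyFlatRowContainer (32) + Object[] (56) = 88 after. The other classes are unchanged between the two tables.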
> MapJoinRowContainer has large memory overhead in typical cases
> --------------------------------------------------------------
>
> Key: HIVE-6418
> URL: https://issues.apache.org/jira/browse/HIVE-6418
> Project: Hive
> Issue Type: Improvement
> Reporter: Sergey Shelukhin
> Assignee: Sergey Shelukhin
> Attachments: HIVE-6418.01.patch, HIVE-6418.02.patch,
> HIVE-6418.03.patch, HIVE-6418.WIP.patch, HIVE-6418.patch
>
>