[ https://issues.apache.org/jira/browse/HIVE-6418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906340#comment-13906340 ]
Sergey Shelukhin commented on HIVE-6418:
----------------------------------------

Applying this patch to my test, I get the following savings. The HashMap has 1M entries; the key is a double, and the row is a double and a string (strings are short); there is one row per container (savings with multiple rows per key will be lower). The lazy part is disabled.

Before (sorted by shallow size):
|Class|Objects|Shallow Size|Retained Size|
|java.lang.Object[]|3000000|80000000|232384000|
|org.apache.hadoop.hive.serde2.io.DoubleWritable|3000000|72000000|72000000|
|byte[]|1000000|32384000|32384000|
|java.util.HashMap$Entry|1000000|32000000|328384000|
|java.util.ArrayList|1000000|24000000|208384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer|1000000|24000000|232384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinEagerRowContainer$NoCopyingArrayList|1000000|24000000|160384000|
|org.apache.hadoop.io.Text|1000000|24000000|56384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey|1000000|16000000|64000000|
|java.util.HashMap$Entry[]|1|8388624|336772624|
|java.util.HashMap|1|48|336772672|

After:
|Class|Objects|Shallow Size|Retained Size|
|org.apache.hadoop.hive.serde2.io.DoubleWritable|3000000|72000000|72000000|
|java.lang.Object[]|2000000|56000000|184384000|
|byte[]|1000000|32384000|32384000|
|java.util.HashMap$Entry|1000000|32000000|264384000|
|org.apache.hadoop.hive.ql.exec.persistence.LazyFlatRowContainer|1000000|32000000|168384000|
|org.apache.hadoop.io.Text|1000000|24000000|56384000|
|org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey|1000000|16000000|64000000|
|java.util.HashMap$Entry[]|1|8388624|272772624|
|java.util.HashMap|1|48|272772672|

Savings of ~19%.

> MapJoinRowContainer has large memory overhead in typical cases
> --------------------------------------------------------------
>
>                 Key: HIVE-6418
>                 URL: https://issues.apache.org/jira/browse/HIVE-6418
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-6418.01.patch, HIVE-6418.02.patch, HIVE-6418.03.patch, HIVE-6418.WIP.patch, HIVE-6418.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
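
The roughly 19% figure in the comment can be cross-checked from the two tables. The sketch below is only an illustration and is not part of the patch: the class name `MapJoinSavingsCheck` and its methods are hypothetical, the constants are copied from the tables, and it assumes the profiler reports sizes in bytes. It compares the retained size of `java.util.HashMap` before and after, then confirms that the difference matches the shallow sizes of the classes that disappear (`ArrayList`, `MapJoinEagerRowContainer`, `NoCopyingArrayList`, and one `Object[]` per row) minus the new `LazyFlatRowContainer`.

```java
import java.util.Locale;

// Hypothetical helper for checking the numbers quoted in the comment.
final class MapJoinSavingsCheck {
    // Retained size of java.util.HashMap, copied from the "Before" and "After" tables.
    static final long BEFORE_BYTES = 336_772_672L;
    static final long AFTER_BYTES = 272_772_672L;
    // Number of map entries in the test described in the comment.
    static final long ENTRIES = 1_000_000L;

    // Total retained memory saved by the patch in this test.
    static long savedBytes() {
        return BEFORE_BYTES - AFTER_BYTES;
    }

    // Savings as a percentage of the original retained size.
    static double savedPercent() {
        return 100.0 * savedBytes() / BEFORE_BYTES;
    }

    // Average savings per map entry.
    static long savedBytesPerEntry() {
        return savedBytes() / ENTRIES;
    }

    // The same savings derived from the shallow-size columns:
    // objects present only in "Before" minus objects present only in "After".
    static long shallowDelta() {
        long removed = 24_000_000L                   // java.util.ArrayList
                + 24_000_000L                        // MapJoinEagerRowContainer
                + 24_000_000L                        // MapJoinEagerRowContainer$NoCopyingArrayList
                + (80_000_000L - 56_000_000L);       // java.lang.Object[]: 3M objects down to 2M
        long added = 32_000_000L;                    // LazyFlatRowContainer
        return removed - added;
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
                "saved %d bytes (%.1f%%), %d bytes per entry, shallow-size delta %d bytes",
                savedBytes(), savedPercent(), savedBytesPerEntry(), shallowDelta()));
    }
}
```

Both routes give 64,000,000 bytes, or 64 bytes per entry, which is consistent with the savings reported for the one-row-per-key case.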