-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58777/#review173243
-----------------------------------------------------------

ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java
Lines 148 (patched)
<https://reviews.apache.org/r/58777/#comment246289>

The key and value for the non-optimized hash table loader are Object[]
arrays, which hold either serialized binary objects or deserialized
objects corresponding to column values. Adding memory estimation for all
types, OIs, writables, etc. would be very intrusive, so the assumption
here is that each entry in the hash table is 1KB in size. In most cases
we use the optimized hash table, which is pretty much flat and can
provide better in-memory estimates. The best way to find the deep object
size is to iterate over all declared fields and use the instrumentation
object size to find the actual size, but that needs a separate agent
combined with reflection :)


ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java
Lines 228 (patched)
<https://reviews.apache.org/r/58777/#comment246291>

Good catch, my bad. They should not both be multiplied by the inflation
factor. Only the no-conditional-task size has to be multiplied by the
inflation factor.

Regarding compressed tables: it actually depends. With ORC, for example,
even if the table is compressed, the raw data size returned by the ORC
reader represents the uncompressed data size. The metastore stores both
the file size (compressed) and the raw data size. Statistics annotation
will use the raw data size when available; otherwise
hive.stats.deserialization.factor can be set to account for inflation.


- Prasanth_J


On April 27, 2017, 8:43 a.m., Prasanth_J wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58777/
> -----------------------------------------------------------
>
> (Updated April 27, 2017, 8:43 a.m.)
>
>
> Review request for hive, Gunther Hagleitner, Sergey Shelukhin, and Siddharth
> Seth.
>
>
> Bugs: HIVE-16546
> https://issues.apache.org/jira/browse/HIVE-16546
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> HIVE-16546: LLAP: Fail map join tasks if hash table memory exceeds threshold
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/common/MemoryEstimate.java PRE-CREATION
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java d3ea824
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/BytesBytesMultiHashMap.java 04e24bd
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HashMapWrapper.java a3bccc6
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/HybridHashTableContainer.java 04e89e8
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinBytesTableContainer.java c86e5f5
> ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinTableContainer.java 6d71fef
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 7b13e90
> ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ObjectCache.java 72dcdd3
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastBytesHashMap.java 6242daf
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastBytesHashMultiSet.java 1a41961
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastBytesHashSet.java 331867c
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastBytesHashTable.java b93e977
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTable.java b6db3bc
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastHashTableLoader.java 49ecdd1
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastKeyStore.java be51693
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMap.java 6fe98f9
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashMultiSet.java 9140aee
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashSet.java d3efb11
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastLongHashTable.java 8bfa07c
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastMultiKeyHashMap.java add4788
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastMultiKeyHashMultiSet.java faefdbb
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastMultiKeyHashSet.java 5328910
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMap.java f13034f
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashMultiSet.java 53ad7b4
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastStringHashSet.java 723c729
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastTableContainer.java 05f1cf1
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/fast/VectorMapJoinFastValueStore.java f9c5b34
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/hashtable/VectorMapJoinHashTable.java c7e585c
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashSet.java 93a89d7
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedHashTable.java 5fe7861
> ql/src/java/org/apache/hadoop/hive/ql/exec/vector/mapjoin/optimized/VectorMapJoinOptimizedStringHashSet.java f921b9c
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/ConvertJoinMapJoin.java ad77e87
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java b2893e7
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/calcite/translator/HiveOpConverter.java d375d1b
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java 93b8a5d
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenSparkSkewJoinProcessor.java 405c3ca
> ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java d39b8bd
> ql/src/java/org/apache/hadoop/hive/ql/plan/JoinDesc.java 032c7bb
> ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java 940630c
> serde/src/java/org/apache/hadoop/hive/serde2/WriteBuffers.java a4ecd9f
>
>
> Diff: https://reviews.apache.org/r/58777/diff/2/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Prasanth_J
>
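
For reference, the two sizing points from the review comments can be
sketched roughly as below. This is only an illustration, not code from
the patch: the class name, method names, and sample values are all
hypothetical, and comparing the estimate directly against "no
conditional task size times inflation factor" is an assumption based on
the HIVE-16546 title (the actual check in the patch may include other
terms).

```java
// Hypothetical sketch; none of these names come from the Hive code base.
final class HashTableMemorySketch {

    // From the review comment: each entry of the non-optimized hash table
    // is assumed to be 1KB, because Object[] keys and values are not measured.
    static final long ASSUMED_ENTRY_SIZE_BYTES = 1024L;

    private HashTableMemorySketch() {}

    // Coarse estimate for the non-optimized hash table: entries * 1KB.
    // multiplyExact throws ArithmeticException instead of silently overflowing.
    static long estimateNonOptimizedBytes(long numEntries) {
        return Math.multiplyExact(numEntries, ASSUMED_ENTRY_SIZE_BYTES);
    }

    // Only the no-conditional-task size is scaled by the inflation factor.
    // Assumption: the threshold is exactly this product.
    static long memoryThresholdBytes(long noConditionalTaskSizeBytes,
                                     double inflationFactor) {
        return (long) (noConditionalTaskSizeBytes * inflationFactor);
    }

    // True when the estimated hash table size is above the threshold.
    static boolean exceedsThreshold(long numEntries,
                                    long noConditionalTaskSizeBytes,
                                    double inflationFactor) {
        return estimateNonOptimizedBytes(numEntries)
            > memoryThresholdBytes(noConditionalTaskSizeBytes, inflationFactor);
    }

    public static void main(String[] args) {
        long entries = 50_000L;                          // hypothetical row count
        long noConditionalTaskSize = 10L * 1024 * 1024;  // hypothetical: 10 MB
        double inflationFactor = 2.0;                    // hypothetical value

        System.out.println("estimate (bytes):  "
            + estimateNonOptimizedBytes(entries));
        System.out.println("threshold (bytes): "
            + memoryThresholdBytes(noConditionalTaskSize, inflationFactor));
        System.out.println("exceeds threshold: "
            + exceedsThreshold(entries, noConditionalTaskSize, inflationFactor));
    }
}
```

The flat 1KB figure trades accuracy for simplicity. The optimized hash
table can report a tighter number because its storage is essentially
flat, so it would not need this kind of per-entry guess.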