-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56687/
-----------------------------------------------------------

(Updated Feb. 24, 2017, 9:27 p.m.)


Review request for hive, Chaoyu Tang, Mohit Sabharwal, and Sergio Pena.


Changes
-------

Addressed Rui's and Mohit's comments.


Bugs: https://issues.apache.org/jira/browse/HIVE-15882
    
https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-15882


Repository: hive-git


Description
-------

See the description of the problem in 
https://issues.apache.org/jira/browse/HIVE-15882 Interning strings per this 
review removes most of the overhead due to duplicate strings.

Also, where maps in several places are created from other maps, use the 
original map's size for the new map. This is to avoid the situation when a map 
with default capacity (typically 16) is created to hold just 2-3 entries, and 
the rest of the internal 16-entry array is wasted.


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
e81cbce3e333d44a4088c10491f399e92a505293 
  ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java 
08420664d59f28f75872c25c9f8ee42577b23451 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
e91064b9c75e8adb2b36f21ff19ec0c1539b03b9 
  ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
51530ac16c92cc75d501bfcb573557754ba0c964 
  ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java 
55b3b551a1dac92583b6e03b10beb8172ca93d45 
  ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java 
82dc89803be9cf9e0018720eeceb90ff450bfdc8 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 
c0edde9e92314d86482b5c46178987e79fae57fe 
  ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
c6ae6f290857cfd10f1023058ede99bf4a10f057 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
24d16812515bdfa90b4be7a295c0388fcdfe95ef 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
 ede4fcbe342052ad86dadebcc49da2c0f515ea98 
  
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java
 0882ae2c6205b1636cbc92e76ef66bb70faadc76 
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
68b0ad9ea63f051f16fec3652d8525f7ab07eb3f 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 
d4bdd96eaf8d179bed43b8a8c3be0d338940154a 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MsckDesc.java 
b7a7e4b7a5f8941b080c7805d224d3885885f444 
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 
73981e826870139a42ad881103fdb0a2ef8433a2 

Diff: https://reviews.apache.org/r/56687/diff/


Testing
-------

I've measured how much memory this change plus another one (interning 
Properties in PartitionDesc) save in my HS2 benchmark - the result is 37%. See 
the details in HIVE-15882.


Thanks,

Misha Dmitriev

Reply via email to