Re: Review Request 56687: Intern strings in various critical places to reduce memory consumption.

Misha Dmitriev Mon, 27 Feb 2017 11:31:16 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56687/#review166916
-----------------------------------------------------------





ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
(line 322)
<https://reviews.apache.org/r/56687/#comment239005>

    Oh, now I see what you mean. Yes, you are right, the old code was actually 
better. I suspect that my change as you saw it was a byproduct of some other 
bigger change that I subsequently partially reverted. Anyway, I've just changed 
this code to what it was originally. Also, indeed, found one or two other 
similar changes and reverted them as well.


- Misha Dmitriev


On Feb. 24, 2017, 9:27 p.m., Misha Dmitriev wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56687/
> -----------------------------------------------------------
> 
> (Updated Feb. 24, 2017, 9:27 p.m.)
> 
> 
> Review request for hive, Chaoyu Tang, Mohit Sabharwal, and Sergio Pena.
> 
> 
> Bugs: https://issues.apache.org/jira/browse/HIVE-15882
>     
> https://issues.apache.org/jira/browse/https://issues.apache.org/jira/browse/HIVE-15882
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See the description of the problem in 
> https://issues.apache.org/jira/browse/HIVE-15882 Interning strings per this 
> review removes most of the overhead due to duplicate strings.
> 
> Also, where maps in several places are created from other maps, use the 
> original map's size for the new map. This is to avoid the situation when a 
> map with default capacity (typically 16) is created to hold just 2-3 entries, 
> and the rest of the internal 16-entry array is wasted.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/common/StringInternUtils.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 
> e81cbce3e333d44a4088c10491f399e92a505293 
>   ql/src/java/org/apache/hadoop/hive/ql/hooks/Entity.java 
> 08420664d59f28f75872c25c9f8ee42577b23451 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 
> e91064b9c75e8adb2b36f21ff19ec0c1539b03b9 
>   ql/src/java/org/apache/hadoop/hive/ql/io/HiveInputFormat.java 
> 51530ac16c92cc75d501bfcb573557754ba0c964 
>   ql/src/java/org/apache/hadoop/hive/ql/io/SymbolicInputFormat.java 
> 55b3b551a1dac92583b6e03b10beb8172ca93d45 
>   ql/src/java/org/apache/hadoop/hive/ql/lockmgr/HiveLockObject.java 
> 82dc89803be9cf9e0018720eeceb90ff450bfdc8 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 
> c0edde9e92314d86482b5c46178987e79fae57fe 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 
> c6ae6f290857cfd10f1023058ede99bf4a10f057 
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 
> 24d16812515bdfa90b4be7a295c0388fcdfe95ef 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/GenMRSkewJoinProcessor.java
>  ede4fcbe342052ad86dadebcc49da2c0f515ea98 
>   
> ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/NullScanTaskDispatcher.java
>  0882ae2c6205b1636cbc92e76ef66bb70faadc76 
>   
> ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 
> 68b0ad9ea63f051f16fec3652d8525f7ab07eb3f 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java 
> d4bdd96eaf8d179bed43b8a8c3be0d338940154a 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/MsckDesc.java 
> b7a7e4b7a5f8941b080c7805d224d3885885f444 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 
> 73981e826870139a42ad881103fdb0a2ef8433a2 
> 
> Diff: https://reviews.apache.org/r/56687/diff/
> 
> 
> Testing
> -------
> 
> I've measured how much memory this change plus another one (interning 
> Properties in PartitionDesc) save in my HS2 benchmark - the result is 37%. 
> See the details in HIVE-15882.
> 
> 
> Thanks,
> 
> Misha Dmitriev
> 
>

Re: Review Request 56687: Intern strings in various critical places to reduce memory consumption.

Reply via email to