Misha Dmitriev has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
......................................................................


Patch Set 2:

Thank you for the quick response, Vuk. Answering your questions:

1. The IMPALA-7219 ticket contains more details. In the case that I looked
at, there was exactly the same number (a little less than 8 million) of each
kind of empty HashMap, i.e. ~8M coming from IncompleteTable.colsByName_ and
~8M coming from StructType.fieldMap_. Each empty HashMap takes 48 bytes, so
collectively they wasted ~700MB of memory.
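
To make the arithmetic explicit, here is a rough sketch. The class and
constant names are made up for illustration, and the 8M count is rounded up
from "a little less than 8 million", so the result is a slight overestimate.

```java
// Hypothetical helper, not part of Impala: back-of-the-envelope estimate only.
class EmptyMapWaste {
    // Rounded up from "a little less than 8 million" per field.
    static final long MAPS_PER_FIELD = 8_000_000L;
    // Two sources: IncompleteTable.colsByName_ and StructType.fieldMap_.
    static final long FIELDS = 2;
    // Size of one empty HashMap, as stated in the comment above.
    static final long BYTES_PER_EMPTY_MAP = 48;

    static long wastedBytes() {
        return MAPS_PER_FIELD * FIELDS * BYTES_PER_EMPTY_MAP;
    }

    public static void main(String[] args) {
        long bytes = wastedBytes();
        // Prints "768000000 bytes, about 732 MiB"
        System.out.println(bytes + " bytes, about " + bytes / (1024 * 1024) + " MiB");
    }
}
```

With slightly fewer than 8M maps per field, this lands in the ~700MB range
quoted above.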

2. Let me clarify. There is no such thing as an object (collection) "with no
fields". When you call 'new HashMap()', you always create a HashMap object,
which internally has fields such as 'int size', 'int threshold', etc. The good
news is that the most important field, 'HashMap$Entry table[]', is null until
the first element is added to the map. But the above data fields alone, plus
the 12-byte internal header that each object in the JVM heap has, mean that an
empty, completely unpopulated HashMap occupies 48 bytes. This is what I want
to get rid of. If all this still seems a little vague, I suggest you take a
look at the slides here:
https://www.slideshare.net/MikhailMishaDmitriev/java-memory-analysis-problems-and-solutions?trk=v-feed
 In particular, slide 10 shows the internals and lifecycle of a HashMap.
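
For illustration, one common way to avoid paying those 48 bytes is to leave
the field null and allocate the map only on the first insertion. The sketch
below is a generic example of that pattern, not necessarily what this patch
does; LazyFieldMap and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class for illustration; this is not Impala code.
class LazyFieldMap {
    // Stays null until the first put(), so an unused instance holds no HashMap.
    private Map<String, Integer> fieldMap;

    void put(String name, int position) {
        if (fieldMap == null) {
            fieldMap = new HashMap<>();
        }
        fieldMap.put(name, position);
    }

    // Readers must tolerate the null (never-populated) state.
    Integer get(String name) {
        return fieldMap == null ? null : fieldMap.get(name);
    }

    int size() {
        return fieldMap == null ? 0 : fieldMap.size();
    }

    boolean isAllocated() {
        return fieldMap != null;
    }
}
```

The cost of this approach is a null check on every access, which is usually
negligible compared to millions of empty maps.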

So your suggestion in (2) may or may not make sense, depending on what exactly
you had in mind, and it needs some clarification.

Regarding tests: yes, I ran it through the Impala Jenkins build, and all tests 
passed: https://jenkins.impala.io/job/pre-review-test/186/


--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev <count...@gmail.com>
Gerrit-Reviewer: Misha Dmitriev <count...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:41:03 +0000
Gerrit-HasComments: No
