Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10982 )

Change subject: IMPALA-7219. 7.5% of Catalog Server heap wasted by empty 
HashMaps and ArrayLists
......................................................................


Patch Set 2:

(5 comments)

See suggestion about changing IncompleteTable to not inherit from Table -- 
maybe now with the 'FeTable' interface we could avoid that?

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java
File fe/src/main/java/org/apache/impala/catalog/StructType.java:

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@37
PS2, Line 37: public class StructType extends Type {
agreed with Vuk -- not clear to me why we have so many empty structs. I thought 
StructType is only used for describing the schema of a table, and a table with 
no fields seems uncommon. Do you have handy the "paths to GC root" for these 
empty fields?


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@39
PS2, Line 39:   // Made non-final so that it can be initialized lazily to save 
memory when this map is not used.
nit: this comment makes sense when looking at this as a patch, but when someone 
sees the code later, the "Made non-final" thing isn't super clear.

Perhaps better to just describe the current treatment of the member, like: "Map 
of field name to field. Null is used to indicate empty to save memory."


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/StructType.java@84
PS2, Line 84:   public StructField getField(String fieldName) {
I wonder if we could do even better here and only lazily construct a field map 
at places where method is called. it seems this is the only usage site of the 
field map, and best I can tell, this method is only called during query 
analysis, and never by the catalogd itself. Given that, it doesn't really make 
sense to keep around these members, when it would be pretty trivial to 
reconstruct the map as necessary during path resolution. Then we could fully 
remove all these maps (even the non-empty ones). Would that save significant 
memory?

Another thought here: it seems like the most common case where we have an empty 
struct woudl be a reference from IncompleteTable. I'm guessing you were looking 
at a catalogd dump from a cluster with a lot of tables? Maybe we could change 
IncompleteTable to not inherit from 'Table' so that it can avoid all of these 
members entirely?


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java
File fe/src/main/java/org/apache/impala/catalog/Table.java:

http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java@320
PS2, Line 320: {
             :       return;
             :     }
> nit: single line
same as below


http://gerrit.cloudera.org:8080/#/c/10982/2/fe/src/main/java/org/apache/impala/catalog/Table.java@447
PS2, Line 447:   public Column getColumn(String name) {
I think it would actually be an error to call getColumn before the colsByName_ 
map was initted. Such a call would indicate that columns haven't been loaded.



--
To view, visit http://gerrit.cloudera.org:8080/10982
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9c75f65ecb3ba3f2c739fa483a84dc052f471c6
Gerrit-Change-Number: 10982
Gerrit-PatchSet: 2
Gerrit-Owner: Misha Dmitriev <count...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 20 Jul 2018 18:26:02 +0000
Gerrit-HasComments: Yes

Reply via email to