Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11158 )

Change subject: IMPALA-7540. Intern most repetitive strings and network 
addresses in catalog
......................................................................


Patch Set 2:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java
File fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@62
PS2, Line 62:   public static void internFieldsInPlace(
nit: for each method, should we list the set of fields that we selectively 
intern? (here and below internFieldsInPlace()?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@88
PS2, Line 88:     internFieldsInPlace(table.getClustering_columns());
null check for table


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@113
PS2, Line 113:     if (sd.isSetCols()) {
null check for sd.


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@131
PS2, Line 131:     if (fs.isSetName()) {
null check


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@155
PS2, Line 155: "-1".equals(val) ||
             :           "0".equals(val) ||
             :           val.isEmpty() ||
             :           "true".equalsIgnoreCase(val) ||
             :           "false".equalsIgnoreCase(val) ||
             :           "TASK".equals(val) ||
             :           val.startsWith("impala_"))
Can we create a static HashSet out of these (lowercased) and use something like

if (hashSet.contains(val.lower())))

Easier to extend I think


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@172
PS2, Line 172: e.getKey().intern()
Maybe move everything to guava interner to be consistent? Native interning is 
slow anyway (as per the blog post)


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@174
PS2, Line 174:     return ret;
Preconditions.checkState(ret.size() == parameters.size())?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@188
PS2, Line 188: internColumnFamily
can we rename it to something like internString() that we can reuse for all 
Strings?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/CatalogInterners.java@198
PS2, Line 198:     // but it's likely someone would trip over one of these.
This looks risky if someone updates the thrift def for TNetworkAddress :-)


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java
File fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HBaseColumn.java@35
PS2, Line 35: columnQualifier
just curious, any reason why not this?


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
File fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java@853
PS2, Line 853: hmsParameters_ = CatalogInterners.internParameters(
             :           msPartition.getParameters());
Shouldn't we do it after we remove the incremental stats ? (or) blacklist 
incremental stats from getting interned.


http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/11158/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1203
PS2, Line 1203: intern
just curious, why not intern everything via Guava?



--
To view, visit http://gerrit.cloudera.org:8080/11158
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib3121aefa4391bcb1477d9dba0a49440d7000d26
Gerrit-Change-Number: 11158
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Sep 2018 03:11:29 +0000
Gerrit-HasComments: Yes

Reply via email to