Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12940 )

Change subject: IMPALA-7322: Add storage wait time to profile
......................................................................


Patch Set 3:

(9 comments)

a few comments / questions; still working on understanding the internals of 
this code, so will add some more comments later;

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12940/6//COMMIT_MSG@26
PS6, Line 26:       load-requests=1 catalog-updates=3
It's not clear to me what part of this profile existing before the patch, and 
what was added by this patch. Can you specifically call out the parts that 
changed? I think it makes the commit message more informative.


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
File fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@232
PS6, Line 232:       for (FeTable ldTbl: loadedTbls_.values()) {
nit: I think storageLoadTimeNano is clearer


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@234
PS6, Line 234:             
requestedTbls.contains(((Table)ldTbl).getTableName())) {
nit: before the the query *was* called


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@235
PS6, Line 235:           strgTmNano += ((Table)ldTbl).getStorageLoadTime();
nit: ldTbl --> loadedTbl


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java@259
PS6, Line 259:     Set<TableName> viewTbls = new HashSet<>();
nit: is this necessary?


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java
File fe/src/main/java/org/apache/impala/catalog/HBaseTable.java:

http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@105
PS6, Line 105:       final Timer.Context ctxStg =
nit: I think storageLoadTimer is a better variable name; in general I wouldn't 
abbreviate words inside variables names unless really necessary (e.g. if it 
would make the variable name way to long)


http://gerrit.cloudera.org:8080/#/c/12940/6/fe/src/main/java/org/apache/impala/catalog/HBaseTable.java@120
PS6, Line 120:       // Set table stats.
nit: is this necessary


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@567
PS6, Line 567:
the setup of this test seems a little off. right now this gets skipped for S3, 
Local, Isilon, ABFS, and ADLS tests; but I see no reason why this feature 
shouldn't work for all the files stores as well right? especially for S3 where 
storage load time could be long because S3 metadata operations can take a long 
time.

I would suggest re-organizing this test as follows:

* There is a lot of duplicate code here, I would suggest creating an internal 
helper method that runs the invalidate query, runs the select count(*) against 
a configurable database, checks for the storage-load-time, re-runs the query, 
and checks for storage-load-time
* you could either use the Python annotations to setup the correct test matrix, 
or use the IS_* flags from tests.util.filesystem_utils

I think the correct test matrix would be:

* When running regular tests (e.g. against HDFS), run the queries against HDFS, 
HBase, and Kudu
* When running S3 tests, only run against S3 (S3 tests load all HDFS tables 
onto S3, so you can use functional.alltypes; unlike HBase and Kudu, there is no 
S3 specific table)
* The setup for S3 should apply to Local, Isilon, ABFS, ADLS as well


http://gerrit.cloudera.org:8080/#/c/12940/6/tests/query_test/test_observability.py@575
PS6, Line 575:     storage load time should be in the profile."""
is this a HDFS specific feature? do we cache metadata for Kudu and HBase tables 
too?



--
To view, visit http://gerrit.cloudera.org:8080/12940
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6dde7e394b7c1c396d835ef6aa0a55930c0a8660
Gerrit-Change-Number: 12940
Gerrit-PatchSet: 3
Gerrit-Owner: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Sahil Takiar <stak...@cloudera.com>
Gerrit-Reviewer: Vihang Karajgaonkar <vih...@cloudera.com>
Gerrit-Reviewer: Yongzhi Chen <yc...@cloudera.com>
Gerrit-Comment-Date: Thu, 23 May 2019 20:18:53 +0000
Gerrit-HasComments: Yes

Reply via email to