Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282
PS3, Line 282: CacheStats
I see these are used only for tests at the moment. Is there anything worthwhile 
to pull from here? The profile is local to a query, whereas these cover the 
lifetime of the cache, but perhaps there is something useful here for 
understanding the larger context. For example, the cache has a generally high 
hit rate but your query got unlucky.
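
To make the per-query versus lifetime comparison concrete, here is a minimal 
sketch in plain Java. HitMissCounter and CacheContextReport are hypothetical 
names, not classes from this patch, and the line format is only an 
illustration of what could sit next to the per-query numbers in the profile.

```java
import java.util.Locale;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical hit/miss counter pair; not part of the patch under review. */
final class HitMissCounter {
  private final LongAdder hits = new LongAdder();
  private final LongAdder misses = new LongAdder();

  /** Records one cache lookup. */
  void record(boolean hit) {
    (hit ? hits : misses).increment();
  }

  long requests() {
    return hits.sum() + misses.sum();
  }

  /** Fraction of lookups that were hits; 1.0 when nothing was looked up. */
  double hitRate() {
    long total = requests();
    return total == 0 ? 1.0 : (double) hits.sum() / total;
  }
}

/** Hypothetical helper that puts the query's numbers next to the cache's. */
final class CacheContextReport {
  static String describe(HitMissCounter query, HitMissCounter lifetime) {
    return String.format(Locale.ROOT,
        "query hit rate %.2f (%d requests), lifetime hit rate %.2f (%d requests)",
        query.hitRate(), query.requests(),
        lifetime.hitRate(), lifetime.requests());
  }
}
```

A low query hit rate next to a high lifetime hit rate would suggest the query 
got unlucky rather than that the cache is mistuned.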


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369
PS3, Line 369: addStatsToProfile
These stats are per query. Would it be useful to aggregate them over the 
lifetime of the cache? I'm not sure what CacheStats tracks that differs from 
the metrics tracked here.

Ultimately, I'm wondering what helps to tune the two main flags here: size and 
expiration time. One answer is 'adjust them until you reach a high hit rate', 
but I'm not sure right now what would tell me whether a lower hit rate is due 
to memory or expiration.
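
One way to separate the two causes, sketched below in plain Java, is to count 
evictions by reason. This assumes the cache can report why an entry was 
removed (Guava's RemovalCause, for instance, distinguishes SIZE from EXPIRED). 
EvictionCause and EvictionBreakdown are hypothetical names, and the hint 
strings are placeholders rather than a proposed heuristic.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical eviction reasons, mirroring the two tunable flags. */
enum EvictionCause { SIZE, EXPIRED }

/** Hypothetical tally of evictions by cause; not part of the patch. */
final class EvictionBreakdown {
  private final Map<EvictionCause, Long> counts =
      new EnumMap<>(EvictionCause.class);

  /** Would be called from whatever hook the cache offers on removal. */
  synchronized void record(EvictionCause cause) {
    counts.merge(cause, 1L, Long::sum);
  }

  synchronized long count(EvictionCause cause) {
    return counts.getOrDefault(cause, 0L);
  }

  /** Names the flag that the recorded evictions point at. */
  synchronized String hint() {
    long size = count(EvictionCause.SIZE);
    long expired = count(EvictionCause.EXPIRED);
    if (size == 0 && expired == 0) {
      return "no evictions recorded";
    }
    return size >= expired
        ? "mostly size evictions: consider a larger cache"
        : "mostly expirations: consider a longer expiration time";
  }
}
```

Exposing the two counts alongside the hit rate would at least show which flag 
is the more likely culprit when the hit rate drops.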

The other perspective here is that some of the cache entries are not 
fine-grained (e.g., table names). Is there enough info here to tell when it's 
time to tune one of those?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
High-level comment: this is a clear way to instrument FE code. The immediate 
use case from my end is the RPCs made in incremental-stats when pulling from 
catalogd.

Bharath: you mentioned that more instrumentation was needed for the 
compute-incremental-stats change and that it should go into the profile. How 
were you thinking it would get there from one of the expression nodes? Just 
raising it in case there is some existing alternative that I've overlooked.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Sep 2018 17:25:49 +0000
Gerrit-HasComments: Yes
