Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 )
Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
......................................................................

Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282
PS3, Line 282: CacheStats
I see these are used only for tests at the moment. Is there anything worthwhile to pull from here? The profile is local to a query, whereas these cover the lifetime of the cache, but perhaps there is something useful here for understanding the larger context? For example, the cache has a generally high hit rate but your query got unlucky.

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369
PS3, Line 369: addStatsToProfile
These stats are per query. Would it be useful to aggregate them over the lifetime of the cache? I'm not sure what CacheStats tracks that differs from the metrics tracked here.

Ultimately, I'm wondering what helps to tune the two main flags here: size and expiration time. One answer is 'adjust them until you reach a high hit rate'. I'm not sure right now what would tell me whether a lower hit rate is due to memory or to expiration.

The other perspective is that some of the cache entries are not fine-grained (e.g., table names). Is there enough info here to tell when it's time to tune one of those?

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
High-level comment is that it's a clear way to instrument FE stuff. The immediate use case from my end is the RPCs made in incremental-stats when pulling from catalogd.

Bharath: you mentioned that more instrumentation was needed for the compute-incremental-stats change and that it should go into the profile. How were you thinking it would get there from one of the expression nodes? Just raising it in case there is some existing alternative that I've overlooked.

--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Vuk Ercegovac <vercego...@cloudera.com>
Gerrit-Comment-Date: Fri, 07 Sep 2018 17:25:49 +0000
Gerrit-HasComments: Yes
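
Illustration for the Line 369 question (is a lower hit rate due to memory or to expiration?): one possible signal is to count removals by cause over the lifetime of the cache. The sketch below is hypothetical and uses only the JDK; CauseTrackingCache and all of its members are invented names, not anything in CatalogdMetaProvider. It assumes an LRU policy with a fixed time-to-live, and takes the clock as an explicit argument so the behavior is deterministic. If the real cache is a Guava cache, a RemovalListener that inspects RemovalCause (SIZE vs. EXPIRED) could likely serve the same purpose.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;

/** Hypothetical sketch: LRU cache with a TTL that counts removals by cause. */
final class CauseTrackingCache<K, V> {
  private static final class Entry<V> {
    final V value;
    final long writeTimeMs;
    Entry(V value, long writeTimeMs) {
      this.value = value;
      this.writeTimeMs = writeTimeMs;
    }
  }

  private final int maxEntries;
  private final long ttlMs;
  // Access-ordered map: iteration starts at the least recently used entry.
  private final LinkedHashMap<K, Entry<V>> map =
      new LinkedHashMap<>(16, 0.75f, true);
  private long hits;
  private long misses;
  private long sizeEvictions;
  private long expirations;

  CauseTrackingCache(int maxEntries, long ttlMs) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  /** Returns the cached value, or null on a miss (absent or expired). */
  synchronized V get(K key, long nowMs) {
    Entry<V> e = map.get(key);
    if (e != null && nowMs - e.writeTimeMs >= ttlMs) {
      // Entry was present but too old: removal caused by expiration.
      map.remove(key);
      expirations++;
      e = null;
    }
    if (e == null) {
      misses++;
      return null;
    }
    hits++;
    return e.value;
  }

  /** Inserts a value, evicting least recently used entries if over capacity. */
  synchronized void put(K key, V value, long nowMs) {
    map.put(key, new Entry<>(value, nowMs));
    while (map.size() > maxEntries) {
      // Removal caused by the size limit.
      Iterator<K> it = map.keySet().iterator();
      it.next();
      it.remove();
      sizeEvictions++;
    }
  }

  synchronized long hits() { return hits; }
  synchronized long misses() { return misses; }
  synchronized long sizeEvictions() { return sizeEvictions; }
  synchronized long expirations() { return expirations; }
}
```

Under these assumptions, a cache whose removals are mostly size evictions would point at the size flag, while one dominated by expirations would point at the expiration time. Per-query numbers could still be reported in the profile alongside these lifetime counters.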