[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 5: (4 comments) added e2e test, restricted nested scopes, added rpc metrics (per query, not broken down by rpc method) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309 PS3, Line 309: throw new TException(e); > Should we track num rpcs? Done. These stats add the following lines (for a miss): ... - CatalogFetch.RPCs.Requests: 4 (4) - CatalogFetch.RPCs.Time: 337ms - CatalogFetch.RPCs.Bytes: 33.28 KB (34082) ... http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349 PS3, Line 349: return (ValueType)cache_.get(key, new Callable() { > Does it help to implement a ScopedTimerWithStats (try-with-resources way)? it might, but for three call-sites, I didn't think it was worth it right now. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363 PS3, Line 363: temString, hit.getRef() ? "hit" : "miss"); : } > this is kinda weird. Maybe overload this method that takes a single bool wh its the only call-site with this 1 | 0 logic. the rest do not use it, so I don't think its worth it to generalize it at this point. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > k I think I'll just prevent nested scopes and document it as such added the precondition and a todo to enable nested scopes. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 18 Sep 2018 16:05:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/675/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 14 Sep 2018 08:02:20 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 5: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/676/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 14 Sep 2018 07:50:10 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has uploaded a new patch set (#5) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile .. WIP IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part of the ExecRequest. The backend then updates the query profile based on the info emitted from the frontend. This patch also adds the first use case for this profile information: the CatalogdMetaProvider emits counters for cache hits, misses, and fetch times, broken down by metadata category. The emitted profile is a bit of a superset of the existing 'timeline' functionality. However, it seems that some tools may parse the timeline in its current location in the profile, so moving it might be incompatible. I elected to leave that alone for now and just emit counters in the new profile. WIP for a few reasons: - needs tests - maybe the amount of information is too verbose to include in every profile? Should we only include such info on misses, since we assume hits don't contribute to the timing much? Should we not bother breaking down by category? - worth trying to include the byte sizes of metadata fetched? Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/main/java/org/apache/impala/service/FrontendProfile.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 9 files changed, 297 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/5 -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 5 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has uploaded a new patch set (#4) to the change originally created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile .. WIP IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part of the ExecRequest. The backend then updates the query profile based on the info emitted from the frontend. This patch also adds the first use case for this profile information: the CatalogdMetaProvider emits counters for cache hits, misses, and fetch times, broken down by metadata category. The emitted profile is a bit of a superset of the existing 'timeline' functionality. However, it seems that some tools may parse the timeline in its current location in the profile, so moving it might be incompatible. I elected to leave that alone for now and just emit counters in the new profile. WIP for a few reasons: - needs tests - maybe the amount of information is too verbose to include in every profile? Should we only include such info on misses, since we assume hits don't contribute to the timing much? Should we not bother breaking down by category? - worth trying to include the byte sizes of metadata fetched? Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/main/java/org/apache/impala/service/FrontendProfile.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 9 files changed, 294 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/4 -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 4 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369 PS3, Line 369: addStatsToProfile > Yea, CacheStats will give top-level stats, but not per-object-type stats, e agreed, lets get the information out, then work on making it easier. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73 PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal profile for the > spelling: thread-local Done -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 14 Sep 2018 07:27:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43 PS3, Line 43: * This class is thread-safe. > I haven't looked much at those code paths but perhaps the "merge stats" cod the merge stats code, afaict, is run by the coordinator but in c++ so happens at an inconvenient time to send the rpc stats from the FE analysis phase. this change would make it easy to get these into the profile regardless. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 19:08:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (6 comments) http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32 PS3, Line 32: breaking down by category? > Yep, just looking at the code, I have a feeling it could be chatty and most Here's a sample of 'explain select * from lineitem; profile;' on a cold cache: Frontend: - CatalogFetch.ColumnStats.Misses: 16 (16) - CatalogFetch.ColumnStats.Requests: 16 (16) - CatalogFetch.ColumnStats.Time: 21ms - CatalogFetch.Config.Misses: 1 (1) - CatalogFetch.Config.Requests: 1 (1) - CatalogFetch.Config.Time: 37ms - CatalogFetch.DatabaseList.Hits: 1 (1) - CatalogFetch.DatabaseList.Requests: 1 (1) - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Misses: 1 (1) - CatalogFetch.PartitionLists.Requests: 1 (1) - CatalogFetch.PartitionLists.Time: 5ms - CatalogFetch.Partitions.Hits: 2 (2) - CatalogFetch.Partitions.Misses: 1 (1) - CatalogFetch.Partitions.Requests: 3 (3) - CatalogFetch.Partitions.Time: 13ms - CatalogFetch.TableNames.Hits: 1 (1) - CatalogFetch.TableNames.Requests: 1 (1) - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 (1) - CatalogFetch.Tables.Requests: 1 (1) - CatalogFetch.Tables.Time: 345ms on a warm cache: Frontend: - CatalogFetch.ColumnStats.Hits: 16 (16) - CatalogFetch.ColumnStats.Requests: 16 (16) - CatalogFetch.ColumnStats.Time: 0 - CatalogFetch.Config.Hits: 1 (1) - CatalogFetch.Config.Requests: 1 (1) - CatalogFetch.Config.Time: 0 - CatalogFetch.DatabaseList.Hits: 1 (1) - CatalogFetch.DatabaseList.Requests: 1 (1) - CatalogFetch.DatabaseList.Time: 0 - CatalogFetch.PartitionLists.Hits: 1 (1) - CatalogFetch.PartitionLists.Requests: 1 (1) - CatalogFetch.PartitionLists.Time: 0 - CatalogFetch.Partitions.Hits: 2 (2) - CatalogFetch.Partitions.Requests: 2 (2) - CatalogFetch.Partitions.Time: 0 - CatalogFetch.TableNames.Hits: 1 (1) - CatalogFetch.TableNames.Requests: 1 (1) - CatalogFetch.TableNames.Time: 0 - CatalogFetch.Tables.Misses: 1 (1) - CatalogFetch.Tables.Requests: 1 (1) - CatalogFetch.Tables.Time: 1ms (the number of lines is the same). If we think this is too long, we could simply count misses only, or we could aggregate all of the counters into a single "CatalogFetch" instead of breaking out by object type. We could also add a more specific thrift struct to profiles with its own stringification, but that seems like it might turn into quite a bit of work. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33 PS3, Line 33: - worth trying to include the byte sizes of metadata fetched? > Could be useful from incremental stats / large partitioned tables. k i'll see if it's easy to do http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282 PS3, Line 282: CacheStats > I see these are used only for tests at the moment. Is there anything worthw yea, I have a separate JIRA filed to add daemon-level stats as a metric http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369 PS3, Line 369: addStatsToProfile > these stats are per query... would it be useful to aggregate these for the Yea, CacheStats will give top-level stats, but not per-object-type stats, etc. I think providing super actionable metrics on whether a larger cache would help (and how much larger it needs to be) is a bit tricky. Two approaches I've seen: - track the actual last access time/age of items that are evicted from the cache, and expose that as a metric. If you're evicting stuff that was used in the last five minutes, maybe you'd be better off with a larger cache to fit more recent stuff. CON: since we're using version-number based invalidation for granular table metadata instead of active eviction, it is totally fine to be evicting such items if they're stale, because we know we would never hit them again. So, this might not be an accurate measure. We could probably add some background thread scanning through the cache to more aggressively evict items which are stale-version-numbered to solve this. - add a shadow/victim cache which remembers the keys of evicted items, as well as a counter of how many bytes had been evicted from the cache at the time they were evicted. If we get a miss on the main cache
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43 PS3, Line 43: * This class is thread-safe. > From the compute incremental stats side, IMO, we need the following metrics thx. I don't want to derail the discussion in this patch with incremental stats specifics. I was just interested in mechanism-- without this change, is there another way to collect these stats, in this case from the ComputeStatsStmt (an Expr node), and make them available in the profile as you were looking for? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 18:06:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43 PS3, Line 43: * This class is thread-safe. > High-level comment is that its a clear way to instrument FE stuff. Immediat >From the compute incremental stats side, IMO, we need the following metrics, (1) Time spent, stats size from on-demand incremental stats fetch RPCs (planner) (2) Time spent to merge the stats after the child queries finish I think (2) is already captured in some form (and perhaps we can make it more clear by narrowing down the scope to just measure the time taken to merge). I was hoping this change makes it easy to do (1). I'm not sure what is needed from the expression nodes. May be I'm missing something. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 18:00:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282 PS3, Line 282: CacheStats I see these are used only for tests at the moment. Is there anything worthwhile to pull from here? The profile is local to a query whereas these are for the lifetime of the cache, but perhaps there is something useful here to understand the larger context? For example, cache has a generally high hit rate but your query got unlucky. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369 PS3, Line 369: addStatsToProfile these stats are per query... would it be useful to aggregate these for the lifetime of the cache? Not sure what CacheStats tracks that differs from the metrics tracked here. ultimately, I'm wondering what helps to tune the two main flags here: size and expiration time. one answer is 'adjust them until you reach a high hit rate'.. not sure right now what would tell me if a lower hit rate is due to memory or expiration. the other perspective here is that some of the cache entries are not fine-grained (e.g., table names). is there enough info here to tell when its time to tune one of those? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43 PS3, Line 43: * This class is thread-safe. High-level comment is that its a clear way to instrument FE stuff. Immediate use-case from my end is the RPC's made in incremental-stats when pulling from catalogd. Bharath: you mentioned that more instrumentation was needed for the compute-incremental-stats change and that it should go into the profile. How were you thinking it would get there from one of the expression nodes? Just raising it in case there is some existing alternative that I've overlooked. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 17:25:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73 PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal profile for the spelling: thread-local http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Ya I think we should have some guard that prevents nested scopes. I doubt i I see now. I think maybe all that is needed is more comments, perhaps in createNewWIthScope(), explaining the existence of a stack of FrontendProfiles. Plus some exercise of the nestable scopes in the forthcoming unit test -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 16:13:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (7 comments) Some high-level comments. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@26 PS3, Line 26: Can you add a sample output here. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32 PS3, Line 32: breaking down by category? > still hoping to get some feedback on the above Yep, just looking at the code, I have a feeling it could be chatty and most users don't probably need that information. But if you can paste some sample profiles somewhere (especially with a mix of cache hits + misses, with/without partitions, multiple partial RPCs etc.) that'd be great. http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33 PS3, Line 33: - worth trying to include the byte sizes of metadata fetched? Could be useful from incremental stats / large partitioned tables. http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309 PS3, Line 309: ret = FeSupport.GetPartialCatalogObject(new TSerializer().serialize(req)); Should we track num rpcs? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349 PS3, Line 349: Stopwatch sw = new Stopwatch().start(); Does it help to implement a ScopedTimerWithStats (try-with-resources way)? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363 PS3, Line 363: /*numHits=*/hit.getRef() ? 1 : 0, : /*numMisses=*/hit.getRef() ? 0 : 1 this is kinda weird. Maybe overload this method that takes a single bool which calls the helper? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Just felt like it's good practice to treat scopes as nestable and "restore" Ya I think we should have some guard that prevents nested scopes. I doubt if anyone will use it, but still :-) -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 05:27:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32 PS3, Line 32: breaking down by category? still hoping to get some feedback on the above http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189 PS3, Line 189: private static final String DB_LIST_STATS_CATEGORY = "DatabaseList"; > Is it worth having some sort of enum for these keys? Maybe it will make the yea I think it'll add quite some boiler plate without any real gains in safety since the profiles store this stuff in a string map anyway http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Can you explain what is going on here? Is there a case where oldThreadLocal Just felt like it's good practice to treat scopes as nestable and "restore" the prior state after the scope is popped out of. Currently we have a pretty strict 1:1 query:thread kind of mapping but maybe in the future that might not be the case. I can add a Preconditions.checkState that it starts as null prior to entering the scope if you think that's cleaner until we have some fancier use cases? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 02:22:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189 PS3, Line 189: private static final String DB_LIST_STATS_CATEGORY = "DatabaseList"; Is it worth having some sort of enum for these keys? Maybe it will make the code more untidy? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); Can you explain what is going on here? Is there a case where oldThreadLocalValue_ is not null? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/595/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 00:00:45 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Hello Bharath Vissapragada, Andrew Sherman, Impala Public Jenkins, Vuk Ercegovac, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11388 to look at the new patch set (#3). Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part of the ExecRequest. The backend then updates the query profile based on the info emitted from the frontend. This patch also adds the first use case for this profile information: the CatalogdMetaProvider emits counters for cache hits, misses, and fetch times, broken down by metadata category. The emitted profile is a bit of a superset of the existing 'timeline' functionality. However, it seems that some tools may parse the timeline in its current location in the profile, so moving it might be incompatible. I elected to leave that alone for now and just emit counters in the new profile. WIP for a few reasons: - needs tests - maybe the amount of information is too verbose to include in every profile? Should we only include such info on misses, since we assume hits don't contribute to the timing much? Should we not bother breaking down by category? - worth trying to include the byte sizes of metadata fetched? Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java A fe/src/main/java/org/apache/impala/service/FrontendProfile.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 9 files changed, 294 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/3 -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950 PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning finished"); > Looks like both FE and BE set "Planning Finished" in the timeline. Is that It's certainly weird, but actually they are in two different timelines. I didn't want to change semantics with this patch, though this stuff is definitely due for a serious cleanup. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:30:29 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950 PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning finished"); Looks like both FE and BE set "Planning Finished" in the timeline. Is that wrong? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: > Patch Set 2: > > Forgot to git add FrontendProfile? Indeed... I will never learn. -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 18:07:15 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: Forgot to git add FrontendProfile? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 04:39:48 + Gerrit-HasComments: No
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 1: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/client-request-state.cc@78 PS1, Line 78: frontend_profile_(RuntimeProfile::Create(_pool_, "Frontend")), perhaps I can leave this out entirely and only instantiate it (and add as a child) in SetFrontendProfile? http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/impala-server.cc@954 PS1, Line 954: (*request_state)->SetFrontendProfile(result.profile); does anyone understand the threading model of the runtime profile stuff? could I do this inline as part of 'Exec' below instead or is this lock necessary? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 02:18:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Hello Vuk Ercegovac, I'd like you to do a code review. Please visit http://gerrit.cloudera.org:8080/11388 to review the following change. Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile This patch adds a Java wrapper for a RuntimeProfile object. The wrapper supports some basic operations like non-hierarchical counters and informational strings. During planning, a profile is created, and passed back to the backend as part of the ExecRequest. The backend then updates the query profile based on the info emitted from the frontend. This patch also adds the first use case for this profile information: the CatalogdMetaProvider emits counters for cache hits, misses, and fetch times, broken down by metadata category. The emitted profile is a bit of a superset of the existing 'timeline' functionality. However, it seems that some tools may parse the timeline in its current location in the profile, so moving it might be incompatible. I elected to leave that alone for now and just emit counters in the new profile. WIP for a few reasons: - needs tests - maybe the amount of information is too verbose to include in every profile? Should we only include such info on misses, since we assume hits don't contribute to the timing much? Should we not bother breaking down by category? - worth trying to include the byte sizes of metadata fetched? Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 --- M be/src/service/client-request-state.cc M be/src/service/client-request-state.h M be/src/service/impala-server.cc M be/src/util/runtime-profile.h M common/thrift/Frontend.thrift M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java M fe/src/main/java/org/apache/impala/service/Frontend.java M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java 8 files changed, 134 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/1 -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 1 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac