[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-18 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 5:

(4 comments)

added e2e test, restricted nested scopes, added rpc metrics (per query, not 
broken down by rpc method)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309
PS3, Line 309:   throw new TException(e);
> Should we track num rpcs?
Done. These stats add the following lines (for a miss):
...
- CatalogFetch.RPCs.Requests: 4 (4)
- CatalogFetch.RPCs.Time: 337ms
- CatalogFetch.RPCs.Bytes: 33.28 KB (34082)
...


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349
PS3, Line 349:   return (ValueType)cache_.get(key, new 
Callable() {
> Does it help to implement a ScopedTimerWithStats (try-with-resources way)?
it might, but for three call-sites, I didn't think it was worth it right now.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363
PS3, Line 363: temString, hit.getRef() ? "hit" : "miss");
 : }
> this is kinda weird. Maybe overload this method that takes a single bool wh
its the only call-site with this 1 | 0 logic. the rest do not use it, so I 
don't think its worth it to generalize it at this point.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> k I think I'll just prevent nested scopes and document it as such
added the precondition and a todo to enable nested scopes.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Tue, 18 Sep 2018 16:05:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/675/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 14 Sep 2018 08:02:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/676/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 14 Sep 2018 07:50:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-14 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#5) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
..

WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

WIP for a few reasons:
- needs tests
- maybe the amount of information is too verbose to include in every
  profile? Should we only include such info on misses, since we assume
  hits don't contribute to the timing much? Should we not bother
  breaking down by category?
- worth trying to include the byte sizes of metadata fetched?

Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/util/runtime-profile.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A fe/src/main/java/org/apache/impala/service/FrontendProfile.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
9 files changed, 297 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/5
--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 5
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-14 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has uploaded a new patch set (#4) to the change originally 
created by Todd Lipcon. ( http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP IMPALA-7527: add fetch-from-catalogd cache info to profile
..

WIP IMPALA-7527: add fetch-from-catalogd cache info to profile

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

WIP for a few reasons:
- needs tests
- maybe the amount of information is too verbose to include in every
  profile? Should we only include such info on misses, since we assume
  hits don't contribute to the timing much? Should we not bother
  breaking down by category?
- worth trying to include the byte sizes of metadata fetched?

Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/util/runtime-profile.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A fe/src/main/java/org/apache/impala/service/FrontendProfile.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
9 files changed, 294 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/4
--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 4
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-14 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369
PS3, Line 369: addStatsToProfile
> Yea, CacheStats will give top-level stats, but not per-object-type stats, e
agreed, lets get the information out, then work on making it easier.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73
PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal 
profile for the
> spelling: thread-local
Done



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 14 Sep 2018 07:27:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
> I haven't looked much at those code paths but perhaps the "merge stats" cod
the merge stats code, afaict, is run by the coordinator but in c++ so happens 
at an inconvenient time to send the rpc stats from the FE analysis phase. this 
change would make it easy to get these into the profile regardless.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 19:08:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32
PS3, Line 32:   breaking down by category?
> Yep, just looking at the code, I have a feeling it could be chatty and most
Here's a sample of 'explain select * from lineitem; profile;' on a cold cache:

  Frontend:
 - CatalogFetch.ColumnStats.Misses: 16 (16)
 - CatalogFetch.ColumnStats.Requests: 16 (16)
 - CatalogFetch.ColumnStats.Time: 21ms
 - CatalogFetch.Config.Misses: 1 (1)
 - CatalogFetch.Config.Requests: 1 (1)
 - CatalogFetch.Config.Time: 37ms
 - CatalogFetch.DatabaseList.Hits: 1 (1)
 - CatalogFetch.DatabaseList.Requests: 1 (1)
 - CatalogFetch.DatabaseList.Time: 0
 - CatalogFetch.PartitionLists.Misses: 1 (1)
 - CatalogFetch.PartitionLists.Requests: 1 (1)
 - CatalogFetch.PartitionLists.Time: 5ms
 - CatalogFetch.Partitions.Hits: 2 (2)
 - CatalogFetch.Partitions.Misses: 1 (1)
 - CatalogFetch.Partitions.Requests: 3 (3)
 - CatalogFetch.Partitions.Time: 13ms
 - CatalogFetch.TableNames.Hits: 1 (1)
 - CatalogFetch.TableNames.Requests: 1 (1)
 - CatalogFetch.TableNames.Time: 0
 - CatalogFetch.Tables.Misses: 1 (1)
 - CatalogFetch.Tables.Requests: 1 (1)
 - CatalogFetch.Tables.Time: 345ms

on a warm cache:
  Frontend:
 - CatalogFetch.ColumnStats.Hits: 16 (16)
 - CatalogFetch.ColumnStats.Requests: 16 (16)
 - CatalogFetch.ColumnStats.Time: 0
 - CatalogFetch.Config.Hits: 1 (1)
 - CatalogFetch.Config.Requests: 1 (1)
 - CatalogFetch.Config.Time: 0
 - CatalogFetch.DatabaseList.Hits: 1 (1)
 - CatalogFetch.DatabaseList.Requests: 1 (1)
 - CatalogFetch.DatabaseList.Time: 0
 - CatalogFetch.PartitionLists.Hits: 1 (1)
 - CatalogFetch.PartitionLists.Requests: 1 (1)
 - CatalogFetch.PartitionLists.Time: 0
 - CatalogFetch.Partitions.Hits: 2 (2)
 - CatalogFetch.Partitions.Requests: 2 (2)
 - CatalogFetch.Partitions.Time: 0
 - CatalogFetch.TableNames.Hits: 1 (1)
 - CatalogFetch.TableNames.Requests: 1 (1)
 - CatalogFetch.TableNames.Time: 0
 - CatalogFetch.Tables.Misses: 1 (1)
 - CatalogFetch.Tables.Requests: 1 (1)
 - CatalogFetch.Tables.Time: 1ms


(the number of lines is the same). If we think this is too long, we could 
simply count misses only, or we could aggregate all of the counters into a 
single "CatalogFetch" instead of breaking out by object type. We could also add 
a more specific thrift struct to profiles with its own stringification, but 
that seems like it might turn into quite a bit of work.


http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33
PS3, Line 33: - worth trying to include the byte sizes of metadata fetched?
> Could be useful from incremental stats / large partitioned tables.
k i'll see if it's easy to do


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282
PS3, Line 282: CacheStats
> I see these are used only for tests at the moment. Is there anything worthw
yea, I have a separate JIRA filed to add daemon-level stats as a metric


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369
PS3, Line 369: addStatsToProfile
> these stats are per query... would it be useful to aggregate these for the
Yea, CacheStats will give top-level stats, but not per-object-type stats, etc.

I think providing super actionable metrics on whether a larger cache would help 
(and how much larger it needs to be) is a bit tricky. Two approaches I've seen:

- track the actual last access time/age of items that are evicted from the 
cache, and expose that as a metric. If you're evicting stuff that was used in 
the last five minutes, maybe you'd be better off with a larger cache to fit 
more recent stuff.
CON: since we're using version-number based invalidation for granular table 
metadata instead of active eviction, it is totally fine to be evicting such 
items if they're stale, because we know we would never hit them again. So, this 
might not be an accurate measure. We could probably add some background thread 
scanning through the cache to more aggressively evict items which are 
stale-version-numbered to solve this.

- add a shadow/victim cache which remembers the keys of evicted items, as well 
as a counter of how many bytes had been evicted from the cache at the time they 
were evicted. If we get a miss on the main cache 

[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
> From the compute incremental stats side, IMO, we need the following metrics
thx. I don't want to derail the discussion in this patch with incremental stats 
specifics. I was just interested in mechanism-- without this change, is there 
another way to collect these stats, in this case from the ComputeStatsStmt (an 
Expr node), and make them available in the profile as you were looking for?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 18:06:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
> High-level comment is that its a clear way to instrument FE stuff. Immediat
>From the compute incremental stats side, IMO, we need the following metrics,

(1) Time spent, stats size from on-demand incremental stats fetch RPCs (planner)
(2) Time spent to merge the stats after the child queries finish

I think (2) is already captured in some form (and perhaps we can make it more 
clear by narrowing down the scope to just measure the time taken to merge).

I was hoping this change makes it easy to do (1). I'm not sure what is needed 
from the expression nodes. May be I'm missing something.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 18:00:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Vuk Ercegovac (Code Review)
Vuk Ercegovac has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@282
PS3, Line 282: CacheStats
I see these are used only for tests at the moment. Is there anything worthwhile 
to pull from here? The profile is local to a query whereas these are for the 
lifetime of the cache, but perhaps there is something useful here to understand 
the larger context? For example, cache has a generally high hit rate but your 
query got unlucky.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@369
PS3, Line 369: addStatsToProfile
these stats are per query... would it be useful to aggregate these for the 
lifetime of the cache? Not sure what CacheStats tracks that differs from the 
metrics tracked here.

ultimately, I'm wondering what helps to tune the two main flags here: size and 
expiration time. one answer is 'adjust them until you reach a high hit rate'.. 
not sure right now what would tell me if a lower hit rate is due to memory or 
expiration.

the other perspective here is that some of the cache entries are not 
fine-grained (e.g., table names). is there enough info here to tell when its 
time to tune one of those?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@43
PS3, Line 43:  * This class is thread-safe.
High-level comment is that its a clear way to instrument FE stuff. Immediate 
use-case from my end is the RPC's made in incremental-stats when pulling from 
catalogd.

Bharath: you mentioned that more instrumentation was needed for the 
compute-incremental-stats change and that it should go into the profile. How 
were you thinking it would get there from one of the expression nodes? Just 
raising it in case there is some existing alternative that I've overlooked.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 17:25:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73
PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal 
profile for the
spelling: thread-local


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Ya I think we should have some guard that prevents nested scopes. I doubt i
I see now. I think maybe all that is needed is more comments, perhaps in 
createNewWIthScope(), explaining the existence of a stack of FrontendProfiles. 
Plus some exercise of the nestable  scopes in the forthcoming unit test



-- 
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 16:13:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(7 comments)

Some high-level comments.

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@26
PS3, Line 26:
Can you add a sample output here.


http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32
PS3, Line 32:   breaking down by category?
> still hoping to get some feedback on the above
Yep, just looking at the code, I have a feeling it could be chatty and most 
users don't probably need that information. But if you can paste some sample 
profiles somewhere (especially with a mix of cache hits + misses, with/without 
partitions, multiple partial RPCs etc.) that'd be great.


http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@33
PS3, Line 33: - worth trying to include the byte sizes of metadata fetched?
Could be useful from incremental stats / large partitioned tables.


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@309
PS3, Line 309:   ret = FeSupport.GetPartialCatalogObject(new 
TSerializer().serialize(req));
Should we track num rpcs?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@349
PS3, Line 349: Stopwatch sw = new Stopwatch().start();
Does it help to implement a ScopedTimerWithStats (try-with-resources way)?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@363
PS3, Line 363: /*numHits=*/hit.getRef() ? 1 : 0,
 :   /*numMisses=*/hit.getRef() ? 0 : 1
this is kinda weird. Maybe overload this method that takes a single bool which 
calls the helper?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Just felt like it's good practice to treat scopes as nestable and "restore"
Ya I think we should have some guard that prevents nested scopes. I doubt if 
anyone will use it, but still :-)



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 05:27:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11388/3//COMMIT_MSG@32
PS3, Line 32:   breaking down by category?
still hoping to get some feedback on the above


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189
PS3, Line 189:   private static final String DB_LIST_STATS_CATEGORY = 
"DatabaseList";
> Is it worth having some sort of enum for these keys? Maybe it will make the
yea I think it'll add quite some boiler plate without any real gains in safety 
since the profiles store this stuff in a string map anyway


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Can you explain what is going on here? Is there a case where oldThreadLocal
Just felt like it's good practice to treat scopes as nestable and "restore" the 
prior state after the scope is popped out of. Currently we have a pretty strict 
1:1 query:thread kind of mapping but maybe in the future that might not be the 
case. I can add a Preconditions.checkState that it starts as null prior to 
entering the scope if you think that's cleaner until we have some fancier use 
cases?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 02:22:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189
PS3, Line 189:   private static final String DB_LIST_STATS_CATEGORY = 
"DatabaseList";
Is it worth having some sort of enum for these keys? Maybe it will make the 
code more untidy?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
Can you explain what is going on here? Is there a case where 
oldThreadLocalValue_ is not null?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/595/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 00:00:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Hello Bharath Vissapragada, Andrew Sherman, Impala Public Jenkins, Vuk 
Ercegovac,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11388

to look at the new patch set (#3).

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..

WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

WIP for a few reasons:
- needs tests
- maybe the amount of information is too verbose to include in every
  profile? Should we only include such info on misses, since we assume
  hits don't contribute to the timing much? Should we not bother
  breaking down by category?
- worth trying to include the byte sizes of metadata fetched?

Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/util/runtime-profile.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
A fe/src/main/java/org/apache/impala/service/FrontendProfile.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
9 files changed, 294 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/3
--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950
PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning 
finished");
> Looks like both FE and BE set "Planning Finished" in the timeline. Is that
It's certainly weird, but actually they are in two different timelines. I 
didn't want to change semantics with this patch, though this stuff is 
definitely due for a serious cleanup.



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:30:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950
PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning 
finished");
Looks like both FE and BE set "Planning Finished" in the timeline. Is that 
wrong?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

> Patch Set 2:
>
> Forgot to git add FrontendProfile?

Indeed... I will never learn.


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 18:07:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-04 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

Forgot to git add FrontendProfile?


--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 04:39:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-04 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/client-request-state.cc@78
PS1, Line 78: frontend_profile_(RuntimeProfile::Create(_pool_, 
"Frontend")),
perhaps I can leave this out entirely and only instantiate it (and add as a 
child) in SetFrontendProfile?


http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/1/be/src/service/impala-server.cc@954
PS1, Line 954: (*request_state)->SetFrontendProfile(result.profile);
does anyone understand the threading model of the runtime profile stuff? could 
I do this inline as part of 'Exec' below instead or is this lock necessary?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 02:18:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-04 Thread Todd Lipcon (Code Review)
Hello Vuk Ercegovac,

I'd like you to do a code review. Please visit

http://gerrit.cloudera.org:8080/11388

to review the following change.


Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..

WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

This patch adds a Java wrapper for a RuntimeProfile object. The wrapper
supports some basic operations like non-hierarchical counters and
informational strings.

During planning, a profile is created, and passed back to the backend as
part of the ExecRequest. The backend then updates the query profile
based on the info emitted from the frontend.

This patch also adds the first use case for this profile information:
the CatalogdMetaProvider emits counters for cache hits, misses, and
fetch times, broken down by metadata category.

The emitted profile is a bit of a superset of the existing 'timeline'
functionality. However, it seems that some tools may parse the timeline
in its current location in the profile, so moving it might be
incompatible. I elected to leave that alone for now and just emit
counters in the new profile.

WIP for a few reasons:
- needs tests
- maybe the amount of information is too verbose to include in every
  profile? Should we only include such info on misses, since we assume
  hits don't contribute to the timing much? Should we not bother
  breaking down by category?
- worth trying to include the byte sizes of metadata fetched?

Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
---
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-server.cc
M be/src/util/runtime-profile.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/catalog/local/CatalogdMetaProviderTest.java
8 files changed, 134 insertions(+), 50 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/11388/1
--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 1
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac