[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-05 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/runtime/exec-env.cc@364
PS8, Line 364:
 : TUpdateExecutorMembershipRequest update_req;
 : for (const auto& it : snapshot->current_backends) {
 :   const TBackendDescriptor& backend = it.second;
 :   if (backend.is_executor) {
 : 
update_req.hostnames.insert(backend.address.hostname);
 : update_req.ip_addresses.insert(backend.ip_address);
 : update_req.num_executors++;
 :   }
 : }
 : Status status = 
this->frontend()->UpdateExecutorMembership(update_req);
 : if (!status.ok()) {
 :   LOG(WARNING) << "Error updating frontend membership 
snapshot: "
 :<< status.GetDetail();
 : }
I'm inclined to think that this code should not be in the ExecEnv (but we can 
postpone that to a cleanup change). I'm aware that the typedefs inside the CMM 
cannot be forward-declared and we shouldn't include it in the frontend, but we 
should look into pulling the typedefs out of the CMM and then simplify this 
code by moving it into the destination of the callbacks.


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/runtime/exec-env.cc@494
PS8, Line 494: std::
nit: remove std:: in .cc files


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/runtime/exec-env.cc@496
PS8, Line 496:   current_backend_set.insert(it.second.address);
Same as comment in the other callback


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/admission-controller.h@948
PS8, Line 948: load metric of the 'grp_name' executor group by 'delta'
While reading this I was confused what 'delta' is, maybe instead of "query 
load" we should use "query count" (load is often a value between 0 and 1)? 
'num_queries' might also be a good parameter name if we rename the method to 
something like "IncExecGroupMetricLocked()".


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.h@124
PS8, Line 124:   typedef std::function UpdateCallbackFn;
Previously, UpdateLocalServerFn was only called when backends had been deleted. 
I think it would be nice to preserve that, e.g. by adding a bool 
can_contain_deletes (feel free to find a better name) to the callback.


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.cc@321
PS8, Line 321:   SetStateAndUpdateMetrics(new_state);
I know I asked for the metric update to be moved there, but now that we 
generalized the callback mechanism, I think it could be nice to update the 
metrics in a callback. What do you think?


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.cc@322
PS8, Line 322:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/14103/8/be/src/scheduling/cluster-membership-mgr.cc@323
PS8, Line 323: NotifyAllCallbackUpdateFns
I think NotifyListeners(new_state) might be more concise but clear enough


http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json@2502
PS6, Line 2502: ad.$0"
> do you think num-running-queries is more appropriate here?
maybe even num-running is ok (see my comment elsewhere about "load"). How do we 
call other metrics that track number of queries?



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 05 Sep 2019 21:59:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4463/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 04 Sep 2019 22:33:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-04 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4462/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Wed, 04 Sep 2019 22:26:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-04 Thread Bikramjeet Vig (Code Review)
Hello Andrew Sherman, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14103

to look at the new patch set (#8).

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..

IMPALA-8858: Add observability of the query load on executor groups

With this patch, all executor groups with at least one executor
will have a metric added that display the number of queries
(admitted by the local coordinator) running on them. The metric
is removed only when the group has no executors or no longer
exists. It gets updated when either the cluster membership
changes or a query gets admitted or released by the admission
controller. Also adds the ability to delete metrics from a metric
group after registration.

Testing:
Added a custom cluster test and a BE metric test.

Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_executor_groups.py
12 files changed, 265 insertions(+), 135 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/14103/8
--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 8
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-04 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h@943
PS6, Line 943:   /// Updates the list of executor groups for which we maintain 
the query load metrics.
> "groups for which we maintain..."
Done


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h@944
PS6, Line 944: from the metric gr
> Should we consider keeping groups with >0 executors around, even if they're
With how the current cancellation logic works, the query is still counted as 
running against a backend by the impala-server till it gets un-registered, 
which means that it would be cancelled regardless of whether it has completed 
its fragments on a particular backend.

However, I still agree with keeping non-zero groups around since they can still 
have an intermediate non zero value for query load which can prove useful.


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc@320
PS6, Line 320:   dequeue_thread_->Join();
> Can the argument be a const&?
changed this mechanism, switched to sending a snapshotPtr instead.


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc@1852
PS6, Line 1852:   for (const auto& group : snapshot->executor_groups) {
> I feel that this function's body could benefit from a comment explaining wh
Done


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/cluster-membership-mgr.h@135
PS6, Line 135:   /// events. They may be called at any time before or after 
calling Init() and are
> The comment should include what gets passed into the callback, or alternati
changed the update mechanism


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@342
PS3, Line 342: base_snapshot = current_membership_.get();
> I'm still in favor of this.
Done.
Note: this adds a little extra work to the critical path since now we cant 
optimize work for specific calls:
- NotifyLocalServerForDeletedBackend called every time and not just when 
update_local_server is true
- Admission controller needs to separately create a set a  groups instead of 
leveraging the set previously created during metric updates.

These are pretty minor + cluster changes would be way less frequent in practice 
so shouldn't matter much.


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@346
PS3, Line 346:   // copying the Snapshot if it isn't.
> ?
Done


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@551
PS3, Line 551:
 :
 :
> What was the reason for this change?
just seemed consistent with other update calls to send the whole snapshotPtr


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/util/metrics.h@455
PS6, Line 455:   typedef std::unordered_map> MetricMap;
> Is there a reason this map needs to be ordered? If so, can you add it to th
oops, didnt notice that a map was being used instead of an unordered_map. Not 
sure what the initial intention was but I dont see any reason why it would 
require an ordered map. changing it to unordered_map. Same goes for 
ChildGroupMap below.


http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json@2495
PS6, Line 2495:   "description": "Total number of queries running on executor 
group: $0",
> I'd suggest moving the $0 to the back of the string to make it slightly eas
makes sense to me. Moving it to the end.


http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json@2502
PS6, Line 2502: ad.$0"
do you think num-running-queries is more appropriate here?



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew 

[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-09-04 Thread Bikramjeet Vig (Code Review)
Hello Andrew Sherman, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14103

to look at the new patch set (#7).

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..

IMPALA-8858: Add observability of the query load on executor groups

With this patch, all executor groups with at least one executor
will have a metric added that display the number of queries
(admitted by the local coordinator) running on them. The metric
is removed only when the group has no executors or no longer
exists. It gets updated when either the cluster membership
changes or a query gets admitted or released by the admission
controller. Also adds the ability to delete metrics from a metric
group after registration.

Testing:
Added a custom cluster test and a BE metric test.

Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
---
M be/src/runtime/exec-env.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_executor_groups.py
12 files changed, 264 insertions(+), 135 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/14103/7
--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 7
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-29 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 6:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h@943
PS6, Line 943:   /// Updates the list of executor groups that we maintain the 
query load metrics for.
"groups for which we maintain..."


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.h@944
PS6, Line 944: or are not healthy
Should we consider keeping groups with >0 executors around, even if they're not 
healthy? In theory such a group could even run a query if an executor is not 
involved in execution of the query or finishes early before failing.


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc@320
PS6, Line 320:   [this](std::unordered_set healthy_grp_names) {
Can the argument be a const&?


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/admission-controller.cc@1852
PS6, Line 1852:   while (it != healthy_exec_group_query_load_map_.end()) {
I feel that this function's body could benefit from a comment explaining what 
it does.


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/scheduling/cluster-membership-mgr.h@135
PS6, Line 135:   /// 'update_membership_lock_' is held when calling this 
callback.
The comment should include what gets passed into the callback, or alternative a 
typedef could make it more clear (see above).


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@342
PS3, Line 342:   UpdateFrontendExecutorMembership(*new_backend_map, 
*new_executor_groups);
> I think it would be cleaner to call all callbacks from a single spot. That
I'm still in favor of this.


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@346
PS3, Line 346:   UpdateMetrics(new_state);
> Should we call UpdateMetrics() from SetState()?
?


http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/14103/6/be/src/util/metrics.h@455
PS6, Line 455:   typedef std::map> 
MetricMap;
Is there a reason this map needs to be ordered? If so, can you add it to the 
comment?


http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/14103/6/common/thrift/metrics.json@2495
PS6, Line 2495:   "description": "Total number of queries running on $0 
executor group",
I'd suggest moving the $0 to the back of the string to make it slightly easier 
to read (and consistent with the label), but it's personal preference and I 
don't feel strongly about it.



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 29 Aug 2019 20:54:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-28 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4427/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 29 Aug 2019 02:19:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-28 Thread Bikramjeet Vig (Code Review)
Hello Andrew Sherman, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14103

to look at the new patch set (#6).

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..

IMPALA-8858: Add observability of the query load on executor groups

With this patch, all healthy executor groups will have a metric added
that display the number of queries (admitted by the local coordinator)
running on them. The metric for a group is added when a it is healthy
and removed when its either not healthy or no longer exists. It gets
updated when either the cluster membership changes or a query gets
admitted or released by the admission controller.
Also adds the ability to delete metrics from a metric group after
registration.

Testing:
Added a custom cluster test and a BE metric test.

Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_executor_groups.py
9 files changed, 210 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/14103/6
--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-28 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 6:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@334
PS5, Line 334:  invalidat
> nit: reword, something like "users of this class", or just mention that del
Done


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@375
PS5, Line 375:
> nit: It logs an error though, so a user shouldn't just call this without be
As suggested, removed the concept of transient metrics


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@456
PS5, Line 456:   MetricMap metric_map_;
> The separation between the pool and the map seems complicated. Should we co
Good idea. I quickly scanned through the metrics in kudu/util and that seems to 
use something similar instead of an objectPool.


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@462
PS5, Line 462: Produces a
> why not use a unique_ptr?
we need a  type smart pointer that supports type erasure so we 
can store multiple subclass Metric types in the same map.
This post explains it pretty well https://stackoverflow.com/a/39288979



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 6
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Thu, 29 Aug 2019 01:40:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-27 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 5:

(4 comments)

Only looked at the deletion logic

http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h
File be/src/util/metrics.h:

http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@334
PS5, Line 334: The client
nit: reword, something like "users of this class", or just mention that 
deletion invalidates pointers to the deleted metrics (but no other pointers).


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@375
PS5, Line 375: Is a no-op for non-transient metrics.
nit: It logs an error though, so a user shouldn't just call this without being 
reasonably sure that they're deleting a transient metric.


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@456
PS5, Line 456:   boost::scoped_ptr obj_pool_;
The separation between the pool and the map seems complicated. Should we 
consider switching to the map altogether? Then metrics could just have a flag, 
or we could remove the distinction entirely.


http://gerrit.cloudera.org:8080/#/c/14103/5/be/src/util/metrics.h@462
PS5, Line 462: shared_ptr
why not use a unique_ptr?



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Tue, 27 Aug 2019 23:24:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4357/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 24 Aug 2019 01:03:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/4358/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 24 Aug 2019 00:59:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Bikramjeet Vig (Code Review)
Hello Andrew Sherman, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14103

to look at the new patch set (#5).

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..

IMPALA-8858: Add observability of the query load on executor groups

With this patch, all healthy executor groups will have a metric added
that display the number of queries (admitted by the local coordinator)
running on them. The metric for a group is added when a it is healthy
and removed when its either not healthy or no longer exists. It gets
updated when either the cluster membership changes or a query gets
admitted or released by the admission controller.

Testing:
Added a custom cluster test and a BE metric test.

Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_executor_groups.py
8 files changed, 244 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/14103/5
--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 5
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 4:

@all Please let me know if the changes to enable metric deletion make sense.


--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 24 Aug 2019 00:14:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14103/4/tests/custom_cluster/test_executor_groups.py
File tests/custom_cluster/test_executor_groups.py:

http://gerrit.cloudera.org:8080/#/c/14103/4/tests/custom_cluster/test_executor_groups.py@387
PS4, Line 387: =
> flake8: E711 comparison to None should be 'if cond is None:'
Done



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 24 Aug 2019 00:15:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Bikramjeet Vig (Code Review)
Hello Andrew Sherman, Lars Volker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14103

to look at the new patch set (#4).

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..

IMPALA-8858: Add observability of the query load on executor groups

With this patch, all healthy executor groups will have a metric added
that display the number of queries (admitted by the local coordinator)
running on them. The metric for a group is added when a it is healthy
and removed when its either not healthy or no longer exists. It gets
updated when either the cluster membership changes or a query gets
admitted or released by the admission controller.

Testing:
Added a custom cluster test and a BE metric test.

Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
---
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/util/metrics-test.cc
M be/src/util/metrics.h
M common/thrift/metrics.json
M tests/custom_cluster/test_executor_groups.py
8 files changed, 244 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/03/14103/4
--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 


[Impala-ASF-CR] IMPALA-8858: Add observability of the query load on executor groups

2019-08-23 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of the query load on executor 
groups
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14103/4/tests/custom_cluster/test_executor_groups.py
File tests/custom_cluster/test_executor_groups.py:

http://gerrit.cloudera.org:8080/#/c/14103/4/tests/custom_cluster/test_executor_groups.py@387
PS4, Line 387: =
flake8: E711 comparison to None should be 'if cond is None:'



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 4
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Sat, 24 Aug 2019 00:14:15 +
Gerrit-HasComments: Yes