Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14103 )

Change subject: IMPALA-8858: Add observability of idle executor groups
......................................................................


Patch Set 3:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/14103/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14103/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-8858: Add observability of idle executor groups
I think it would be good to also expose the total number of healthy executor 
groups per pool while you're at it. This would help to determine overall 
service readiness (e.g. some users may want to wait until at least one healthy 
group is available).


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/admission-controller.h@408
PS3, Line 408: seperated
nit: typo


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/admission-controller.cc@83
PS3, Line 83:     "admission-controller.executor-groups.total-healthy-idle";
Do we actually need these per resource pool? In the future we might have 
multiple pools and we already have the association between groups and pools 
through the name-prefix.


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/admission-controller.cc@1852
PS3, Line 1852: vector<std::string> GetIdleExecutorGroupNames(
This should probably call GetExecutorGroupsForPool() (see previous comment)


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.h
File be/src/scheduling/cluster-membership-mgr.h:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.h@136
PS3, Line 136: Metric
Remove the word "Metric", it's an implementation detail what the receiver of 
the callback does with it.


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.h@221
PS3, Line 221: Updating
nit lowercase


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc
File be/src/scheduling/cluster-membership-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@342
PS3, Line 342:   UpdateFrontendExecutorMembership(*new_backend_map, 
*new_executor_groups);
I think it would be cleaner to call all callbacks from a single spot. That will 
also allow us to move them into a list and loop over them, if we add even more.


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@346
PS3, Line 346:   UpdateMetrics(new_state);
Should we call UpdateMetrics() from SetState()?


http://gerrit.cloudera.org:8080/#/c/14103/3/be/src/scheduling/cluster-membership-mgr.cc@551
PS3, Line 551: SnapshotPtr new_state){
             :   const BackendIdMap& current_backends = 
new_state->current_backends;
             :   const ExecutorGroups& executor_groups = 
new_state->executor_groups;
What was the reason for this change?


http://gerrit.cloudera.org:8080/#/c/14103/3/common/thrift/metrics.json
File common/thrift/metrics.json:

http://gerrit.cloudera.org:8080/#/c/14103/3/common/thrift/metrics.json@2485
PS3, Line 2485: seperated
nit: typo


http://gerrit.cloudera.org:8080/#/c/14103/3/tests/custom_cluster/test_executor_groups.py
File tests/custom_cluster/test_executor_groups.py:

http://gerrit.cloudera.org:8080/#/c/14103/3/tests/custom_cluster/test_executor_groups.py@341
PS3, Line 341:     self._add_executor_group(group_name_suffixes[0], 1, 
max_concurrent_queries=1)
Add num_executors parameter to be explicit, maybe also spell out min_size= here 
and below?

I think it would be good to express the intent in a comment, too, in particular 
why you're creating them with different sizes.


http://gerrit.cloudera.org:8080/#/c/14103/3/tests/custom_cluster/test_executor_groups.py@346
PS3, Line 346: self.coordinator.service.get_metric_value(
             :       
"admission-controller.executor-groups.total-healthy-idle").split(",")
A helper like self._get_idle_groups() might make this more readable.


http://gerrit.cloudera.org:8080/#/c/14103/3/tests/custom_cluster/test_executor_groups.py@362
PS3, Line 362:     # Splitting on an empty string with a delimiter returns ['']
This quirk could then be hidden inside the helper, too



--
To view, visit http://gerrit.cloudera.org:8080/14103
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58cde8699c33af8b87273437e9d8bf6371a34539
Gerrit-Change-Number: 14103
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Andrew Sherman <asher...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Comment-Date: Fri, 23 Aug 2019 16:36:09 +0000
Gerrit-HasComments: Yes

Reply via email to