Hello Andrew Sherman, Michael Ho, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/13550

to look at the new patch set (#10).

Change subject: IMPALA-8484: Run queries on disjoint executor groups
......................................................................

IMPALA-8484: Run queries on disjoint executor groups

This change adds support for running queries inside a single admission
control pool on one of several, disjoint sets of executors called
"executor groups".

Executors can be configured with an executor group through the newly
added '--executor_groups' flag. Note that in anticipation of future
changes, the flag already uses the plural form, but only a single
executor group may be specified for now. Each executor group
specification can optionally contain a minimum size, separated by a
':', e.g. --executor_groups default-pool-1:3. Only when the cluster
membership contains at least that number of executors for the groups
will it be considered for admission.

Executor groups are mapped to resource pools by their name: An executor
group can service queries from a resource pool if the pool name is a
prefix of the group name separated by a '-'. For example, queries in
poll poolA can be serviced by executor groups named poolA-1 and poolA-2,
but not by groups name foo or poolB-1.

During scheduling, executor groups are considered in alphabetical order.
This means that one group is filled up entirely before a subsequent
group is considered for admission. Groups also need to pass a health
check before considered. In particular, they must contain at least the
minimum number of executors specified.

If no group is specified during startup, executors are added to the
a default executor group. If - during admission - no executor group for
a pool can be found and the default group is non-empty, then the default
group is considered. The default group does not have a minimum size.

This change inverts the order of scheduling and admission. Prior to this
change, queries were scheduled before submitting them to the admission
controller. Now the admission controller computes schedules for all
candidate executor groups before each admission attempt. If the cluster
membership has not changed, then the schedules of the previous attempt
will be reused. This means that queries will no longer fail if the
cluster membership changes while they are queued in the admission
controller.

Testing:

This change adds a new custom cluster test for executor groups. It
makes use of new capabilities added to start-impala-cluster.py to bring
up additional executors into an already running cluster.

Additionally, this change adds an instructional implementation of
executor group based autoscaling, which can be used during development.
It also adds a helper to run queries concurrently. Both are used in a
new test to exercise the executor group logic and to prevent regressions
to these tools.

In addition to these tests, the existing tests for the admission
controller (both BE and EE tests) thoroughly exercise the changed code.
Some of them required changes themselves to reflect the new behavior.

TODO: Loop the new tests for a day to get some confidence that they're
not flaky.

Known limitations:

When using executor groups, only a single coordinator and a single AC
pool are supported. Executors to not include the number of currently
running queries in their statestore updates and so admission controllers
are not aware of the number of queries admitted by other controllers per
host.

Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b
---
M be/src/runtime/coordinator.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/scheduling/cluster-membership-mgr-test.cc
M be/src/scheduling/cluster-membership-mgr.cc
M be/src/scheduling/cluster-membership-mgr.h
M be/src/scheduling/cluster-membership-test-util.cc
M be/src/scheduling/cluster-membership-test-util.h
M be/src/scheduling/executor-group-test.cc
M be/src/scheduling/executor-group.cc
M be/src/scheduling/executor-group.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler-test-util.cc
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options.h
M be/src/util/runtime-profile.h
M bin/start-impala-cluster.py
M common/thrift/StatestoreService.thrift
M tests/common/custom_cluster_test_suite.py
M tests/common/impala_cluster.py
M tests/common/impala_service.py
M tests/common/resource_pool_config.py
M tests/custom_cluster/test_admission_controller.py
A tests/custom_cluster/test_auto_scaling.py
M tests/custom_cluster/test_catalog_wait.py
A tests/custom_cluster/test_executor_groups.py
M tests/custom_cluster/test_restart_services.py
M tests/query_test/test_observability.py
A tests/util/auto_scaler.py
A tests/util/concurrent_workload.py
M www/backends.tmpl
41 files changed, 2,540 insertions(+), 665 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/13550/10
--
To view, visit http://gerrit.cloudera.org:8080/13550
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8a1d0900f2a82bd2fc0a906cc094e442cffa189b
Gerrit-Change-Number: 13550
Gerrit-PatchSet: 10
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Andrew Sherman <[email protected]>
Gerrit-Reviewer: Bikramjeet Vig <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>

Reply via email to