[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Testing: - Added a test that checks if the admission debug page loads correctly and checks if the reset informational stats http end point works as expected. - Did manual stress testing by running the e2e tests and constantly fetching the admission debug page. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Reviewed-on: http://gerrit.cloudera.org:8080/12244 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h M tests/webserver/test_web_pages.py A www/admission_controller.tmpl M www/common-header.tmpl 17 files changed, 787 insertions(+), 38 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 8 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 22:52:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3698/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 18:47:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 7 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 18:47:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 6: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 14:14:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1951/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 03:23:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12244 to look at the new patch set (#6). Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Testing: - Added a test that checks if the admission debug page loads correctly and checks if the reset informational stats http end point works as expected. - Did manual stress testing by running the e2e tests and constantly fetching the admission debug page. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h M tests/webserver/test_web_pages.py A www/admission_controller.tmpl M www/common-header.tmpl 17 files changed, 787 insertions(+), 38 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/6 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 6 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@15 PS3, Line 15: - Histogram of the distribution of peak memory used by queries admitted > Yeah at the host level - it would be useful to expose that for debugging to Let me take it up in a separate commit. I think '/backends' might be the best place to put that info http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py@402 PS5, Line 402: > Does this need to run serially? Presumably a concurrent test could admit a Done -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 01 Feb 2019 02:58:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 5: Code-Review+2 (5 comments) I think this is basically good except for my question about running the test serially. http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@15 PS3, Line 15: - Histogram of the distribution of peak memory used by queries admitted > AC already keeps track of mem admitted and reserved per host, but it is onl Yeah at the host level - it would be useful to expose that for debugging too (in a separate section from the per-pool metrics I guess). We have seen cases before where queries couldn't be admitted because of a single host being oversubscribed. http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@251 PS4, Line 251: int64_t > Ack I think the unsigned vs signed int thing is controversial and I've heard a lot of conflicting opinions. E.g. http://wesmckinney.com/blog/avoid-unsigned-integers/,http://blog.robertelder.org/signed-or-unsigned/ We generally prefer signed integers, consistent with the google C++ style guide: https://google.github.io/styleguide/cppguide.html#Integer_Types. In practice int64_t is large enough for any practical value of bytes and underflow when unsigned ints get involved does seem to lead to hard-to-detect bugs in practice. http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc@44 PS4, Line 44: const int64_t AdmissionController::PoolStats::HISTOGRAM_NUM_OF_BINS = 128; : const int64_t AdmissionController::PoolStats::HISTOGRAM_BIN_SIZE = 1024L * 1024L * 1024L; > I hear you about adding more startup flags but a static granularity might m It doesn't seem worth adding a startup flag here (it may also be difficult to use since it would require restarting the coordinators to take effect). Agree these histograms won't be able to answer every question but I'm ok with deferring figuring that out until we've either got time to think about it or have some feedback on how people are using the histograms (or if they are). Functionality for more complex analysis of query workloads might better belong in a separate tool too. http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@383 PS4, Line 383: """Sanity check for the admission debug page's http end points (both admission and : reset stats end points).""" > Makes sense. Yeah I generally feel like writing complex end-to-end tests for validating this is not necessarily worth the cost (both writing the tests initially and the cost of maintaining them if we tweak the debug pages). We should be at least manually inspecting the debug pages if we're changing them anyway - it's very difficult to write tests that would catch all UI issues, even those that are really obvious to a human. http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py@402 PS5, Line 402: assert response_json[0]['total_admitted'] == 0 Does this need to run serially? Presumably a concurrent test could admit a query after the stats got reset? -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 31 Jan 2019 09:12:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Pooja Nilangekar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 5: Code-Review+1 (3 comments) http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@251 PS4, Line 251: int64_t > You are right. I used int64_t just because the structs that are used to kee Ack http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc@44 PS4, Line 44: const int64_t AdmissionController::PoolStats::HISTOGRAM_NUM_OF_BINS = 128; : const int64_t AdmissionController::PoolStats::HISTOGRAM_BIN_SIZE = 1024L * 1024L * 1024L; > I was considering that myself but was reluctant to add more startup flags a I hear you about adding more startup flags but a static granularity might make the histogram less useful for debugging (especially if most queries fall in the < 1GB memory range). Also a small number of queries would be using up large amounts of memory > 100 GB which got me thinking it might make sense to have some flexibility. http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@383 PS4, Line 383: """Sanity check for the admission debug page's http end points (both admission and : reset stats end points).""" > I initially thought about writing a test to verify all fields on the page, Makes sense. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 31 Jan 2019 00:09:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1938/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 21:56:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 4: (8 comments) http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@251 PS4, Line 251: int64_t > I might be missing something, but why are we using int64_t here and in the You are right. I used int64_t just because the structs that are used to keep track of this metric use int64_t so there was no point switching to uint http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@264 PS4, Line 264: that > Nit: that -> to which Done http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc@44 PS4, Line 44: const int64_t AdmissionController::PoolStats::HISTOGRAM_NUM_OF_BINS = 128; : const int64_t AdmissionController::PoolStats::HISTOGRAM_BIN_SIZE = 1024L * 1024L * 1024L; > Would it help if these were tunable parameters? (eg. startup flags) I was considering that myself but was reluctant to add more startup flags and considered these values to generally cover most use cases. @pooja @tim would you recommend I add these as startup flags? http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/service/impala-http-handler.cc@882 PS4, Line 882: int64_t mem_limit; : int64_t mem_limit_for_admission; > Similar question as before since num_backends can be unsigned. I am wonderi same reason as the previous one. http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@383 PS4, Line 383: """Sanity check for the admission debug page's http end points (both admission and : reset stats end points).""" > Would it be possible to create a long running query which takes up all the I initially thought about writing a test to verify all fields on the page, but that would entail writing a pretty long custom cluster test and I wasn't sure how extensively we should write tests for debug pages. I dont feel too strongly either way, let me know what you think http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@411 PS4, Line 411: > flake8: E251 unexpected spaces around keyword / parameter equals Done http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@411 PS4, Line 411: > flake8: E251 unexpected spaces around keyword / parameter equals Done http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@421 PS4, Line 421: > flake8: E222 multiple spaces after operator Done -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 21:18:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12244 to look at the new patch set (#5). Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Testing: - Added a test that checks if the admission debug page loads correctly and checks if the reset informational stats http end point works as expected. - Did manual stress testing by running the e2e tests and constantly fetching the admission debug page. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h M tests/webserver/test_web_pages.py A www/admission_controller.tmpl 16 files changed, 780 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/5 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Pooja Nilangekar has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 4: Code-Review+1 (5 comments) Hi Bikram, This looks good mostly, I don't have any experience with css/tmpl. Just reading through that part made sense to me but I am not entirely sure about it. Rest of change makes sense. Thanks, Pooja http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@251 PS4, Line 251: int64_t I might be missing something, but why are we using int64_t here and in the ResourceUtilization structure? Given the memory consumption is always non-negative would it make sense to use uint64_t instead? http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@264 PS4, Line 264: that Nit: that -> to which (Not super critical, I initially had some trouble understanding the context). http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc@44 PS4, Line 44: const int64_t AdmissionController::PoolStats::HISTOGRAM_NUM_OF_BINS = 128; : const int64_t AdmissionController::PoolStats::HISTOGRAM_BIN_SIZE = 1024L * 1024L * 1024L; Would it help if these were tunable parameters? (eg. startup flags) Since different clusters and workloads would have varying ranges of query memory requirements, it would help if there was some flexibility in making these ranges more/less granular as required. Just a question/suggestion. This could probably be a followup. http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/service/impala-http-handler.cc@882 PS4, Line 882: int64_t mem_limit; : int64_t mem_limit_for_admission; Similar question as before since num_backends can be unsigned. I am wondering why we're storing these mem limits as signed. http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@383 PS4, Line 383: """Sanity check for the admission debug page's http end points (both admission and : reset stats end points).""" Would it be possible to create a long running query which takes up all the resources in a pool and then add submit another query to it. That way the test could verify that the queued queries are displayed as expected and also the queuing reason is added to the page. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 19:38:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1929/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 18:46:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 4: (3 comments) http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@411 PS4, Line 411: flake8: E251 unexpected spaces around keyword / parameter equals http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@411 PS4, Line 411: flake8: E251 unexpected spaces around keyword / parameter equals http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@421 PS4, Line 421: flake8: E222 multiple spaces after operator -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 17:52:00 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12244 to look at the new patch set (#4). Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Testing: Added a test that checks if the admission debug page loads correctly and checks if the reset informational stats http end point works as expected. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h M tests/webserver/test_web_pages.py A www/admission_controller.tmpl 16 files changed, 780 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/4 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 4 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 3: (4 comments) http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@15 PS3, Line 15: - Histogram of the distribution of peak memory used by queries admitted > Another thought: an important bit of the AC state is the memory available , AC already keeps track of mem admitted and reserved per host, but it is only at the host level and not a per pool per host level. is that what you were referring to ? or did you mean a per pool per host granularity? http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@21 PS3, Line 21: > Missed this first time around, but can we add a test to test_web_pages that done. will update the result of the manual testing soon http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/scheduling/admission-controller.cc@1095 PS3, Line 1095: using namespace rapidjson; > Why not include the namespace for the whole function? Done http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/service/impala-http-handler.cc@879 PS3, Line 879: form > from Done -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 30 Jan 2019 04:28:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 3: (4 comments) The code is looking good to me. Main request is testing (which I missed first time around). Does Pooja intend to review this too? http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@15 PS3, Line 15: - Histogram of the distribution of peak memory used by queries admitted Another thought: an important bit of the AC state is the memory available , admitted and reserved on each host - can we expose this in this patch on in a follow-on patch? http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@21 PS3, Line 21: Missed this first time around, but can we add a test to test_web_pages that loads the /admission webpage (maybe also with a request_pool parameter and also invokes the refresh). It would be good to have just as a sanity check. It also wouldn't be a bad idea to do a manual stress test just to try and flush out any races or crashes - e.g. run a workload against a minicluster and have a script that loads the page in a loop. I did manually look through the code for anything suspicious and it seems fine, but best to be sure. http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/scheduling/admission-controller.cc@1095 PS3, Line 1095: using namespace rapidjson; Why not include the namespace for the whole function? http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/3/be/src/service/impala-http-handler.cc@879 PS3, Line 879: form from -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 25 Jan 2019 23:41:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1890/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 25 Jan 2019 19:48:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl File www/admission_controller.tmpl: http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@283 PS1, Line 283: > This came up because I started running a stress test and was looking at the Done. added the total admitted/rejected/timed out metrics -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 25 Jan 2019 19:09:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12244 to look at the new patch set (#3). Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h A www/admission_controller.tmpl 15 files changed, 734 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/3 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 3 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl File www/admission_controller.tmpl: http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@283 PS1, Line 283: Queries Currently Running > That might be difficult to get right with the current metrics since we dont This came up because I started running a stress test and was looking at the page to see if queries were actually getting submitted to the pool - and it was hard to tell whether the queries had actually run without switching back over to my stress test terminal. Maybe we could expose the admitted/rejected/timed out metrics? Since in aggregate that would show the number of queries that had "exited" the admission control. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 25 Jan 2019 17:55:16 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1885/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 25 Jan 2019 04:31:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Hello Pooja Nilangekar, Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/12244 to look at the new patch set (#2). Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission" that provides the following details about resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details can also be viewed for a single resource pool using a search string and are also available as a JSON object from the same http endpoint. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/catalog/catalog-server.cc M be/src/rpc/rpc-trace.cc M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h M be/src/statestore/statestore.cc M be/src/util/default-path-handlers.cc M be/src/util/logging-support.cc M be/src/util/metrics.cc M be/src/util/thread.cc M be/src/util/webserver-test.cc M be/src/util/webserver.h A www/admission_controller.tmpl 15 files changed, 716 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/2 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 2 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (18 comments) Please let me know if you have any suggestions on improving the UI http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@403 PS1, Line 403: void ResetStats(); > We might want to rename this - it seems like this only resets informational Yup, i struggled with the naming part too. Going with informational for now. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@455 PS1, Line 455: EMA > I think lower-case ema is more consistent with elsewhere. Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@593 PS1, Line 593: Query > lower case? transformed the method to a lambda instead http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.cc@1098 PS1, Line 1098: bind(QueueNodeToJsonHelperCallback, &queued_queries, document, _1)); > Use a lambda? Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@145 PS1, Line 145: admission_controller > maybe just /admission to be more concise? Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@148 PS1, Line 148: resource_pool_reset > This shows up at the top of the debug page now. I think you need to pass in Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@866 PS1, Line 866: void ImpalaHttpHandler::AdmissionControllerStateHandler(const Webserver::ArgumentMap& args, > line too long (91 > 90) Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@873 PS1, Line 873: QueryInfo(const TUniqueId& q_id, int64_t memory_limit, int64_t memory_limit_for_ac, > We might be able to use a struct initializer {} and avoid this constructor. Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@886 PS1, Line 886: & > Consider just passing in the variables that need to be accessed, i.e. runni Done http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@889 PS1, Line 889: if (request_state->operation_state() != TOperationState::INITIALIZED_STATE > I think we probably want to read operation_state() once and put it in a loc Done. For some reason i assumed the iterator took lock before executing the lambda, turns out it was the map lock http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl File www/admission_controller.tmpl: PS1: > It would be nice if we had a way to show only a single resource pool on a p Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@194 PS1, Line 194: This page lists all resource pools are their corresponding stats. > It actually only lists resource pools that queries have been submitted to ( Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@210 PS1, Line 210: Requests > I think "Max Concurrent Queries" would be more intuitive (we use "Queries" Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@223 PS1, Line 223: min_query_mem_limit > should be max Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@231 PS1, Line 231: local > maybe "submitted to this coordinator" instead of "local to this coordinator Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@283 PS1, Line 283: Queries Currently Running > Is is reasonably easy to show the total number of queries that were submitt That might be difficult to get right with the current metrics since we dont keep track of the queries cancelled while in admission. Otherwise we can get the num of locally submitted queries by adding (total_admitted + total_rejected + total_timed_out + agg_num_queued). The easiest way would be to add a "total_num_submitted" metric to the pool stats. btw what is the use case that you were considering for this metric? http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@293 PS1, Line 293: Total Memory Reserved > Might be worth being more verbose here - "Total Memory Reserved Across Clus Done http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@298 PS1, Line 298: Local Memory Admitted > Maybe "Memory Admitted on this Coordinator" Done -- To view, visit http://ger
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (19 comments) This is cool - I checked it out and played around with it a bit. I think we can improve the ergonomics a bit, but I really liked just being able to see what's going on in the admission controller in real time. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@403 PS1, Line 403: void ResetStats(); We might want to rename this - it seems like this only resets informational statistics, not statistics that are used by the admission control algorithms. Not sure if there's a better name than "Informational" http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@449 PS1, Line 449: std::vector peak_mem_histogram_; > just looked at HdrHistogram, it seems like a full blown implementation of a Yeah I took a look too and it seemed non-trivial to use. I'm fine with leaving as-is for now. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@455 PS1, Line 455: EMA I think lower-case ema is more consistent with elsewhere. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@593 PS1, Line 593: Query lower case? http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.cc@1098 PS1, Line 1098: bind(QueueNodeToJsonHelperCallback, &queued_queries, document, _1)); Use a lambda? http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@145 PS1, Line 145: admission_controller maybe just /admission to be more concise? http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@148 PS1, Line 148: resource_pool_reset This shows up at the top of the debug page now. I think you need to pass in false like some of the above cases. This might be a good time to make the is_on_nav_bar argument to RegisterUrlCallback non-default. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@873 PS1, Line 873: QueryInfo(const TUniqueId& q_id, int64_t memory_limit, int64_t memory_limit_for_ac, We might be able to use a struct initializer {} and avoid this constructor. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@886 PS1, Line 886: & Consider just passing in the variables that need to be accessed, i.e. running_queries http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@889 PS1, Line 889: if (request_state->operation_state() != TOperationState::INITIALIZED_STATE I think we probably want to read operation_state() once and put it in a local rather than on two branches of the condition. It's hard to otherwise reason about what might happen if the state changes in-between the conditions being evaluated. http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@923 PS1, Line 923: // In order to embed a plain json inside the webpage generated by mustache, we need This is a little unfortunate but I think we can live with it. http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl File www/admission_controller.tmpl: PS1: It would be nice if we had a way to show only a single resource pool on a page (maybe just with an argument to filter them). Then if the pool headings linked to those pages. I'm thinking it would be convenient to be able to look at a single page and F5 it. http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@194 PS1, Line 194: This page lists all resource pools are their corresponding stats. It actually only lists resource pools that queries have been submitted to (since they are lazily created). http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@210 PS1, Line 210: Requests I think "Max Concurrent Queries" would be more intuitive (we use "Queries" elsewhere in the page. http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@223 PS1, Line 223: min_query_mem_limit should be max http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@231 PS1, Line 231: local maybe "submitted to this coordinator" instead of "local to this coordinator" - might match user-facing concepts better. http://gerrit.cloudera.org:8080/#/c/12244/1/www/admission_controller.tmpl@283 PS1, Line 283: Queries Currently Running Is is reas
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@449 PS1, Line 449: std::vector peak_mem_histogram_; > Quick initial question, can't remember if we discussed this offline - why n just looked at HdrHistogram, it seems like a full blown implementation of a histogram, not sure it its an overkill for our use case. I just went with fixed bucket for easy lookup of appropriate bucket and to avoid keeping track of the bucket-to-value mapping. Definitely a lot of ways we can do this and something more clever to allow variable range and buckets. open to suggestions. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Jan 2019 19:53:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/scheduling/admission-controller.h@449 PS1, Line 449: std::vector peak_mem_histogram_; Quick initial question, can't remember if we discussed this offline - why not use something like HdrHistogram? I guess a histogram with fixed buckets is easier to plot? -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 23 Jan 2019 17:56:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1821/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:51:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 ) Change subject: IMPALA-8092: Add an admission controller debug page .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc File be/src/service/impala-http-handler.cc: http://gerrit.cloudera.org:8080/#/c/12244/1/be/src/service/impala-http-handler.cc@866 PS1, Line 866: void ImpalaHttpHandler::AdmissionControllerStateHandler(const Webserver::ArgumentMap& args, line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Pooja Nilangekar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 18 Jan 2019 20:14:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8092: Add an admission controller debug page
Bikramjeet Vig has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12244 Change subject: IMPALA-8092: Add an admission controller debug page .. IMPALA-8092: Add an admission controller debug page This patch adds a new debug page "admission_controller" that provides the following details about all resource pools: - Pool configuration - Relevant pool stats - Queued Queries in order of being queued (local to the coordinator) - Running Queries (local to this coordinator) - Histogram of the distribution of peak memory used by queries admitted to the pool The aforementioned details are also available as JSON from the same http endpoint. Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 --- M be/src/runtime/coordinator.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/service/impala-http-handler.cc M be/src/service/impala-http-handler.h A www/admission_controller.tmpl 6 files changed, 659 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/44/12244/1 -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 1 Gerrit-Owner: Bikramjeet Vig Gerrit-Reviewer: Bikramjeet Vig