Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12244 )
Change subject: IMPALA-8092: Add an admission controller debug page ...................................................................... Patch Set 5: Code-Review+2 (5 comments) I think this is basically good except for my question about running the test serially. http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12244/3//COMMIT_MSG@15 PS3, Line 15: - Histogram of the distribution of peak memory used by queries admitted > AC already keeps track of mem admitted and reserved per host, but it is onl Yeah at the host level - it would be useful to expose that for debugging too (in a separate section from the per-pool metrics I guess). We have seen cases before where queries couldn't be admitted because of a single host being oversubscribed. http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.h@251 PS4, Line 251: int64_t > Ack I think the unsigned vs signed int thing is controversial and I've heard a lot of conflicting opinions. E.g. http://wesmckinney.com/blog/avoid-unsigned-integers/,http://blog.robertelder.org/signed-or-unsigned/ We generally prefer signed integers, consistent with the google C++ style guide: https://google.github.io/styleguide/cppguide.html#Integer_Types. In practice int64_t is large enough for any practical value of bytes and underflow when unsigned ints get involved does seem to lead to hard-to-detect bugs in practice. http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/12244/4/be/src/scheduling/admission-controller.cc@44 PS4, Line 44: const int64_t AdmissionController::PoolStats::HISTOGRAM_NUM_OF_BINS = 128; : const int64_t AdmissionController::PoolStats::HISTOGRAM_BIN_SIZE = 1024L * 1024L * 1024L; > I hear you about adding more startup flags but a static granularity might m It doesn't seem worth adding a startup flag here (it may also be difficult to use since it would require restarting the coordinators to take effect). Agree these histograms won't be able to answer every question but I'm ok with deferring figuring that out until we've either got time to think about it or have some feedback on how people are using the histograms (or if they are). Functionality for more complex analysis of query workloads might better belong in a separate tool too. http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/4/tests/webserver/test_web_pages.py@383 PS4, Line 383: """Sanity check for the admission debug page's http end points (both admission and : reset stats end points).""" > Makes sense. Yeah I generally feel like writing complex end-to-end tests for validating this is not necessarily worth the cost (both writing the tests initially and the cost of maintaining them if we tweak the debug pages). We should be at least manually inspecting the debug pages if we're changing them anyway - it's very difficult to write tests that would catch all UI issues, even those that are really obvious to a human. http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py File tests/webserver/test_web_pages.py: http://gerrit.cloudera.org:8080/#/c/12244/5/tests/webserver/test_web_pages.py@402 PS5, Line 402: assert response_json[0]['total_admitted'] == 0 Does this need to run serially? Presumably a concurrent test could admit a query after the stats got reset? -- To view, visit http://gerrit.cloudera.org:8080/12244 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iff055d9709ea1bcc2f492adcde92241b6149f766 Gerrit-Change-Number: 12244 Gerrit-PatchSet: 5 Gerrit-Owner: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Pooja Nilangekar <pooja.nilange...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Thu, 31 Jan 2019 09:12:18 +0000 Gerrit-HasComments: Yes