[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 42: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 42 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 20:22:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by appending the last known memory consumption statistics about the set of queries running or waiting on a host or in a pool to the existing memory exhaustion message. The statistics are logged in impalad.INFO when a query is queued, or queued and then timed out, due to memory pressure in the pool or on the host. The statistics can also be part of the query profile.

The new memory consumption statistics can be either stats on host or aggregated pool stats. The stats on host describe memory consumption for every pool on a host. The aggregated pool stats describe the aggregated memory consumption on all hosts for a pool.

For each stats type, information such as the query IDs and memory consumption of up to the top 5 queries is provided, in addition to the min, the max, the average and the total memory consumption for the query set.

When a query request is queued due to memory exhaustion, the new consumption statistics are logged when the BE logging level is set at 2. When a query request is timed out due to memory exhaustion, the new consumption statistics are logged when the BE logging level is set at 1.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly.
2. Added two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure.
3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy.
4. Ran Core tests successfully.
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Reviewed-on: http://gerrit.cloudera.org:8080/16220 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 924 insertions(+), 47 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 43 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
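The summary described in the commit message (up to the top 5 queries, plus min, max, average and total) can be illustrated with a small sketch. This is not the C++ code from mem-tracker.cc or admission-controller.cc; the names `MemStats` and `summarize` and the dictionary input are assumptions made only for illustration.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List, Tuple

TOP_N = 5  # the commit message reports "up to top 5 queries"


@dataclass
class MemStats:
    """Hypothetical container for one set of query memory statistics."""
    top_queries: List[Tuple[str, int]]  # (query id, bytes consumed), largest first
    num_queries: int
    min_bytes: int
    max_bytes: int
    total_bytes: int
    average_bytes: float


def summarize(consumption: Dict[str, int], top_n: int = TOP_N) -> MemStats:
    """Summarize per-query memory consumption given as {query id: bytes}."""
    if not consumption:
        # Nothing running or waiting: report an empty summary.
        return MemStats([], 0, 0, 0, 0, 0.0)
    values = list(consumption.values())
    # nlargest returns the entries sorted from largest to smallest.
    top = heapq.nlargest(top_n, consumption.items(), key=lambda kv: kv[1])
    total = sum(values)
    return MemStats(top, len(values), min(values), max(values),
                    total, total / len(values))
```

Using a bounded selection such as `heapq.nlargest` avoids sorting the whole query set when only a handful of entries are reported.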
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 42: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6355/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 42 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 15:10:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 42: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 42 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 15:10:46 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 41: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 41 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 15:10:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 41: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7031/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 41 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 28 Aug 2020 13:19:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#41). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging ..
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 924 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/41 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 41 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 40:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16220/38/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/16220/38/common/thrift/StatestoreService.thrift@72
PS38, Line 72: num_running
> num_running is the sum of queries tracked by all query mem trackers in a po
It would be nice to clarify this in the comments here.

-- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 27 Aug 2020 23:06:25 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 40: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/7008/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 26 Aug 2020 20:07:58 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#40). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging ..
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 922 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/40 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 40 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Sahil Takiar has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 38: Code-Review+1

(8 comments)

I didn't look through the actual functionality that closely, but generally the approach lgtm. Mostly comments on code.

http://gerrit.cloudera.org:8080/#/c/16220/38//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16220/38//COMMIT_MSG@7
PS38, Line 7: IMPALA-9989 Improve admission control pool stats logging
This commit message is really long; consider moving some of this info into the JIRA and shortening the commit message.

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/runtime/mem-tracker.cc@459
PS38, Line 459: heavMemoryQuery
nit: typo, and it should be heavy_memory_query, not heavMemoryQuery.

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.h@553
PS38, Line 553: const PoolMetrics* metrics() const { return _; }
What is this needed for, and can it return a const reference instead?

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.h@642
PS38, Line 642: public:
Do the two methods below need to be public?

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.h@696
PS38, Line 696: not_admitted_details("Not Applicable"),
Does this need to be set?

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.h@1054
PS38, Line 1054: /// A helper type to glue information together to compute the topN queries
: /// out of topM queries.
: typedef std::tuple Item;
: const int64_t& getMemConsumed(const Item& item) const { return std::get<0>(item); }
: /// Get either the pool or host name.
: const string& getName(const Item& item) const { return std::get<1>(item); }
: /// Get the query Id.
: const TUniqueId& getTUniqueId(const Item& item) const { return std::get<2>(item); }
: const TPoolStats* getTPoolStats(const Item& item) const { return std::get<3>(item); }
This seems like it should just be a struct, and each field in the struct should have a brief description. It's also unclear what "Item" is supposed to represent; the name itself is too generic. I see it used in the vector 'listOfTopNs' below. Is it supposed to be a query that consumes a lot of memory, so similar to a "heavy query"?

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/38/be/src/scheduling/admission-controller.cc@327
PS38, Line 327: output_indented_string
nit: OutputIndentedString

http://gerrit.cloudera.org:8080/#/c/16220/38/common/thrift/StatestoreService.thrift
File common/thrift/StatestoreService.thrift:

http://gerrit.cloudera.org:8080/#/c/16220/38/common/thrift/StatestoreService.thrift@72
PS38, Line 72: num_running
What is the difference between this field and num_admitted_running?

-- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 38 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 26 Aug 2020 00:30:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 38: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 38 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 23:39:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 38: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6941/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 38 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 22:59:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 37:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16220/37/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/37/be/src/scheduling/admission-controller.cc@911
PS37, Line 911: if ( not_admitted_details )
: *not_admitted_details = "";
nit: I don't think this is necessary, since a string object is default-initialized to an empty string. In any case, if you want to initialize it explicitly, it would be better to do so in QueueNode's initializer list.

http://gerrit.cloudera.org:8080/#/c/16220/36/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

http://gerrit.cloudera.org:8080/#/c/16220/36/common/thrift/generate_error_codes.py@337
PS36, Line 337: Details:
> Addressed in AdmissionController::CanAdmitRequest() by init the details str
What I meant was that if I see an error message like:
"Admission for query exceeded timeout 6ms in pool root.poolA. Queued reason: Queue non-empty Details: "
the details are empty. So I was proposing to drop "Details" when there are none and print only the error message:
"Admission for query exceeded timeout 6ms in pool root.poolA. Queued reason: Queue non-empty"

-- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 37 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 21:45:34 + Gerrit-HasComments: Yes
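The behaviour requested in this comment, printing "Details:" only when there is something to print, can be sketched as follows. The function `timeout_message` and its parameters are hypothetical and are not the template in generate_error_codes.py; only the message wording is taken from the example in the review.

```python
def timeout_message(timeout_ms: int, pool: str, queued_reason: str,
                    details: str = "") -> str:
    """Build the admission timeout message (hypothetical helper).

    The " Details: ..." suffix is appended only when details are present.
    """
    msg = (f"Admission for query exceeded timeout {timeout_ms}ms in pool {pool}. "
           f"Queued reason: {queued_reason}")
    details = details.strip()
    if details:
        msg += f" Details: {details}"
    return msg
```

Calling `timeout_message(6, "root.poolA", "Queue non-empty")` yields the second message quoted in the review, with no trailing "Details:" label.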
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 37: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6940/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 37 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 21:30:04 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#37). ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in the admission controller by appending the last known memory consumption statistics about a pool to the existing memory exhaustion message. The statistics are logged in impalad.INFO when a query is queued or timed out due to memory pressure in the pool or on the host. The statistics can also be part of the query profile.

The BNF of the new memory consumption statistics is as follows.

topN_query_stats ::=
  queries: a list of query Ids and memory consumed for up to 5 queries with top memory consumptions
  total_consumed: total memory consumed by these topN queries
  fraction_of_pool_total_mem: total memory consumed divided by pool memory usage (if feasible to report)

all_query_stats ::=
  num_running: the total number of queries running
  min: the minimal memory consumption of all running queries
  max: the maximal memory consumption of all running queries
  pool_total_mem: the total memory consumption of all running queries
  average: the average memory consumption of all running queries (if feasible to report)

pool_stats ::= ":"
stats_on_host ::= "Stats for host " List of
aggregated_pool_stats ::= "Aggregated stats for pool "
memory_consumption_statistics ::= |

The stats_on_host describes memory consumption for every pool on a host and is useful in analyzing memory exhaustion on that host. The aggregated_pool_stats describes the aggregated memory consumption on all hosts for a pool for a set of queries and is useful in analyzing memory exhaustion in that pool.

Example of stats_on_host for pools root.queueB and root.queueC on host host1:25000:

Stats for host host1:25000
  pool_name=root.queueB:
    topN_query_stats: queries=[ id=0001:0004, consumed=20.00 MB, id=0001:0003, consumed=19.00 MB, id=0001:0002, consumed=8.00 MB ], total_consumed=47.00 MB fraction_of_pool_total_mem=0.47
    all_query_stats: num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB
  pool_name=root.queueC:
    topN_query_stats: queries=[ id=0002:, consumed=18.00 MB, id=0002:0001, consumed=12.00 MB ], total_consumed=30.00 MB fraction_of_pool_total_mem=0.06
    all_query_stats: num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB

Example of aggregated_pool_stats over all hosts for pool root.queueC:

Aggregated stats for pool root.queueC:
  topN_query_stats: queries=[ id=0002:0001, consumed=32.00 MB, id=0002:0004, consumed=26.00 MB, id=0002:, consumed=21.00 MB, id=0002:0002, consumed=17.00 MB, id=0002:000e, consumed=9.00 MB ], total_consumed=105.00 MB fraction_of_pool_total_mem=0.82

When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly.
2. Added two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure.
3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy.
4. Ran Core tests successfully.
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 916 insertions(+), 47
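For readers who want to see how a topN_query_stats line of the shape shown in this commit message could be produced, here is a minimal sketch. It only mirrors the example text; `format_mb` and `format_topn` are hypothetical names, and the actual formatting code in admission-controller.cc may differ.

```python
from typing import List, Optional, Tuple

MB = 1024 * 1024


def format_mb(num_bytes: int) -> str:
    """Render a byte count as megabytes with two decimals, e.g. '20.00 MB'."""
    return f"{num_bytes / MB:.2f} MB"


def format_topn(queries: List[Tuple[str, int]],
                pool_total_bytes: Optional[int] = None) -> str:
    """Format (query id, bytes) pairs, already sorted largest first.

    The fraction is emitted only when the pool total is known and non-zero,
    matching the "if feasible to report" wording in the commit message.
    """
    total = sum(consumed for _, consumed in queries)
    items = ", ".join(f"id={qid}, consumed={format_mb(consumed)}"
                      for qid, consumed in queries)
    out = f"topN_query_stats: queries=[ {items} ], total_consumed={format_mb(total)}"
    if pool_total_bytes:
        out += f" fraction_of_pool_total_mem={total / pool_total_bytes:.2f}"
    return out
```

With the three root.queueB queries from the example and a pool total of 100 MB, this reproduces the first topN_query_stats line of the stats_on_host sample.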
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 36:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc@461
PS34, Line 461: heavMemoryQuery.__set_queryId(tracker->query_id_);
> nit: can you print the mem-tracker->label here so thats its easy to debug i
Can you address this too?

http://gerrit.cloudera.org:8080/#/c/16220/36/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/36/be/src/scheduling/admission-controller.h@640
PS36, Line 640: friend class MemTracker;
Why do we need to add this as a friend class?

http://gerrit.cloudera.org:8080/#/c/16220/36/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/36/be/src/scheduling/admission-controller.cc@1626
PS36, Line 1626: << " Details:" << queue_node->not_admitted_details;
See the comment in generate_error_codes.py.

http://gerrit.cloudera.org:8080/#/c/16220/36/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

http://gerrit.cloudera.org:8080/#/c/16220/36/common/thrift/generate_error_codes.py@337
PS36, Line 337: Details:
nit: details might not exist, so it can get confusing if this is left empty. You can probably just add $3 and append an empty string if it doesn't exist.
-- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 36 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 18:48:51 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 36: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6928/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 36 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 14 Aug 2020 14:26:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#36). ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool to the existing memory exhaustion message. The statistics is logged in impalad.INFO when a query is queued or timed out due to memory pressure in the pool or on the host. The statistics can also be part of the query profile.

The BNF of the new memory consumption statistics is as follows.

topN_query_stats ::=
  queries: a list of query Ids and memory consumed for up to 5 queries with top memory consumptions
  total_consumed: total memory consumed by these topN queries
  fraction_of_pool_total_mem: total memory consumed divided by pool memory usage (if feasible to report)

all_query_stats ::=
  num_running: the total number of queries running
  min: the minimal memory consumption of all running queries
  max: the maximal memory consumption of all running queries
  pool_total_mem: the total memory consumption of all running queries
  average: the average memory consumption of all running queries (if feasible to report)

pool_stats ::= ":"
stats_on_host ::= "Stats for host " List of
aggregated_pool_stats ::= "Aggregated stats for pool "
memory_consumption_statistics ::= |

The stats_on_host describes memory consumption for every pool on a host and is useful in analyzing memory exhaustion on that host. The aggregated_pool_stats describes the aggregated memory consumption on all hosts for a pool for a set of queries and is useful in analyzing memory exhaustion in that pool.

Example of stats_on_host for pool root.queueB and root.queueC on host host1:25000.

Stats for host host1:25000
  pool_name=root.queueB:
    topN_query_stats:
      queries=[
        id=0001:0004, consumed=20.00 MB,
        id=0001:0003, consumed=19.00 MB,
        id=0001:0002, consumed=8.00 MB
      ],
      total_consumed=47.00 MB
      fraction_of_pool_total_mem=0.47
    all_query_stats:
      num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB
  pool_name=root.queueC:
    topN_query_stats:
      queries=[
        id=0002:, consumed=18.00 MB,
        id=0002:0001, consumed=12.00 MB
      ],
      total_consumed=30.00 MB
      fraction_of_pool_total_mem=0.06
    all_query_stats:
      num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB

Example of aggregated_pool_stats over all hosts for pool root.queueC:

Aggregated stats for pool root.queueC:
  topN_query_stats:
    queries=[
      id=0002:0001, consumed=32.00 MB,
      id=0002:0004, consumed=26.00 MB,
      id=0002:, consumed=21.00 MB,
      id=0002:0002, consumed=17.00 MB,
      id=0002:000e, consumed=9.00 MB
    ],
    total_consumed=105.00 MB
    fraction_of_pool_total_mem=0.82

When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly.
2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure.
3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy.
4. Ran Core tests successfully.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 914 insertions(+), 47
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 35: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6921/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 35 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 13 Aug 2020 21:26:17 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#35). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool to the existing memory exhaustion message. The statistics is logged in impalad.INFO when a query is queued or timed out due to memory pressure in the pool or on the host. The statistics can also be part of the query profile. The BNF of the new memory consumption statistics is as follows. topN_query_stats ::= queries: a list of query Ids and memory consumed for up to 5 queries with top memory consumptions total_consumed: total memory consumed by these topN queries fraction_of_pool_total_mem: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= num_running: the total number of queries running min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats ::= ':' pool_stats_on_host ::= List of aggregated_pool_stats ::= ':' memory_consumption_statistics ::= | The pool_stats_on_host describes memory consumption in all pools on a host and is useful in analyzing memory exhaustion on that host. The aggregated_pool_stats describes the aggregated memory consumption on all hosts for a pool for a set of queries and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_on_host for pool root.queueB and root.queueC on some host: Stats for host host:22000 pool_name=root.queueB: topN_query_stats: queries=[ id=0001:0004, consumed=20.00 MB, id=0001:0003, consumed=19.00 MB, id=0001:0002, consumed=8.00 MB ], total_consumed=47.00 MB fraction_of_pool_total_mem=0.47 all_query_stats: num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB pool_name=root.queueC: topN_query_stats: queries=[ id=0002:, consumed=18.00 MB, id=0002:0001, consumed=12.00 MB ], total_consumed=30.00 MB fraction_of_pool_total_mem=0.06 all_query_stats: num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB Example of aggregated_pool_stats over all hosts for pool root.queueC: Aggregated stats for pool root.queueC: topN_query_stats: queries=[ id=00020002:0001, consumed=20.00 MB, id=00020002:0004, consumed=18.00 MB, id=00010002:, consumed=18.00 MB, id=00010002:0001, consumed=12.00 MB, id=00020002:0002, consumed=9.00 MB ], total_consumed=77.00 MB fraction_of_pool_total_mem=0.6 When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy. 4. Ran Core tests successfully. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M common/thrift/generate_error_codes.py M tests/custom_cluster/test_admission_controller.py 10 files changed, 914 insertions(+), 47 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 34: Code-Review+1

(8 comments)

Looks good, just a few more nits.

http://gerrit.cloudera.org:8080/#/c/16220/34//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16220/34//COMMIT_MSG@115
PS34, Line 115: Core tests
nit: Ran core tests successfully.

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker-test.cc
File be/src/runtime/mem-tracker-test.cc:

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker-test.cc@389
PS34, Line 389: 100
nit: can use NUM_QUERY_MEM_TRACKERS here and on the line above.

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc@449
PS34, Line 449: // Collect the top N queries into a priority queue, and computes the min, the max,
              : // the total memory consumption, and the total number of all queries that are
              : // memory tracked by query memory trackers. This method should only be called for
              : // mem-trackers that are either query mem trackers or higher in the mem tracker
              : // hierarchy.
nit: the comment in the header file is sufficient, since this one just echoes the same thing.

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc@461
PS34, Line 461: DCHECK(tracker->is_query_mem_tracker_);
nit: can you print the mem-tracker->label here so that it's easy to debug in case it is hit?

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/runtime/mem-tracker.cc@476
PS34, Line 476: // Append to ss a debug string for memory consumption part of the pool stats.
              : // Here is one example.
              : // topN_query_stats: queries=[554b016cf0f3a37f:9a1bfcfd,
              : // 464dcd9cc47d724b:9e6a3f64, 2844275a1458bf1f:0bc58875,
              : // a449dbc7bcbd2af1:647e6ded, 8c430ea5ad38e94a:3c27bf44],
              : // total_mem_consumed=1.26 MB, fraction_of_pool_total_mem=0.61; pool_level_stats:
              : // num_running=10, min=0, max=257.48 KB, pool_total_mem=2.06 MB, average_per_query=210.74
              : // KB
              : void MemTracker::AppendStatsForConsumedMemory(stringstream& ss, const TPoolStats& stats) {
              :   ss << "topN_query_stats: ";
              :   ss << "queries=[";
              :   int num_ids = stats.heavy_memory_queries.size();
              :   int64_t total_memory_consumed_by_top_queries = 0;
              :   for (int i = 0; i < num_ids; i++) {
              :     auto& query = stats.heavy_memory_queries[i];
              :     total_memory_consumed_by_top_queries += query.memory_consumed;
              :     ss << PrintId(query.queryId);
              :     if (i < num_ids - 1) ss << ", ";
              :   }
              :   ss << "], ";
              :   ss << "total_mem_consumed="
              :      << PrettyPrinter::PrintBytes(total_memory_consumed_by_top_queries);
              :   int64_t total_memory_consumed = stats.total_memory_consumed;
              :   if (total_memory_consumed > 0) {
              :     ss << ", fraction_of_pool_total_mem=" << setprecision(2)
              :        << float(total_memory_consumed_by_top_queries) / total_memory_consumed;
              :   }
              :   ss << "; ";
              :
              :   ss << "pool_level_stats: ";
              :   ss << "num_running=" << stats.num_running << ", ";
              :   ss << "min=" << PrettyPrinter::PrintBytes(stats.min_memory_consumed) << ", ";
              :   ss << "max=" << PrettyPrinter::PrintBytes(stats.max_memory_consumed) << ", ";
              :   ss << "pool_total_mem=" << PrettyPrinter::PrintBytes(total_memory_consumed);
              :   if (stats.num_running > 0) {
              :     ss << ", average_per_query="
              :        << PrettyPrinter::PrintBytes(total_memory_consumed / stats.num_running);
              :   }
              : }
nit: this functionality is very admission-control focused and only uses admission-control structures. It would be great if you could move it back there.

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/scheduling/admission-controller.h@1053
PS34, Line 1053: getName
nit: can you mention that this can be both the pool name and the host?

http://gerrit.cloudera.org:8080/#/c/16220/34/be/src/scheduling/admission-controller.h@1054
PS34, Line 1054: getTUniqueId
nit: mention here that this is the query id.
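
For readers following the review, the technique named in the quoted comment at mem-tracker.cc@449 (collect the top N queries into a priority queue while computing min, max, total and count) can be sketched as below. This is a minimal sketch under stated assumptions, not the patch's implementation: the QueryMem and MemSummary types and the SummarizeTopN function are hypothetical names that stand in for Impala's MemTracker and TPoolStats.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not Impala's MemTracker/TPoolStats.
struct QueryMem {
  std::string query_id;
  int64_t consumed;  // bytes
};

struct MemSummary {
  std::vector<QueryMem> top;  // heaviest first, at most n entries
  int64_t min = 0;
  int64_t max = 0;
  int64_t total = 0;
  int64_t count = 0;
};

MemSummary SummarizeTopN(const std::vector<QueryMem>& queries, size_t n) {
  assert(n > 0);  // this sketch assumes at least one top entry is requested
  // Min-heap on consumption: the root is the lightest of the current top n, so a
  // heavier query replaces it in O(log n).
  auto heavier = [](const QueryMem& a, const QueryMem& b) {
    return a.consumed > b.consumed;
  };
  std::priority_queue<QueryMem, std::vector<QueryMem>, decltype(heavier)> heap(heavier);
  MemSummary s;
  for (const QueryMem& q : queries) {
    s.min = (s.count == 0) ? q.consumed : std::min(s.min, q.consumed);
    s.max = (s.count == 0) ? q.consumed : std::max(s.max, q.consumed);
    s.total += q.consumed;
    ++s.count;
    if (heap.size() < n) {
      heap.push(q);
    } else if (q.consumed > heap.top().consumed) {
      heap.pop();
      heap.push(q);
    }
  }
  // The heap pops lightest first; reverse to report the heaviest query first.
  while (!heap.empty()) {
    s.top.push_back(heap.top());
    heap.pop();
  }
  std::reverse(s.top.begin(), s.top.end());
  return s;
}
```

Keeping the heap bounded at n entries means the pass over all queries costs O(m log n) for m queries, and the summary fields needed for all_query_stats fall out of the same loop.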
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 34: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6884/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 34 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 03:14:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 33: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6883/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 33 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 03:01:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 34: > Uploaded patch set 28. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 34 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 02:57:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 34: > Uploaded patch set 29. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 34 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 02:56:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 34: > Uploaded patch set 28. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 34 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 02:56:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#34). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool to the existing memory exhaustion message. The statistics is logged in impalad.INFO when a query is queued or timed out due to memory pressure in the pool or on the host. The statistics can also be part of the query profile. The BNF of the new memory consumption statistics is as follows. topN_query_stats ::= queries: a list of query Ids and memory consumed for up to 5 queries with top memory consumptions total_consumed: total memory consumed by these topN queries fraction_of_pool_total_mem: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= num_running: the total number of queries running min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats ::= ':' pool_stats_on_host ::= List of aggregated_pool_stats ::= ':' memory_consumption_statistics ::= | The pool_stats_on_host describes memory consumption in all pools on a host and is useful in analyzing memory exhaustion on that host. The aggregated_pool_stats describes the aggregated memory consumption on all hosts for a pool for a set of queries and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_on_host for pool root.queueB and root.queueC on some host: pool_name=root.queueB: topN_query_stats: queries=[ id=0001:0004, consumed=20.00 MB, id=0001:0003, consumed=19.00 MB, id=0001:0002, consumed=8.00 MB ], total_consumed=47.00 MB fraction_of_pool_total_mem=0.47 all_query_stats: num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB pool_name=root.queueC: topN_query_stats: queries=[ id=0002:, consumed=18.00 MB, id=0002:0001, consumed=12.00 MB ], total_consumed=30.00 MB fraction_of_pool_total_mem=0.06 all_query_stats: num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB Example of aggregated_pool_stats over all hosts for pool root.queueC: Aggregated stats for pool_name=root.queueC: topN_query_stats: queries=[ id=00020002:0001, consumed=20.00 MB, id=00020002:0004, consumed=18.00 MB, id=00010002:, consumed=18.00 MB, id=00010002:0001, consumed=12.00 MB, id=00020002:0002, consumed=9.00 MB ], total_consumed=77.00 MB fraction_of_pool_total_mem=0.6 When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy. 4. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 9 files changed, 901 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/34 -- To view, visit http://gerrit.cloudera.org:8080/16220 To
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 33: (1 comment) http://gerrit.cloudera.org:8080/#/c/16220/33/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/33/be/src/scheduling/admission-controller.cc@471 PS33, Line 471: ReportTopNQueriesAtIndices(ss, listOfAggregatedItems, indices, indent, total_mem_consumed); line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 33 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 02:41:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 33: > Uploaded patch set 33. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 33 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 12 Aug 2020 02:40:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#33). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool to the existing memory exhaustion message. The statistics is logged in impalad.INFO when a query is queued or timed out due to memory pressure in the pool or on the host. The statistics can also be part of the query profile. The BNF of the new memory consumption statistics is as follows. topN_query_stats ::= queries: a list of query Ids and memory consumed for up to 5 queries with top memory consumptions total_consumed: total memory consumed by these topN queries fraction_of_pool_total_mem: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= num_running: the total number of queries running min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats ::= ':' pool_stats_on_host ::= List of aggregated_pool_stats ::= ':' memory_consumption_statistics ::= | The pool_stats_on_host describes memory consumption in all pools on a host and is useful in analyzing memory exhaustion on that host. The aggregated_pool_stats describes the aggregated memory consumption on all hosts for a pool for a set of queries and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_on_host for pool root.queueB and root.queueC on some host: pool_name=root.queueB: topN_query_stats: queries=[ id=0001:0004, consumed=20.00 MB, id=0001:0003, consumed=19.00 MB, id=0001:0002, consumed=8.00 MB ], total_consumed=47.00 MB fraction_of_pool_total_mem=0.47 all_query_stats: num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB pool_name=root.queueC: topN_query_stats: queries=[ id=0002:, consumed=18.00 MB, id=0002:0001, consumed=12.00 MB ], total_consumed=30.00 MB fraction_of_pool_total_mem=0.06 all_query_stats: num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB Example of aggregated_pool_stats over all hosts for pool root.queueC: Aggregated stats for pool_name=root.queueC: topN_query_stats: queries=[ id=00020002:0001, consumed=20.00 MB, id=00020002:0004, consumed=18.00 MB, id=00010002:, consumed=18.00 MB, id=00010002:0001, consumed=12.00 MB, id=00020002:0002, consumed=9.00 MB ], total_consumed=77.00 MB fraction_of_pool_total_mem=0.6 When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Added a new test TopN in mem-tracker-test.cc to verify that the topN query memory consumption details are computed correctly from a mem tracker hierarchy. 4. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker-test.cc M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 9 files changed, 901 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/33 -- To view, visit http://gerrit.cloudera.org:8080/16220 To
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 32: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6862/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 10 Aug 2020 21:15:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 32:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@428
PS27, Line 428: //id=00020002:0004, consumed=18.00 MB,
              : //id=00010002:, consumed=18.00 MB,
              : //id=00010002:0001, consumed=12.00 MB,
              : //id=00020002:0002, consumed=9.00 MB
              : // ],
              : // total_consumed=77.00 MB
              : // fraction_of_pool_total_mem=0.6
              : string AdmissionController::GetLogStringForTopNQueriesInPool(
              :     const std::string& pool_name) {
              :   // All stats in pool_stats are the starting point to collect top N queries.
              :   PoolStats* pool_stats = GetPoolStats(pool_name, true);
              :
              :   std::vector listOfTopNs;
              :
              :   // Collect for local stats
              :   const TPoolStats& local = pool_stats->local_stats();
              :   for (auto& query : local.heavy_memory_queries) {
              :     listOfTopNs.emplace_back(
              :         Item(query.memory_consumed, host_id_, query.queryId, nullptr));
              :   }
> Currently we report top mem usage per host. This is due to the following fa
An aggregated version over all hosts for a pool is in place.

-- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 10 Aug 2020 20:56:32 + Gerrit-HasComments: Yes
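
The reply above says an aggregated version over all hosts for a pool is in place. One way such an aggregation could work is sketched here, assuming (this is not confirmed by the thread) that a query's consumption is summed across hosts before the pool-wide top N is chosen. The HostQueryMem type and AggregateTopN function are hypothetical names, not identifiers from admission-controller.cc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-host report entry; not Impala's Item or TPoolStats.
struct HostQueryMem {
  std::string host;
  std::string query_id;
  int64_t consumed;  // bytes consumed by this query on this host
};

using QueryTotal = std::pair<std::string, int64_t>;  // (query id, bytes over all hosts)

// Assumption: per-host entries for the same query id are summed, then the n
// heaviest totals are returned, heaviest first.
std::vector<QueryTotal> AggregateTopN(
    const std::vector<HostQueryMem>& items, size_t n) {
  assert(n > 0);  // this sketch assumes at least one top entry is requested
  std::map<std::string, int64_t> per_query;
  for (const HostQueryMem& item : items) per_query[item.query_id] += item.consumed;

  std::vector<QueryTotal> totals(per_query.begin(), per_query.end());
  const size_t keep = std::min(n, totals.size());
  // Only the first `keep` positions need to be ordered.
  std::partial_sort(totals.begin(), totals.begin() + keep, totals.end(),
      [](const QueryTotal& a, const QueryTotal& b) { return a.second > b.second; });
  totals.resize(keep);
  return totals;
}
```

If the real aggregation keeps per-host entries separate instead of summing them, the map step would be dropped and the partial sort applied to the raw entries.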
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 32: (1 comment) http://gerrit.cloudera.org:8080/#/c/16220/32/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/32/be/src/scheduling/admission-controller.cc@510 PS32, Line 510: ReportTopNQueriesAtIndices(ss, listOfAggregatedItems, indices, indent, total_mem_consumed); line too long (93 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 32 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 10 Aug 2020 20:56:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#32). ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in admission controller by
appending the last known memory consumption statistics about a pool to
the existing memory exhaustion message. The statistics is logged in
impalad.INFO when a query is queued or timed out due to memory pressure
in the pool or on the host. The statistics can also be part of the query
profile.

The BNF of the new memory consumption statistics is as follows.

topN_query_stats ::=
  queries: a list of query Ids and memory consumed for up to 5 queries
    with top memory consumptions
  total_consumed: total memory consumed by these topN queries
  fraction_of_pool_total_mem: total memory consumed divided by pool
    memory usage (if feasible to report)

all_query_stats ::=
  num_running: the total number of queries running
  min: the minimal memory consumption of all running queries
  max: the maximal memory consumption of all running queries
  total: the total memory consumption of all running queries
  average: the average memory consumption of all running queries
    (if feasible to report)

pool_stats ::= ':'
pool_stats_on_host ::= List of
aggregated_pool_stats ::= ':'
memory_consumption_statistics ::= |

The pool_stats_on_host describes memory consumption in all pools on a
host and is useful in analyzing memory exhaustion on that host. The
aggregated_pool_stats describes the aggregated memory consumption on all
hosts for a pool for a set of queries and is useful in analyzing memory
exhaustion in that pool.

Example of pool_stats_on_host for pool root.queueB and root.queueC on
some host:

pool_name=root.queueB:
  topN_query_stats:
    queries=[
      id=0001:0004, consumed=20.00 MB,
      id=0001:0003, consumed=19.00 MB,
      id=0001:0002, consumed=8.00 MB
    ],
    total_consumed=47.00 MB
    fraction_of_pool_total_mem=0.47
  all_query_stats:
    num_running=4, min=5.00 MB, max=20.00 MB, pool_total_mem=100.00 MB, average=25.00 MB
pool_name=root.queueC:
  topN_query_stats:
    queries=[
      id=0002:, consumed=18.00 MB,
      id=0002:0001, consumed=12.00 MB
    ],
    total_consumed=30.00 MB
    fraction_of_pool_total_mem=0.06
  all_query_stats:
    num_running=40, min=10.00 MB, max=200.00 MB, pool_total_mem=500.00 MB, average=12.50 MB

Example of aggregated_pool_stats over all hosts for pool root.queueC:

Aggregated stats for pool_name=root.queueC:
  topN_query_stats:
    queries=[
      id=00020002:0001, consumed=20.00 MB,
      id=00020002:0004, consumed=18.00 MB,
      id=00010002:, consumed=18.00 MB,
      id=00010002:0001, consumed=12.00 MB,
      id=00020002:0002, consumed=9.00 MB
    ],
    total_consumed=77.00 MB
    fraction_of_pool_total_mem=0.6

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is logged when the logging is set at
level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set at
level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
   verify that the topN query memory consumption details are reported
   correctly.
2. Add two new tests in test_admission_controller.py to simulate
   queries being queued and then timed out due to pool or host memory
   pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 864 insertions(+), 45 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/32
--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
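The fields named in the patch set 32 commit message can be illustrated with a short sketch. This is not the code from the change: `PoolMemSummary` and `Summarize` are hypothetical names, the pool total is passed in as a separate input, and the average is taken as pool total divided by the number of running queries, which is only an inference from the arithmetic of the examples (100.00 MB / 4 = 25.00 MB, 500.00 MB / 40 = 12.50 MB).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical summary type; field names loosely follow the commit message.
struct PoolMemSummary {
  std::vector<int64_t> top_n;  // top-N per-query consumption, descending (bytes)
  int64_t total_consumed = 0;  // sum over top_n
  double fraction_of_pool_total_mem = 0.0;
  std::size_t num_running = 0;
  int64_t min_mem = 0;
  int64_t max_mem = 0;
  double average = 0.0;
};

// Summarizes per-query memory consumption for one pool.
// Assumption: pool_total_mem is supplied by the caller, and the average is
// pool_total_mem / num_running as the example values suggest.
PoolMemSummary Summarize(
    std::vector<int64_t> per_query_mem, int64_t pool_total_mem, std::size_t n) {
  assert(pool_total_mem >= 0);
  PoolMemSummary s;
  s.num_running = per_query_mem.size();
  if (per_query_mem.empty()) return s;

  std::sort(per_query_mem.begin(), per_query_mem.end(), std::greater<int64_t>());
  s.max_mem = per_query_mem.front();
  s.min_mem = per_query_mem.back();

  const std::size_t k = std::min(n, per_query_mem.size());
  s.top_n.assign(per_query_mem.begin(),
      per_query_mem.begin() + static_cast<std::ptrdiff_t>(k));
  s.total_consumed = std::accumulate(s.top_n.begin(), s.top_n.end(), int64_t{0});

  if (pool_total_mem > 0) {
    s.fraction_of_pool_total_mem =
        static_cast<double>(s.total_consumed) / static_cast<double>(pool_total_mem);
  }
  s.average =
      static_cast<double>(pool_total_mem) / static_cast<double>(s.num_running);
  return s;
}
```

Feeding in four queries of 20, 19, 8 and 5 MB with a pool total of 100 MB and N = 3 gives a top-N total of 47 MB and a fraction of 0.47, matching the root.queueB example above.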
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

Patch Set 27:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16220/27//COMMIT_MSG@77
PS27, Line 77: loggerd
nit: typo

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.h@448
PS27, Line 448: /// memory tracked by query memory trackers. The top element in the queue is the
nit: by all children query mem trackers.

http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/23/be/src/runtime/mem-tracker.cc@481
PS23, Line 481: MemTracker* MemTracker::GetRootMemTracker() {
              :   MemTracker* ancestor = this;
              :   while (ancestor && ancestor->parent()) {
              :     ancestor = ancestor->parent();
              :   }
              :   return ancestor;
              : }
Is this used anywhere?

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@422
PS27, Line 422: UpdatePoolStatsForQueries
Can you also add a test for this in mem-tracker-test?

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/runtime/mem-tracker.cc@458
PS27, Line 458: else {
Add a DCHECK(tracker->is_query_mem_tracker_) to make sure these stats are collected only for query mem trackers, since they don't make sense for trackers lower in the mem tracker hierarchy. Also mention in the method comment for UpdatePoolStatsForQueries() that it should only be called for mem trackers that are either query mem trackers or higher in the mem tracker hierarchy.

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc
File be/src/scheduling/admission-controller-test.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller-test.cc@972
PS27, Line 972: void hook_me(const char* hook)
              : {
              :   bool echo = false;
              :   FILE* fp = nullptr;
              :
              :   while (!(fp = fopen(hook, "r"))) {
              :     if ( !echo ) {
              :       std::cout << "gdb -pid " << getpid() << ", then touch " << hook << std::endl;
              :       echo = true;
              :     }
              :   }
              :   fclose(fp);
              : }
Remove?

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h
File be/src/scheduling/admission-controller.h:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@735
PS27, Line 735: contains
nit: contain

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.h@1055
PS27, Line 1055: /// A helper type to glue information together to compute the topN queries
               : /// out of topM queries.
               : typedef std::tuple Item;
               : const int64_t& getMemConsumed(const Item& item) const { return std::get<0>(item); }
               : const string& getName(const Item& item) const { return std::get<1>(item); }
               : const TUniqueId& getTUniqueId(const Item& item) const { return std::get<2>(item); }
               : const TPoolStats* getTPoolStats(const Item& item) const { return std::get<3>(item); }
I guess I am a bit biased toward structs since I am used to seeing them in this codebase, but this seems good too.

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@272
PS27, Line 272: DebugPoolStatsForConsumedMemory
nit: AppendStatsForConsumedMemory

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@303
PS27, Line 303: // Return a debug string for memory consumption part of the pool stats.
              : string AdmissionController::PoolStats::DebugPoolStatsForConsumedMemory(
              :     const TPoolStats& stats) const {
              :   stringstream ss;
              :   DebugPoolStatsForConsumedMemory(ss, stats);
              :   return ss.str();
              : }
Since DebugPoolStats is the only method using this, we can probably get rid of it and pass the ss object in DebugPoolStats directly; as written, this would generate the string twice (2 separate calls to ss.str()).

http://gerrit.cloudera.org:8080/#/c/16220/27/be/src/scheduling/admission-controller.cc@361
PS27, Line 361: DebugTopNQueriesForAllPoolsInHost
nit:
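The suggestion above about dropping the string-returning wrapper can be sketched as follows. This is a simplified illustration, not the code under review: `MemStats` is a hypothetical stand-in for the relevant part of `TPoolStats`, and the free functions below only mirror the shape of the reviewer's proposal, in which the helper appends to the caller's stream so that `str()` is called once.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for the memory-related part of the pool stats.
struct MemStats {
  int64_t min_mb;
  int64_t max_mb;
  int64_t total_mb;
};

// Appends the memory consumption part directly to the caller's stream,
// so no intermediate string is materialized.
void AppendStatsForConsumedMemory(std::stringstream& ss, const MemStats& stats) {
  assert(stats.min_mb <= stats.max_mb);
  ss << "min=" << stats.min_mb << " MB, max=" << stats.max_mb
     << " MB, total=" << stats.total_mb << " MB";
}

// Builds the whole debug string with a single call to ss.str().
std::string DebugPoolStats(const std::string& pool_name, const MemStats& stats) {
  std::stringstream ss;
  ss << "pool_name=" << pool_name << ": ";
  AppendStatsForConsumedMemory(ss, stats);
  return ss.str();
}
```

With a wrapper that returns a string, the inner helper would build one string and the outer method would copy it into a second stream before calling `str()` again; appending in place avoids that extra copy.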
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 27: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6811/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 27 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 17:46:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 26: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6810/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 26 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 17:44:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#27). ( http://gerrit.cloudera.org:8080/16220 )

Change subject: IMPALA-9989 Improve admission control pool stats logging
..

IMPALA-9989 Improve admission control pool stats logging

This work addresses the current limitation in admission controller by
appending the last known memory consumption statistics about a pool or a
host to the existing memory exhaustion message. The statistics is logged
in impalad.INFO when a query is queued or timed out due to memory
pressure on the pool or on the host. The statistics can also be part of
the query profile.

The BNF of the new memory consumption statistics is as follows.

topN_query_stats ::=
  queries: a list of query Ids for up to 5 queries with top memory
    consumptions
  total_mem_consumed: total memory consumed by these topN queries
  percentage_mem_consumed_per_pool: total memory consumed divided by
    pool memory usage (if feasible to report)

all_query_stats ::=
  min: the minimal memory consumption of all running queries
  max: the maximal memory consumption of all running queries
  total: the total memory consumption of all running queries
  average: the average memory consumption of all running queries
    (if feasible to report)

pool_stats_per_host ::= ':'
pool_stats ::= List of
host_stats_per_pool ::= ':'
host_stats ::= List of
memory_consumption_statistics ::= |

The pool_stats describes memory consumption in all pools in a host and
is useful in analyzing memory exhaustion in that host. The host_stats
describes the memory consumption in all hosts for a pool and is useful
in analyzing memory exhaustion in that pool.

Example of pool_stats_per_host:

pool_name=root.queueD:
  topN_query_stats:
    queries=[ 0003:0012, 0003:0011 ],
    total_mem_consumed=18.00 MB
    fraction_of_pool_total_mem=0.19
  all_query_stats:
    num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB

Example of host_stats_per_pool:

host_name=host2:25000:
  topN_query_stats:
    queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ],
    total_mem_consumed=55.00 MB

When a query request is queued due to memory exhaustion, the above
memory_consumption_statistics is loggerd when the logging is set at
level 2 or higher.

When a query request is timed out due to memory exhaustion, the above
memory_consumption_statistics is reported when the logging is set at
level 1 or higher.

Testing:
1. Added a new test TopNQueryCheck in admission-controller-test.cc to
   verify that the topN query memory consumption details are reported
   correctly.
2. Add two new tests in test_admission_controller.py to simulate
   queries being queued and then timed out due to pool or host memory
   pressure.
3. Core tests.

Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
---
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/admission-controller.cc
M be/src/scheduling/admission-controller.h
M be/src/util/container-util.h
M common/thrift/StatestoreService.thrift
M tests/custom_cluster/test_admission_controller.py
8 files changed, 885 insertions(+), 45 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/27
--
To view, visit http://gerrit.cloudera.org:8080/16220
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781
Gerrit-Change-Number: 16220
Gerrit-PatchSet: 27
Gerrit-Owner: Qifan Chen
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Qifan Chen
Gerrit-Reviewer: Sahil Takiar
Gerrit-Reviewer: Tim Armstrong
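The two logging thresholds described in the commit message (level 2 or higher when a query is queued, level 1 or higher when it times out) amount to a small decision rule. The sketch below expresses that rule on its own; `AdmissionEvent`, `MinVerbosityFor` and `ShouldLogMemStats` are hypothetical names, and the sketch does not show how the change itself hooks into the backend's logging.

```cpp
#include <cassert>

// Hypothetical event kinds for which the statistics may be emitted.
enum class AdmissionEvent { kQueued, kTimedOut };

// Minimum logging level at which the memory consumption statistics appear,
// following the commit message: 2 for queued, 1 for timed out.
int MinVerbosityFor(AdmissionEvent event) {
  return event == AdmissionEvent::kTimedOut ? 1 : 2;
}

// Returns true if the statistics should be emitted at the given level.
bool ShouldLogMemStats(AdmissionEvent event, int verbosity) {
  assert(verbosity >= 0);
  return verbosity >= MinVerbosityFor(event);
}
```

Under this rule a timeout is reported at a lower logging level than a queueing event, so at level 1 only the timeout statistics would be emitted.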
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#26). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The statistics is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. The statistics can also be part of the query profile. The BNF of the new memory consumption statistics is as follows. topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= ':' pool_stats ::= List of host_stats_per_pool ::= ':' host_stats ::= List of memory_consumption_statistics ::= | The pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. The host_stats describes the memory consumption in all hosts for a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is loggerd when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 8 files changed, 885 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/26 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 26 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 25: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6809/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 25 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 15:12:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 24: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6808/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 24 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 15:08:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 25: (2 comments) http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py@896 PS25, Line 896: " flake8: E122 continuation line missing indentation or outdented http://gerrit.cloudera.org:8080/#/c/16220/25/tests/custom_cluster/test_admission_controller.py@923 PS25, Line 923: " flake8: E122 continuation line missing indentation or outdented -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 25 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 14:42:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#25). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is loggerd when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 8 files changed, 885 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/25 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 25 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 24: (2 comments) http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py@896 PS24, Line 896: " flake8: E122 continuation line missing indentation or outdented http://gerrit.cloudera.org:8080/#/c/16220/24/tests/custom_cluster/test_admission_controller.py@923 PS24, Line 923: " flake8: E122 continuation line missing indentation or outdented -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 24 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 06 Aug 2020 14:38:28 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#24). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is loggerd when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Add two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 8 files changed, 885 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/24 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 24 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 23: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6798/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 21:42:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#23). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to verify that the topN query memory consumption details are reported correctly. 2. Added two new tests in test_admission_controller.py to simulate queries being queued and then timed out due to pool or host memory pressure. 3. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift M tests/custom_cluster/test_admission_controller.py 8 files changed, 876 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/23 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
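The topN_query_stats and all_query_stats fields described in the patch set 23 commit message above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the code in admission-controller.cc: the function names, the tuple types, and the input shape (a dict from query Id to bytes consumed) are all assumptions made for this example.

```python
import heapq
from typing import Dict, List, NamedTuple, Optional

MEGABYTE = 1024 * 1024
TOP_N = 5  # the commit message reports "up to 5 queries"


class TopNQueryStats(NamedTuple):
    # Hypothetical container mirroring the topN_query_stats fields.
    queries: List[str]
    total_mem_consumed: int
    fraction_of_pool_total_mem: Optional[float]


class AllQueryStats(NamedTuple):
    # Hypothetical container mirroring the all_query_stats fields.
    num_running: int
    min_mem: int
    max_mem: int
    total_mem_consumed: int
    average: float


def top_n_query_stats(mem_by_query: Dict[str, int],
                      pool_total_mem: Optional[int] = None,
                      n: int = TOP_N) -> TopNQueryStats:
    """Pick the n queries with the largest memory consumption."""
    top = heapq.nlargest(n, mem_by_query.items(), key=lambda kv: kv[1])
    total = sum(mem for _, mem in top)
    # The fraction is only reported when a pool total is available.
    fraction = total / pool_total_mem if pool_total_mem else None
    return TopNQueryStats([qid for qid, _ in top], total, fraction)


def all_query_stats(mem_by_query: Dict[str, int]) -> Optional[AllQueryStats]:
    """Min, max, total and average over every running query."""
    if not mem_by_query:
        return None
    values = list(mem_by_query.values())
    total = sum(values)
    return AllQueryStats(len(values), min(values), max(values),
                         total, total / len(values))
```

With two top queries consuming 18 MB in total out of a pool total of 95 MB, the fraction comes out as 18 / 95, which rounds to the 0.19 shown in the pool_stats_per_host example.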
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 23: (11 comments) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py File tests/custom_cluster/test_admission_controller.py: http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@896 PS23, Line 896: " flake8: E122 continuation line missing indentation or outdented http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@897 PS23, Line 897: l flake8: E501 line too long (93 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@905 PS23, Line 905: l flake8: E501 line too long (97 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@907 PS23, Line 907: l flake8: E501 line too long (106 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@909 PS23, Line 909: \ flake8: E501 line too long (108 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@919 PS23, Line 919: " flake8: E122 continuation line missing indentation or outdented http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@920 PS23, Line 920: l flake8: E501 line too long (93 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@928 PS23, Line 928: l flake8: E501 line too long (97 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@929 PS23, Line 929: N flake8: E501 line too long (108 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@930 PS23, Line 930: l flake8: 
E501 line too long (106 > 90 characters) http://gerrit.cloudera.org:8080/#/c/16220/23/tests/custom_cluster/test_admission_controller.py@932 PS23, Line 932: \ flake8: E501 line too long (108 > 90 characters) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 23 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 05 Aug 2020 21:13:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6786/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Aug 2020 13:34:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#22). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. 2. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 828 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/22 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 22 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
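The two views exercised by the TopNQueryCheck test in the patch set 22 commit message above (top queries across all pools on one host, and top queries across all hosts for one pool) might be modelled as below. This is a hedged sketch: the `topic` dict keyed by (host, pool), the function names, and the choice to sum a query's memory across hosts in the per-pool view are assumptions for illustration, not details taken from the patch.

```python
import heapq
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical stand-in for the per-host, per-pool stats:
# (host, pool) -> {query_id: bytes consumed on that host}.
Topic = Dict[Tuple[str, str], Dict[str, int]]


def top_queries_on_host(topic: Topic, host: str,
                        n: int = 5) -> List[Tuple[str, str, int]]:
    """Top-n (pool, query_id, mem) rows across every pool on one host."""
    rows = [(pool, qid, mem)
            for (h, pool), queries in topic.items() if h == host
            for qid, mem in queries.items()]
    return heapq.nlargest(n, rows, key=lambda row: row[2])


def top_queries_in_pool(topic: Topic, pool: str,
                        n: int = 5) -> List[Tuple[str, int]]:
    """Top-n (query_id, mem) for one pool, aggregated over all hosts."""
    totals: Dict[str, int] = defaultdict(int)
    for (_, p), queries in topic.items():
        if p == pool:
            for qid, mem in queries.items():
                # Assumption: a query's usage is summed across hosts.
                totals[qid] += mem
    return heapq.nlargest(n, totals.items(), key=lambda kv: kv[1])
```

The per-host view helps when a single host is out of memory; the per-pool view helps when the pool limit is the constraint.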
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 21: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6772/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 21 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 03 Aug 2020 18:27:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#21). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. 2. Core tests. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 827 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/21 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 21 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
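The logging rule stated in the commit message above (statistics appended at level 2 or higher when a query is queued, and at level 1 or higher when it times out) amounts to a simple threshold check. The sketch below only illustrates that rule: the event names, `MIN_VERBOSITY`, and both helper functions are hypothetical and do not correspond to identifiers in the Impala backend.

```python
# Hypothetical mapping from admission event to the minimum logging
# level at which the memory consumption statistics are appended.
MIN_VERBOSITY = {"queued": 2, "timed_out": 1}


def should_log_stats(event: str, verbosity: int) -> bool:
    """True when the configured level is high enough for this event."""
    return verbosity >= MIN_VERBOSITY[event]


def build_message(reason: str, stats: str, event: str, verbosity: int) -> str:
    """Append the statistics to the existing memory exhaustion message."""
    if should_log_stats(event, verbosity):
        return f"{reason} {stats}"
    return reason
```

Under this rule a timed-out query reports the statistics at a lower logging level than a query that is merely queued.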
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6757/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 19:57:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 19: (3 comments) http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1073 PS18, Line 1073: 5 * MEGABYTE /*min*/, 20 * MEGABYTE /*max*/, 100 * MEGABYTE /*total*/, > line too long (97 > 90) Done http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1076 PS18, Line 1076: MakePoolStats(5000, 10, 0, MakeHeavyMemoryQueryList(HOST_1, QUEUE_C, 2), > line too long (100 > 90) Done http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1079 PS18, Line 1079: AddStatsToTopic(, HOST_2, QUEUE_C, > line too long (103 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 19:31:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 809 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/19 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 19 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 18: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6756/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 18:59:15 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 18: (4 comments) http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1073 PS18, Line 1073: 5 * MEGABYTE /*min*/, 20 * MEGABYTE /*max*/, 100 * MEGABYTE /*total*/, 4 /*running*/)); line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1076 PS18, Line 1076: 10 * MEGABYTE /*min*/, 200 * MEGABYTE /*max*/, 500 * MEGABYTE /*total*/, 40 /*running*/)); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller-test.cc@1079 PS18, Line 1079: 10 * MEGABYTE /*min*/, 2000 * MEGABYTE /*max*/,1 * MEGABYTE /*total*/, 100 /*running*/)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/18/be/src/scheduling/admission-controller.cc@350 PS18, Line 350: string AdmissionController::DebugTopNQueriesForAllPoolsInHost(const std::string& host_id) { line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 18:31:48 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#18). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 805 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/18 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 18 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 17: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6755/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 17 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 18:06:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 17: (4 comments) http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller-test.cc@1073 PS17, Line 1073: 5 * MEGABYTE /*min*/, 20 * MEGABYTE /*max*/, 100 * MEGABYTE /*total*/, 4 /*running*/)); line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller-test.cc@1076 PS17, Line 1076: 10 * MEGABYTE /*min*/, 200 * MEGABYTE /*max*/, 500 * MEGABYTE /*total*/, 40 /*running*/)); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller-test.cc@1079 PS17, Line 1079: 10 * MEGABYTE /*min*/, 2000 * MEGABYTE /*max*/,1 * MEGABYTE /*total*/, 100 /*running*/)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/17/be/src/scheduling/admission-controller.cc@350 PS17, Line 350: string AdmissionController::DebugTopNQueriesForAllPoolsInHost(const std::string& host_id) { line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 17 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 17:39:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#17). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 805 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/17 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 17 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 16: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6754/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 16:52:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 16: (4 comments) http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller-test.cc File be/src/scheduling/admission-controller-test.cc: http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller-test.cc@1073 PS16, Line 1073: 5 * MEGABYTE /*min*/, 20 * MEGABYTE /*max*/, 100 * MEGABYTE /*total*/, 4 /*running*/)); line too long (97 > 90) http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller-test.cc@1076 PS16, Line 1076: 10 * MEGABYTE /*min*/, 200 * MEGABYTE /*max*/, 500 * MEGABYTE /*total*/, 40 /*running*/)); line too long (100 > 90) http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller-test.cc@1079 PS16, Line 1079: 10 * MEGABYTE /*min*/, 2000 * MEGABYTE /*max*/,1 * MEGABYTE /*total*/, 100 /*running*/)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/16/be/src/scheduling/admission-controller.cc@350 PS16, Line 350: string AdmissionController::DebugTopNQueriesForAllPoolsInHost(const std::string& host_id) { line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 31 Jul 2020 16:31:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#16). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in that host. host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in that pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB fraction_of_pool_total_mem=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is logged when the logging is set at level 2 or higher. When a query request is timed out due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in host 0; b. Top 5 queries among 4 pools in host 1; c. Top 5 queries among 3 hosts for a pool. Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 792 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/16 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 16 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 12: (23 comments) http://gerrit.cloudera.org:8080/#/c/16220/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16220/12//COMMIT_MSG@57 PS12, Line 57: percentage nit: fraction_of_pool_total_mem http://gerrit.cloudera.org:8080/#/c/16220/12//COMMIT_MSG@78 PS12, Line 78: reported logged http://gerrit.cloudera.org:8080/#/c/16220/12//COMMIT_MSG@81 PS12, Line 81: dequeued due to memory exhaustion did you mean to say that the query timed out in queue? http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/runtime/mem-tracker.h File be/src/runtime/mem-tracker.h: http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/runtime/mem-tracker.h@462 PS12, Line 462: ResetMemConsumedForAllMemTrackers can this be moved to the test class since its a test only functionality that a friend class can implement http://gerrit.cloudera.org:8080/#/c/16220/1/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/16220/1/be/src/runtime/mem-tracker.cc@411 PS1, Line 411: : // Update the memory consumption related fields in pool_stats. : void MemTracker::UpdatePoolStatsForMemoryConsumed( : int64_t mem_consumed, TPoolStats& pool_stats) { : if (pool_stats.min_memory_consumed > mem_consume > multi-line if statements need curly braces. nit: can you address this in your next patch http://gerrit.cloudera.org:8080/#/c/16220/1/be/src/runtime/mem-tracker.cc@568 PS1, Line 568: : if (bytes_freed_by_last_gc_metric_ != NULL) { : bytes_freed_by_last_gc_metric_->SetValue(pre_gc_consumption - curr_consumption); : } : return curr_consumption > max_consumption; : } : : // Recursively append info about this memory tracker and all its children to ss. 
: void MemTracker::GetAllMemTracker(std::stringstream& ss, int indent) { : ss << std::string(indent, ' ') << " MemTracker: label=" << label_; : : if (!pool_name_.empty()) { : ss << ", pool_name =" << pool_name_; : } : : if (is_query_mem_tracker_) { : ss << ", qid=" << PrintId(query_id_); : } : : ss << std::endl; : indent += 3; : for (MemTracker* child : child_trackers_) { : child->GetAllMemTracker(ss, indent); : } : } : : // Return a debug string for all memory trackers reachable from the root memory : / > These two new methods are for debugging purpose which I found very useful. Historically we have avoided adding methods only for debugging probably to avoid adding dead code. http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/runtime/mem-tracker.cc File be/src/runtime/mem-tracker.cc: http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/runtime/mem-tracker.cc@134 PS12, Line 134: if (consumption_->current_value() != 0) { : int x = 0; : x++; : } ?? http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/runtime/mem-tracker.cc@441 PS12, Line 441: query_ids nit: maybe rename this since this is no longer just the id http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.h@539 PS12, Line 539: typedef boost::unordered_map RemoteStatsMap; nit: can you add a comment about what the key is http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.h@619 PS12, Line 619: void DebugPoolStatsForConsumedMemory( : std::stringstream& ss, const TPoolStats& stats) const; : std::string DebugPoolStats(const TPoolStats& stats) const; nit: maybe add a short method comment http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.h@722 PS12, Line 722: /// The list of topN queries in pool/host when this request could not be admitted also add context as to under which queuing conditions 
this will be populated. http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.h@723 PS12, Line 723: topN_queries nit: I feel like calling this topN_queries makes it confusing since those can be top among pool or host depending on the not_Admitted_reason. We can probably make this more generic by calling it not_admitted_details and write the comment appropriately describing the context about when this is populated. This would also
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 12: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/6734/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 12 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Jul 2020 21:38:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/12/be/src/scheduling/admission-controller.cc@1207 PS12, Line 1207: queue_node.not_admitted_reason + " top memory consuming queries: " + queue_node.topN_queries); line too long (108 > 90) -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 12 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 29 Jul 2020 21:20:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats ::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats ::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host ::= : pool_stats::= List of host_stats_per_pool ::= : host_stats::= List of memory_consumption_statistics ::= | pool_stats describes memory consumption in all pools in a host and is useful in analyzing memory exhaustion in a host, while host_stats describes the memory consumption for all hosts in a pool and is useful in analyzing memory exhaustion in a pool. 
Example of pool_stats_per_host: pool_name=root.queueD: topN_query_stats: queries=[ 0003:0012, 0003:0011 ], total_mem_consumed=18.00 MB percentage_of_pool_total=0.19 all_query_stats: num_running=20, min=1.00 MB, max=9.00 MB, total_mem_consumed=95.00 MB, average=4.75 MB Example of host_stats_per_pool: host_name=host2:25000: topN_query_stats: queries=[ 00020002:0001, 00020002:0002, 00020002:, 00020002:0004 ], total_mem_consumed=55.00 MB When a query request is queued due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 2 or higher. When a query request is dequeued due to memory exhaustion, the above memory_consumption_statistics is reported when the logging is set at level 1 or higher. Testing: 1. Added a new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test identifies the following: a. Top 5 queries among 4 pools in local host; b. Top 5 queries among 3 hosts for a pool. Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 764 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/12 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 12 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong
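The logging rule stated in this patch set (statistics reported at level 2 or higher when a request is queued, and at level 1 or higher for the second case) amounts to a per-event verbosity threshold. A minimal sketch of that gating is shown below, assuming hypothetical names AdmissionEvent, MinVerbosityFor and ShouldLogMemStats; the patch itself presumably goes through the backend's existing logging macros, which are not reproduced here. The second event is named kTimedOut on the assumption, raised in review, that "dequeued" means the request timed out in the queue.

```cpp
#include <cassert>

// Hypothetical event kinds for the two cases named in the commit message.
enum class AdmissionEvent { kQueued, kTimedOut };

// Minimum logging level at which the memory statistics are emitted.
constexpr int MinVerbosityFor(AdmissionEvent e) {
  return e == AdmissionEvent::kQueued ? 2 : 1;
}

// True if the statistics should be appended at the given logging level.
bool ShouldLogMemStats(AdmissionEvent e, int verbosity) {
  assert(verbosity >= 0);  // assumption: logging levels are non-negative
  return verbosity >= MinVerbosityFor(e);
}
```

The lower threshold for the timed-out case means the statistics appear there at level 1, where the queued case would stay silent.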
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 9: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/6725/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jul 2020 22:33:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. Patch Set 9: (3 comments) http://gerrit.cloudera.org:8080/#/c/16220/9/be/src/scheduling/admission-controller.h File be/src/scheduling/admission-controller.h: http://gerrit.cloudera.org:8080/#/c/16220/9/be/src/scheduling/admission-controller.h@1034 PS9, Line 1034: // Append a new string to 'ss' describing queries running in a pool on a host. line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16220/9/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16220/9/be/src/scheduling/admission-controller.cc@814 PS9, Line 814: // Find info about the top-N queries with most memory consumption from all line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/16220/9/be/src/scheduling/admission-controller.cc@1592 PS9, Line 1592: // the total memory consumption, and the number of all queries running on this line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 28 Jul 2020 22:06:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9989 Improve admission control pool stats logging
Qifan Chen has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/16220 ) Change subject: IMPALA-9989 Improve admission control pool stats logging .. IMPALA-9989 Improve admission control pool stats logging This work addresses the current limitation in admission controller by appending the last known memory consumption statistics about a pool or a host to the existing host memory exhaustion message. The message is logged in impalad.INFO when a query is queued or timed out due to memory pressure on the pool or on the host. This new memory consumption statistics covers the following content: topN_query_stats::= queries: a list of query Ids for up to 5 queries with top memory consumptions total_mem_consumed: total memory consumed by these topN queries percentage_mem_consumed_per_pool: total memory consumed divided by pool memory usage (if feasible to report) all_query_stats::= min: the minimal memory consumption of all running queries max: the maximal memory consumption of all running queries total: the total memory consumption of all running queries average: the average memory consumption of all running queries (if feasible to report) pool_stats_per_host::= : pool_stats::= List of host_stats_per_pool::= : host_stats::= List of pool_stats will be logged when the memory demand exceeds the host memory limit. host_stats will be logged when the memory demand exceeds the pool memory limit. Testing: 1. Added one new test TopNQueryCheck in admission-controller-test.cc to simulate queries running in 4 pools in 3 hosts. This new test will test the following: a. Identifying top 5 queries among 4 pools in local host; b. Identifying top 5 queries among 3 hosts for a pool. 
Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 --- M be/src/runtime/mem-tracker.cc M be/src/runtime/mem-tracker.h M be/src/scheduling/admission-controller-test.cc M be/src/scheduling/admission-controller.cc M be/src/scheduling/admission-controller.h M be/src/util/container-util.h M common/thrift/StatestoreService.thrift 7 files changed, 751 insertions(+), 26 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16220/9 -- To view, visit http://gerrit.cloudera.org:8080/16220 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Id995a9d044082c3b8f044e1ec25bb4c64347f781 Gerrit-Change-Number: 16220 Gerrit-PatchSet: 9 Gerrit-Owner: Qifan Chen Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Qifan Chen Gerrit-Reviewer: Sahil Takiar Gerrit-Reviewer: Tim Armstrong