[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. IMPALA-11068: Add query option to reduce scanner thread launch. Under a heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded. IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option. This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per query granularity. If this query option is not set, is set to 0, or is set to a negative value, the backend will fall back to the value of the hdfs_scanner_thread_max_estimated_bytes flag. Testing: - Add test cases in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Reviewed-on: http://gerrit.cloudera.org:8080/18126 Reviewed-by: Csaba Ringhofer Tested-by: Impala Public Jenkins --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test 8 files changed, 89 insertions(+), 6 deletions(-) Approvals: Csaba Ringhofer: Looks good to me, approved Impala Public Jenkins: Verified -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 8 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
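The fallback rule described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function names EffectiveNonReservedBytes and MaxThreadsThatFit are hypothetical, the query option is modeled as a plain integer (0 when unset), and the thread-count rule is a deliberately simplified stand-in for what ScannerMemLimiter actually does.

```cpp
#include <cassert>
#include <cstdint>

// Picks the per-thread estimate of non-reserved memory growth.
// query_option_bytes models HDFS_SCANNER_NON_RESERVED_BYTES (0 when unset);
// flag_bytes models the hdfs_scanner_thread_max_estimated_bytes flag.
// Only a positive query option value takes precedence over the flag.
inline int64_t EffectiveNonReservedBytes(int64_t query_option_bytes,
                                         int64_t flag_bytes) {
  return query_option_bytes > 0 ? query_option_bytes : flag_bytes;
}

// Simplified, hypothetical admission rule (an assumption for illustration):
// how many scanner threads fit in the spare memory if each thread is charged
// its reservation plus the non-reserved estimate.
inline int64_t MaxThreadsThatFit(int64_t spare_bytes,
                                 int64_t reserved_per_thread,
                                 int64_t estimate_per_thread) {
  assert(reserved_per_thread >= 0 && estimate_per_thread > 0);
  if (spare_bytes <= 0) return 0;
  return spare_bytes / (reserved_per_thread + estimate_per_thread);
}
```

Under this simplified model, raising the estimate for a single decompression-heavy query lowers the number of threads that fit for that query only, while other queries keep using the flag value.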
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 7: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sun, 08 Oct 2023 23:38:52 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 7: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9796/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sun, 08 Oct 2023 19:01:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 7: Code-Review+2 (1 comment) http://gerrit.cloudera.org:8080/#/c/18126/1/be/src/runtime/scanner-mem-limiter.cc File be/src/runtime/scanner-mem-limiter.cc: http://gerrit.cloudera.org:8080/#/c/18126/1/be/src/runtime/scanner-mem-limiter.cc@90 PS1, Line 90: if (node == element.first) found_scan = scan.get(); > After some internal discussion, we come up with conclusion that query optio Thanks for adding the query option! -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 07 Oct 2023 14:16:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14151/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 07 Oct 2023 01:04:40 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 7: (1 comment) http://gerrit.cloudera.org:8080/#/c/18126/6/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/18126/6/be/src/service/query-options.cc@1146 PS6, Line 1146: query_options->__set_mem_limit_coordinators(mem_spec_val.value); > A break is missing from the end of the block here. Done -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 07 Oct 2023 00:35:38 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Michael Smith, Wenzhe Zhou, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#7). Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. IMPALA-11068: Add query option to reduce scanner thread launch. Under a heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded. IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option. This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per query granularity. If this query option is not set, is set to 0, or is set to a negative value, the backend will fall back to the value of the hdfs_scanner_thread_max_estimated_bytes flag. Testing: - Add test cases in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test 8 files changed, 89 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/7 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 7 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 6: (1 comment) Just one comment, otherwise +1. http://gerrit.cloudera.org:8080/#/c/18126/6/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/18126/6/be/src/service/query-options.cc@1146 PS6, Line 1146: query_options->__set_mem_limit_coordinators(mem_spec_val.value); A break is missing from the end of the block here. -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Sat, 07 Oct 2023 00:30:12 + Gerrit-HasComments: Yes
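The missing-break issue raised in this review comment can be illustrated with a small sketch. The Option enum, Options struct, and SetOption function below are hypothetical and heavily trimmed; they are not the code from query-options.cc, and which case follows the flagged block in the real file is an assumption.

```cpp
#include <cstdint>

// Hypothetical, trimmed-down option set; names are illustrative only.
enum class Option { MEM_LIMIT_COORDINATORS, NON_RESERVED_BYTES };

struct Options {
  int64_t mem_limit_coordinators = 0;
  int64_t non_reserved_bytes = 0;
};

// Each case block ends with break so that setting one option cannot fall
// through and also run the block of the next case.
inline void SetOption(Option opt, int64_t value, Options* out) {
  switch (opt) {
    case Option::MEM_LIMIT_COORDINATORS: {
      out->mem_limit_coordinators = value;
      break;  // Without this, control would continue into the next case.
    }
    case Option::NON_RESERVED_BYTES: {
      out->non_reserved_bytes = value;
      break;
    }
  }
}
```

If the first break were omitted, setting the first option would silently overwrite the second one as well, which is the kind of bug a new case appended after an unterminated block can introduce.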
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Wenzhe Zhou has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Reviewer: Wenzhe Zhou Gerrit-Comment-Date: Thu, 05 Oct 2023 01:10:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Oct 2023 22:32:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/18126/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18126/5//COMMIT_MSG@29 PS5, Line 29: query option such that tuning can be done at per query granularity. If > nit: granularity Done -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Oct 2023 22:31:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Michael Smith, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#6). Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. IMPALA-11068: Add query option to reduce scanner thread launch. Under a heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded. IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option. This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per query granularity. If this query option is not set, is set to 0, or is set to a negative value, the backend will fall back to the value of the hdfs_scanner_thread_max_estimated_bytes flag. Testing: - Add test cases in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test 8 files changed, 88 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/6 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 6 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 5: Code-Review+1 (1 comment) This seems like what IMPALA-7096 should have been originally. Reasonable enough. http://gerrit.cloudera.org:8080/#/c/18126/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18126/5//COMMIT_MSG@29 PS5, Line 29: query option such that tuning can be done at per query granulality. If nit: granularity -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Michael Smith Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Oct 2023 22:24:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14126/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Oct 2023 17:25:00 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 5: Patch set 4 is a rebase to resolve merge conflict. -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Mon, 02 Oct 2023 16:57:39 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#5). Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. IMPALA-11068: Add query option to reduce scanner thread launch. Under a heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded. IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option. This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per query granulality. If this query option is not set, is set to 0, or is set to a negative value, the backend will fall back to the value of the hdfs_scanner_thread_max_estimated_bytes flag. Testing: - Add test cases in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test 8 files changed, 88 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/5 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 5 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 4: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Wed, 27 Sep 2023 16:00:47 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/14001/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 14 Sep 2023 16:10:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 ) Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. Patch Set 4: (5 comments) http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc File be/src/exec/hdfs-scan-node.cc: http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222 PS3, Line 222: aking > Nit: taking Done http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222 PS3, Line 222: with > Nit: with Done http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222 PS3, Line 222: > Nit: precedence Done http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@223 PS3, Line 223: ery op > "is set to a positive value"? Done http://gerrit.cloudera.org:8080/#/c/18126/3/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/18126/3/common/thrift/ImpalaService.thrift@849 PS3, Line 849: is not > "is not set to a positive value"? Done -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto Gerrit-Comment-Date: Thu, 14 Sep 2023 15:43:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#4). Change subject: IMPALA-11068: Add query option to reduce scanner thread launch. .. IMPALA-11068: Add query option to reduce scanner thread launch. Under a heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded. IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option. This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per query granulality. If this query option is not set, is set to 0, or is set to a negative value, the backend will fall back to the value of the hdfs_scanner_thread_max_estimated_bytes flag. Testing: - Add test cases in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.
Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 --- M be/src/exec/hdfs-scan-node.cc M be/src/exec/hdfs-scan-node.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaService.thrift M common/thrift/Query.thrift M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test 8 files changed, 87 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/4 -- To view, visit http://gerrit.cloudera.org:8080/18126 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466 Gerrit-Change-Number: 18126 Gerrit-PatchSet: 4 Gerrit-Owner: Riza Suminto Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Daniel Becker Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 )

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222
PS3, Line 222: whith
Nit: with

http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222
PS3, Line 222: precedent
Nit: precedence

http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@222
PS3, Line 222: takes
Nit: taking

http://gerrit.cloudera.org:8080/#/c/18126/3/be/src/exec/hdfs-scan-node.cc@223
PS3, Line 223: is set
"is set to a positive value"?

http://gerrit.cloudera.org:8080/#/c/18126/3/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/18126/3/common/thrift/ImpalaService.thrift@849
PS3, Line 849: not set
"is not set to a positive value"?

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
Gerrit-Comment-Date: Thu, 14 Sep 2023 15:32:50 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 )

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13991/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
Gerrit-Comment-Date: Tue, 12 Sep 2023 18:08:22 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 )

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG@27
PS2, Line 27: n is i
> Nit: could be "is intended to offer"
Done

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG@30
PS2, Line 30: tion not s
> Mention that we also fall back to the flag if the query option is set no ze
Done

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc@220
PS2, Line 220: from
: // either of the HDFS_SCANNER_NON_RESERVED_BYTES query op
> Nit: it would be better like this:
Done. Rephrased.

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc@221
PS2, Line 221: ion or the
: // hdfs_scanner_thread_max_estimated_byte
> Nit: it would be better like this:
Done. Rephrased.

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift@847
PS2, Line 847: by the pla
> Nit: by the planner.
Done

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift@849
PS2, Line 849: al sca
> Nit: threads.
Done

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/Query.thrift@665
PS2, Line 665: 166: optional i64 hdfs_scanner_non_reserved_bytes = -1
> Shouldn't the default be unset, i.e. for example -1? That way the original
Agree. Done.

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
Gerrit-Comment-Date: Tue, 12 Sep 2023 17:42:48 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Daniel Becker, Joe McDonnell, Csaba Ringhofer, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#3).

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

IMPALA-11068: Add query option to reduce scanner thread launch.

Under heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until the memory limit is exceeded.

IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option.

This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This option is intended to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option so that tuning can be done at per-query granularity. If this query option is not set, or is set to 0 or a negative value, the backend falls back to the value of the hdfs_scanner_thread_max_estimated_bytes flag.

Testing:
- Add test case in query-options-test.cc and TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.

Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
8 files changed, 87 insertions(+), 6 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/3
--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Daniel Becker has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 )

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG@27
PS2, Line 27: intent
Nit: could be "is intended to offer"

http://gerrit.cloudera.org:8080/#/c/18126/2//COMMIT_MSG@30
PS2, Line 30: is not set
Mention that we also fall back to the flag if the query option is set to zero or a negative value.

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc@220
PS2, Line 220: from
: // either of hdfs_scanner_thread_max_estimated_bytes flag
Nit: it would be better like this:
"from either the hdfs_scanner_thread_max_estimated_bytes flag ..."

http://gerrit.cloudera.org:8080/#/c/18126/2/be/src/exec/hdfs-scan-node.cc@221
PS2, Line 221: or
: // hdfs_scanner_non_reserved_bytes option
Nit: it would be better like this:
"or the HDFS_SCANNER_NON_RESERVED_BYTES query option."
We could mention that the query option takes precedence over the flag if it is set.

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift@847
PS2, Line 847: by planner
Nit: by the planner.

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/ImpalaService.thrift@849
PS2, Line 849: thread
Nit: threads.

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/Query.thrift
File common/thrift/Query.thrift:

http://gerrit.cloudera.org:8080/#/c/18126/2/common/thrift/Query.thrift@665
PS2, Line 665: 166: optional i64 hdfs_scanner_non_reserved_bytes = 33554432 // 32MB
Shouldn't the default be unset, i.e. for example -1? That way the original behaviour would be conserved, i.e. the 'hdfs_scanner_thread_max_estimated_bytes' flag would be effective unless this option is set. If the default value stays as it is, we should also mention it in the commit message.

My reason for having this option unset by default is that a user may set the 'hdfs_scanner_thread_max_estimated_bytes' flag but it won't have any effect, and it could be difficult for the user to find out why.

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Daniel Becker
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
Gerrit-Comment-Date: Tue, 12 Sep 2023 09:03:48 +
Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18126 )

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/13974/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.

--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
Gerrit-Comment-Date: Mon, 11 Sep 2023 22:15:43 +
Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-11068: Add query option to reduce scanner thread launch.
Hello Kurt Deschler, Joe McDonnell, Csaba Ringhofer, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18126 to look at the new patch set (#2).

Change subject: IMPALA-11068: Add query option to reduce scanner thread launch.

IMPALA-11068: Add query option to reduce scanner thread launch.

Under heavy decompression workload, Impala running with scanner thread parallelism (MT_DOP=0) can still hit an OOM error due to launching too many threads too soon. We have logic in ScannerMemLimiter to limit the number of scanner threads by calculating the thread's memory requirement and estimating the memory growth rate of all threads. However, it does not prevent a scanner node from quickly launching many threads and immediately reaching the memtracker's spare capacity. Even after ScannerMemLimiter rejects a new thread launch, some existing threads might continue increasing their non-reserved memory for decompression work until finally the memory limit is exceeded.

IMPALA-7096 added the hdfs_scanner_thread_max_estimated_bytes flag as a heuristic to account for non-reserved memory growth. Increasing this flag value can help reduce thread count, but might severely regress other queries that do not have heavy decompression characteristics. The same applies to lowering the NUM_SCANNER_THREADS query option.

This patch adds one more query option, HDFS_SCANNER_NON_RESERVED_BYTES, as an alternative way to mitigate OOM. This flag intent to offer the same control as hdfs_scanner_thread_max_estimated_bytes, but as a query option such that tuning can be done at per-query granularity. If this query option is not set, revert to use the value of hdfs_scanner_thread_max_estimated_bytes flag.

Testing:
- Add test case in TestScanMemLimit::test_hdfs_scanner_thread_mem_scaling.

Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
---
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M testdata/workloads/functional-query/queries/QueryTest/hdfs-scanner-thread-mem-scaling.test
8 files changed, 86 insertions(+), 6 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/26/18126/2
--
To view, visit http://gerrit.cloudera.org:8080/18126
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I03cadf1230eed00d69f2890c82476c6861e37466
Gerrit-Change-Number: 18126
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Csaba Ringhofer
Gerrit-Reviewer: Joe McDonnell
Gerrit-Reviewer: Kurt Deschler
Gerrit-Reviewer: Riza Suminto
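The commit message says that raising the per-thread estimate can reduce the thread count. A toy model of that effect is sketched below, assuming a thread is admitted only while the summed estimates still fit in the spare memory capacity. This is a deliberate simplification and not the actual ScannerMemLimiter logic; MaxThreadsThatFit, kMiB, and all numbers are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr int64_t kMiB = 1024LL * 1024;

// Toy model (assumption): number of scanner threads that fit when each
// thread is assumed to need `est_bytes_per_thread` of non-reserved memory
// out of `spare_bytes` of spare capacity. Non-positive inputs yield 0.
constexpr int64_t MaxThreadsThatFit(int64_t spare_bytes,
                                    int64_t est_bytes_per_thread) {
  return (spare_bytes <= 0 || est_bytes_per_thread <= 0)
             ? 0
             : spare_bytes / est_bytes_per_thread;
}

// Usage example: quadrupling the estimate cuts the admitted threads to a
// quarter for the same spare capacity.
inline void ExampleThreadCount() {
  const int64_t spare = 1024 * kMiB;  // 1 GiB of spare capacity
  assert(MaxThreadsThatFit(spare, 32 * kMiB) == 32);
  assert(MaxThreadsThatFit(spare, 128 * kMiB) == 8);
}
```

Under this model, a larger estimate set for a single decompression-heavy query throttles only that query's scanner threads, which is the motivation the commit message gives for a query option rather than a cluster-wide flag change.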