[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a few unit testis in PlannerTest that (a) set the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast, (b) pass a broadcast hint to override the config setting, (c) verify the standard case where broadcast threshold is larger than the build input size. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Reviewed-on: http://gerrit.cloudera.org:8080/14690 Reviewed-by: Impala Public Jenkins Tested-by: Impala Public Jenkins --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-large.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 10 files changed, 131 insertions(+), 5 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 10 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 9: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 26 Nov 2019 06:47:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5295/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 26 Nov 2019 02:08:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 9 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 26 Nov 2019 02:08:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 8: Hit IMPALA-9152 and IMPALA-9165. I might wait until Joe's fix for IMPALA-9165 is merged before retrying. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 21 Nov 2019 19:15:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 8: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/5261/ -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 21 Nov 2019 07:13:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 8: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5261/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 21:13:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 8: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 8 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 21:13:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 7: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 21:12:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5088/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 21:02:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 7: (2 comments) Addressed comments. Pls take another look. Thanks ! http://gerrit.cloudera.org:8080/#/c/14690/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/14690/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1017 PS6, Line 1017: options.setBroadcast_bytes_limit(100); > Optional: you could probably put this test in the same file as the previous Made this change. http://gerrit.cloudera.org:8080/#/c/14690/6/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test: http://gerrit.cloudera.org:8080/#/c/14690/6/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test@3 PS6, Line 3: > nit: trailing whitespace. The bot ignores .test files, but would be best to I have removed this file now since it is combined with broadcast-bytes-limit.test -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 20:16:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#7). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a few unit testis in PlannerTest that (a) set the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast, (b) pass a broadcast hint to override the config setting, (c) verify the standard case where broadcast threshold is larger than the build input size. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-large.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 10 files changed, 131 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/7 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 7 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 6: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5085/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 17:27:55 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5084/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 17:23:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 6: (2 comments) I'm ready to +2 except for a couple of minor things http://gerrit.cloudera.org:8080/#/c/14690/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java File fe/src/test/java/org/apache/impala/planner/PlannerTest.java: http://gerrit.cloudera.org:8080/#/c/14690/6/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1017 PS6, Line 1017: runPlannerTestFile("broadcast-bytes-limit-hint", "tpch_parquet", options); Optional: you could probably put this test in the same file as the previous one, since it uses all the same settings. http://gerrit.cloudera.org:8080/#/c/14690/6/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test: http://gerrit.cloudera.org:8080/#/c/14690/6/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test@3 PS6, Line 3: # greater than broadcast_bytes_limit because the query has a nit: trailing whitespace. The bot ignores .test files, but would be best to avoid. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 16:56:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 6: (2 comments) Thanks for the code review. I have addressed the review comments. For the unit tests, I ended up creating separate test files and changing the option through the Java code. I see some tests are setting QUERYOPTIONS within a single test file..not sure if there's a preference. http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG@30 PS4, Line 30: - Added a few unit testis in PlannerTest that (a) set the > Can you also add the option to query-options-test just to test the validati Done http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Ah, sure will add that. Done -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 16:56:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#6). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a few unit testis in PlannerTest that (a) set the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast, (b) pass a broadcast hint to override the config setting, (c) verify the standard case where broadcast threshold is larger than the build input size. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-large.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 11 files changed, 135 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/6 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 6 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Tim Armstrong, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#5). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a few unit testis in PlannerTest that (a) set the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast, (b) pass a broadcast hint to override the config setting, (c) verify the standard case where broadcast threshold is larger than the build input size. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-hint.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit-large.test A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 11 files changed, 136 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/5 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 5: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/5/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14690/5/common/thrift/ImpalaService.thrift@488 PS5, Line 488: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 5 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 16:40:07 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Oh I meant something simpler - same option values, but a plan that has an e Ah, sure will add that. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 05:40:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Good suggestion. Just want to clarify..the first negative test case can be Oh I meant something simpler - same option values, but a plan that has an estimate of < broadcast_bytes_limit so we're testing both sides of the boundary. We kinda get that coverage from all the other planner tests but it'd be nice to have it explicitly tested here. It should have read "Can you also add a negative test where it is below the threshold and stays as a broadcast join." Sorry about that. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 05:20:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: > Can you also add a negative test where it is below the threshold and doesn' Good suggestion. Just want to clarify..the first negative test case can be something like setting the 'mem_limit' to a small number but leave the broadcast_bytes_limit as default (or sufficiently high). In this case, the planner should not pick broadcast. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 20 Nov 2019 02:16:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (3 comments) Code looks good, just had a few asks as far as additional tests. http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14690/4//COMMIT_MSG@30 PS4, Line 30: - Added a new unit test in PlannerTest that sets the Can you also add the option to query-options-test just to test the validations. I think you can just add it to the table of byte options in SetByteOptions() since its behaviour is consistent with the other options. http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc@914 PS4, Line 914: ParseMemValue(value, "broadcast bytes limit for join operations", _bytes_limit)); > line too long (103 > 90) Assume you'll fix in next patch. You can run these checks locally the HEAD commit with: ./bin/jenkins/critique-gerrit-review.py --dryrun If you want to check whether they're resolved. Also ok to rely on the bot. http://gerrit.cloudera.org:8080/#/c/14690/4/testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test File testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test: PS4: Can you also add a negative test where it is below the threshold and doesn't switch to broadcast. And a negative test where it's above the threshold but the /* +broadcast */ hint overrides it. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 19 Nov 2019 19:39:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5067/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 18:38:50 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#4). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads such that in vast majority of the cases the parameter value does not need to be changed. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Setting this parameter to 0 causes it to be ignored. Testing: - Ran all regression tests on Jenkins successfully - Added a new unit test in PlannerTest that sets the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 8 files changed, 72 insertions(+), 5 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/4 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14690/4/be/src/service/query-options.cc@914 PS4, Line 914: ParseMemValue(value, "broadcast bytes limit for join operations", _bytes_limit)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/14690/4/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14690/4/common/thrift/ImpalaService.thrift@488 PS4, Line 488: line has trailing whitespace -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 4 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Tue, 19 Nov 2019 18:21:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 2: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/5039/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 16 Nov 2019 20:54:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/14690 to look at the new patch set (#3). Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Testing: - Ran all regression tests on Jenkins successfully - Added a new unit test in PlannerTest that sets the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 9 files changed, 75 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/3 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 3 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Aman Sinha has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. IMPALA-9146: Add a configurable limit for the size of broadcast input. Impala's DistributedPlanner may sometimes accidentally choose broadcast distribution for inputs that are larger than the destination executor's total memory. This could potentially happen if the cluster membership is not accurately known and the planner's cost computation of the broadcastCost vs partitionCost happens to favor the broadcast distribution. This causes spilling and severely affects performance. Although the DistributedPlanner does a mem_limit check before picking broadcast, the mem_limit is not an accurate reflection since it is assigned during admission control. As a safety here we introduce an explicit configurable limit: broadcast_bytes_limit for the size of the broadcast input and set it to default of 32GB. The default is chosen based on analysis of existing benchmark queries and representative workloads. If the estimated input size on the build side is greater than this threshold, the DistributedPlanner will fall back to a partition distribution. Testing: - Ran all regression tests on Jenkins successfully - Added a new unit test in PlannerTest that sets the broadcast_bytes_limit to a small value and checks whether the distributed plan does hash partitioning on the build side instead of broadcast. Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 --- M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java M fe/src/main/java/org/apache/impala/planner/PlannerContext.java M fe/src/test/java/org/apache/impala/planner/PlannerTest.java M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java A testdata/workloads/functional-planner/queries/PlannerTest/broadcast-bytes-limit.test 9 files changed, 75 insertions(+), 6 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/14690/2 -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha
[Impala-ASF-CR] IMPALA-9146: Add a configurable limit for the size of broadcast input.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/14690 ) Change subject: IMPALA-9146: Add a configurable limit for the size of broadcast input. .. Patch Set 2: (6 comments) http://gerrit.cloudera.org:8080/#/c/14690/2/be/src/service/query-options.cc File be/src/service/query-options.cc: http://gerrit.cloudera.org:8080/#/c/14690/2/be/src/service/query-options.cc@914 PS2, Line 914: ParseMemValue(value, "broadcast bytes limit for join operations", _bytes_limit)); line too long (103 > 90) http://gerrit.cloudera.org:8080/#/c/14690/2/common/thrift/ImpalaService.thrift File common/thrift/ImpalaService.thrift: http://gerrit.cloudera.org:8080/#/c/14690/2/common/thrift/ImpalaService.thrift@488 PS2, Line 488: line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/14690/2/common/thrift/ImpalaService.thrift@489 PS2, Line 489: // The max number of estimated bytes eligible for a Broadcast operation during a join. line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/14690/2/common/thrift/ImpalaService.thrift@490 PS2, Line 490: // If the planner thinks the total bytes sent to all destinations of a broadcast exchange line too long (91 > 90) http://gerrit.cloudera.org:8080/#/c/14690/2/common/thrift/ImpalaService.thrift@491 PS2, Line 491: // will exceed this limit, it will not consider a broadcast and instead fall back on a line has trailing whitespace http://gerrit.cloudera.org:8080/#/c/14690/2/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/14690/2/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@861 PS2, Line 861: Collections.emptySet()); tab used for whitespace -- To view, visit http://gerrit.cloudera.org:8080/14690 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ibe5639ca38acb72e0194aa80bc6ebb6cafb2acd9 Gerrit-Change-Number: 14690 Gerrit-PatchSet: 2 Gerrit-Owner: Aman Sinha Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 16 Nov 2019 20:12:29 + Gerrit-HasComments: Yes