[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 13: > Patch Set 13: > > > > Patch Set 12: Verified+1 Code-Review+2 > > > > > > Looked at the failed build, the errors seem unrelated flaky tests > > (e.g. IMPALA-10754) > > > > According to > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15800/testReport/junit/org.apache.impala.planner/CardinalityTest/testAggregationNodeGroupByCardinalityCapping/, > > it seems this patch introduced some non-determinism. > > This shouldn't be related, as the failed test is COUNT DISTINCT related, so > it shouldn't be affected by changes in sort. + the test is about FE > estimations, and shouldn't be affected by BE only changes. > > One possibility I see is that the changed order of elements could have > affected the tables written during dataload - but this shouldn't be a new > thing, as the equal elements were never deterministic during sorting. Thanks for the analysis Csaba! Yeah I think you are right. According to https://github.com/apache/impala/blob/master/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java#L42, we have {{CARDINALITY_TOLERANCE}} that is used to tune how tolerant we are with respect to a given cardinality estimate. The value is currently set to 0.05. Maybe we could consider increaseing this a bit, e.g., to 0.07. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 13 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 21 Feb 2022 22:41:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 13: > > Patch Set 12: Verified+1 Code-Review+2 > > > > Looked at the failed build, the errors seem unrelated flaky tests > (e.g. IMPALA-10754) > > According to > https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15800/testReport/junit/org.apache.impala.planner/CardinalityTest/testAggregationNodeGroupByCardinalityCapping/, > it seems this patch introduced some non-determinism. This shouldn't be related, as the failed test is COUNT DISTINCT related, so it shouldn't be affected by changes in sort. + the test is about FE estimations, and shouldn't be affected by BE only changes. One possibility I see is that the changed order of elements could have affected the tables written during dataload - but this shouldn't be a new thing, as the equal elements were never deterministic during sorting. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 13 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 21 Feb 2022 19:16:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 13: > Patch Set 12: Verified+1 Code-Review+2 > > Looked at the failed build, the errors seem unrelated flaky tests (e.g. > IMPALA-10754) According to https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/15800/testReport/junit/org.apache.impala.planner/CardinalityTest/testAggregationNodeGroupByCardinalityCapping/, it seems this patch introduced some non-determinism. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 13 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Fang-Yu Rao Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 21 Feb 2022 17:42:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. IMPALA-10961: Implementing adaptive 3-way quicksort in sorter Based on a 3-way partitioning implementation by Kurt Deschler. 3-way quicksort performs much better on data with large number of duplicates, but has a small regression in case of large NDV. This adaptive implementation keeps the advantages of both 2-way and 3-way quicksort. If duplicates are found during pivot selection (among the 3 randomly selected candidates),the 3-way partitioning function is called in SortHelper, otherwise partitioning goes 2-way. Some benchmark results: On a view created from 4 tpch_parquet lineitem tables Full sort, 1 node, 1 run - no spills (only in-memory sort is changed) Time of sorting adaptively during query execution compared to the original implementation (sort node profile): +--+++ | Test | Original 2-way | Adaptive Quicksort | +--+++ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +--+++ Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Reviewed-on: http://gerrit.cloudera.org:8080/18184 Reviewed-by: Impala Public Jenkins Reviewed-by: Csaba Ringhofer Tested-by: Csaba Ringhofer --- M be/src/runtime/sorter-internal.h M be/src/runtime/sorter-ir.cc M be/src/runtime/sorter.cc M be/src/util/tuple-row-compare.h 4 files changed, 189 insertions(+), 50 deletions(-) Approvals: Impala Public Jenkins: Looks good to me, approved Csaba Ringhofer: Looks good to me, approved; Verified -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 13 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 12: Verified+1 Code-Review+2 Looked at the failed build, the errors seem unrelated flaky tests (e.g. IMPALA-10754) -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 12 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Sat, 19 Feb 2022 18:57:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has removed a vote on this change. Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Removed Verified-1 by Impala Public Jenkins -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 12 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 12: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7855/ -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 12 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 14:44:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10183/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 11 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 08:35:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10182/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 10 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 08:16:33 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 12: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7855/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 12 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 08:11:49 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 12: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 12 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 08:11:48 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 11: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 11 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 08:11:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Hello Kurt Deschler, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18184 to look at the new patch set (#11). Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. IMPALA-10961: Implementing adaptive 3-way quicksort in sorter Based on a 3-way partitioning implementation by Kurt Deschler. 3-way quicksort performs much better on data with large number of duplicates, but has a small regression in case of large NDV. This adaptive implementation keeps the advantages of both 2-way and 3-way quicksort. If duplicates are found during pivot selection (among the 3 randomly selected candidates),the 3-way partitioning function is called in SortHelper, otherwise partitioning goes 2-way. Some benchmark results: On a view created from 4 tpch_parquet lineitem tables Full sort, 1 node, 1 run - no spills (only in-memory sort is changed) Time of sorting adaptively during query execution compared to the original implementation (sort node profile): +--+++ | Test | Original 2-way | Adaptive Quicksort | +--+++ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +--+++ Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 --- M be/src/runtime/sorter-internal.h M be/src/runtime/sorter-ir.cc M be/src/runtime/sorter.cc M be/src/util/tuple-row-compare.h 4 files changed, 189 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/11 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 11 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/18184/10/be/src/util/tuple-row-compare.h File be/src/util/tuple-row-compare.h: http://gerrit.cloudera.org:8080/#/c/18184/10/be/src/util/tuple-row-compare.h@138 PS10, Line 138: ordering_expr_evals_lhs_.data(), ordering_expr_evals_rhs_.data(), lhs, rhs); nit: weird indentation -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 10 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 18 Feb 2022 07:55:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Hello Kurt Deschler, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18184 to look at the new patch set (#10). Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. IMPALA-10961: Implementing adaptive 3-way quicksort in sorter Based on a 3-way partitioning implementation by Kurt Deschler. 3-way quicksort performs much better on data with large number of duplicates, but has a small regression in case of large NDV. This adaptive implementation keeps the advantages of both 2-way and 3-way quicksort. If duplicates are found during pivot selection (among the 3 randomly selected candidates),the 3-way partitioning function is called in SortHelper, otherwise partitioning goes 2-way. Some benchmark results: On a view created from 4 tpch_parquet lineitem tables Full sort, 1 node, 1 run - no spills (only in-memory sort is changed) Time of sorting adaptively during query execution compared to the original implementation (sort node profile): +--+++ | Test | Original 2-way | Adaptive Quicksort | +--+++ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +--+++ Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 --- M be/src/runtime/sorter-internal.h M be/src/runtime/sorter-ir.cc M be/src/runtime/sorter.cc M be/src/util/tuple-row-compare.h 4 files changed, 184 insertions(+), 50 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/10 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 10 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7842/ -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 15 Feb 2022 14:20:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: -Code-Review I couldn't deduce the cause of the failures - there was a crash in llvm code, not sure if it is related to this change. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 15 Feb 2022 08:15:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7842/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 15 Feb 2022 07:52:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/7840/ -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 15 Feb 2022 00:12:28 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 14 Feb 2022 19:06:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 9: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7840/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 9 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 14 Feb 2022 17:38:31 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Kurt Deschler has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 8 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 14 Feb 2022 15:34:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 8: Code-Review+1 (1 comment) The change looks great! http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@23 PS7, Line 23: +--+++ : | Test | Original 2-way | Adaptive Quicksort | : +--+++ : | select * order by l_linestatus, NDV=2: | 1 | 0.67 | : | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | : | select * order by l_shipmode, NDV=7 | 1 | 0.57 | : | large NDV, unique data | > Thank you for the suggestion. Should I write the actual (average) execution I'm OK with the ratio. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 8 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 Feb 2022 16:17:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 8 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 Feb 2022 15:22:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 8: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10142/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 8 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 Feb 2022 15:04:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Hello Kurt Deschler, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/18184 to look at the new patch set (#8). Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. IMPALA-10961: Implementing adaptive 3-way quicksort in sorter Based on a 3-way partitioning implementation by Kurt Deschler. 3-way quicksort performs much better on data with large number of duplicates, but has a small regression in case of large NDV. This adaptive implementation keeps the advantages of both 2-way and 3-way quicksort. If duplicates are found during pivot selection (among the 3 randomly selected candidates),the 3-way partitioning function is called in SortHelper, otherwise partitioning goes 2-way. Some benchmark results: On a view created from 4 tpch_parquet lineitem tables Full sort, 1 node, 1 run - no spills (only in-memory sort is changed) Time of sorting adaptively during query execution compared to the original implementation (sort node profile): +--+++ | Test | Original 2-way | Adaptive Quicksort | +--+++ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +--+++ Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 --- M be/src/runtime/sorter-internal.h M be/src/runtime/sorter-ir.cc M be/src/runtime/sorter.cc 3 files changed, 182 insertions(+), 49 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/8 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 8 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 7: (6 comments) http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@9 PS7, Line 9: 3way > nit: 3-way Done http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@23 PS7, Line 23: select * order by l_linestatus, NDV=2: : original 2-way vs adaptive quicksort - 1 : 0.67 : select l_shipmode order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.42 : select * order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.57 : large NDV, unique data: no significant difference > Using https://ozh.github.io/ascii-tables/ Thank you for the suggestion. Should I write the actual (average) execution time in ms or is the ratio better to compare the numbers? http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h File be/src/runtime/sorter-internal.h: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h@533 PS7, Line 533: hasEquals > nit: we name variables with underscores instead of camel case in the backen Done http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc File be/src/runtime/sorter-ir.cc: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@157 PS7, Line 157: reinterpret_cast(_tuple) > nit: Since it's being used at multiple places, maybe create a variable 'tem Good idea, makes the code more readable. I just followed the original naming convention. I think it was named temp_tuple instead of pivot to avoid confusion that it is not the actual pivot's tuple. I renamed it in Partition2way() as well. http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@260 PS7, Line 260: hasEquals > nit: should be 'has_equals' Done http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@320 PS7, Line 320: lt > Maybe rename it to 't1_cmp_t2'? makes sense to me, done -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Fri, 11 Feb 2022 14:25:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 7: (6 comments) I'm really excited about this change as it will greatly improve partitioned INSERTs. The change looks really good, I've only found a few naming issues. http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@9 PS7, Line 9: 3way nit: 3-way http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@23 PS7, Line 23: select * order by l_linestatus, NDV=2: : original 2-way vs adaptive quicksort - 1 : 0.67 : select l_shipmode order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.42 : select * order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.57 : large NDV, unique data: no significant difference Using https://ozh.github.io/ascii-tables/ (for these benchmark tables it doesn't matter if the line widths are longer) +--+++ | Test | Original 2-way | Adaptive Quicksort | +--+++ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +--+++ http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h File be/src/runtime/sorter-internal.h: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h@533 PS7, Line 533: hasEquals nit: we name variables with underscores instead of camel case in the backend. I.e. it should be 'has_equals' http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc File be/src/runtime/sorter-ir.cc: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@157 PS7, Line 157: reinterpret_cast(_tuple) nit: Since it's being used at multiple places, maybe create a variable 'temp_tuple_row' for it. And how about calling 'temp_tuple' as 'pivot_tuple' and 'temp_tuple_row' as 'pivot_tuple_row'? http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@260 PS7, Line 260: hasEquals nit: should be 'has_equals' http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@320 PS7, Line 320: lt Maybe rename it to 't1_cmp_t2'? -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Kurt Deschler Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Tue, 08 Feb 2022 19:34:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 7: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/10119/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Tue, 08 Feb 2022 11:27:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 ) Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. Patch Set 7: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Noemi Pap-Takacs Gerrit-Comment-Date: Tue, 08 Feb 2022 11:11:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18184 Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter .. IMPALA-10961: Implementing adaptive 3-way quicksort in sorter Based on a 3way partitioning implementation by Kurt Deschler. 3-way quicksort performs much better on data with large number of duplicates, but has a small regression in case of large NDV. This adaptive implementation keeps the advantages of both 2-way and 3-way quicksort. If duplicates are found during pivot selection (among the 3 randomly selected candidates),the 3-way partitioning function is called in SortHelper, otherwise partitioning goes 2-way. Some benchmark results: On a view created from 4 tpch_parquet lineitem tables Full sort, 1 node, 1 run - no spills (only in-memory sort is changed) Time of sorting adaptively during query execution compared to the original implementation (sort node profile): select * order by l_linestatus, NDV=2: original 2-way vs adaptive quicksort - 1 : 0.67 select l_shipmode order by l_shipmode, NDV=7: original 2-way vs adaptive quicksort - 1 : 0.42 select * order by l_shipmode, NDV=7: original 2-way vs adaptive quicksort - 1 : 0.57 large NDV, unique data: no significant difference Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 --- M be/src/runtime/sorter-internal.h M be/src/runtime/sorter-ir.cc M be/src/runtime/sorter.cc 3 files changed, 174 insertions(+), 43 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/7 -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Noemi Pap-Takacs