Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18184 )
Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
......................................................................

IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

Based on a 3-way partitioning implementation by Kurt Deschler.
3-way quicksort performs much better on data with a large number of
duplicates, but has a small regression in case of large NDV.
This adaptive implementation keeps the advantages of both 2-way and
3-way quicksort. If duplicates are found during pivot selection
(among the 3 randomly selected candidates), the 3-way partitioning
function is called in SortHelper, otherwise partitioning goes 2-way.

Some benchmark results:
On a view created from 4 tpch_parquet lineitem tables
Full sort, 1 node, 1 run - no spills (only in-memory sort is changed)
Time of sorting adaptively during query execution compared to the
original implementation (sort node profile):

select * order by l_linestatus, NDV=2:
  original 2-way vs adaptive quicksort - 1 : 0.67
select l_shipmode order by l_shipmode, NDV=7:
  original 2-way vs adaptive quicksort - 1 : 0.42
select * order by l_shipmode, NDV=7:
  original 2-way vs adaptive quicksort - 1 : 0.57
large NDV, unique data:
  no significant difference

Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
---
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
3 files changed, 174 insertions(+), 43 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/7
--
To view, visit http://gerrit.cloudera.org:8080/18184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
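
For readers who want to see the adaptive idea from the commit message in isolation, here is a minimal stand-alone sketch. It is not the code in this patch: it sorts a plain std::vector<int> rather than Impala tuples, and every name in it (the sketch namespace, MedianOf3, Partition3Way, AdaptiveSortRange, AdaptiveSort, the fixed seed) is hypothetical. The 2-way branch uses std::partition as a stand-in for whatever 2-way partitioning the sorter actually uses. It only illustrates the rule described above: pick three random candidates, and if any two compare equal, partition 3-way, otherwise partition 2-way.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical illustration only; not the Impala Sorter API.
namespace sketch {

// Median of three values. If two of them are equal, the median is
// always the duplicated value.
inline int MedianOf3(int a, int b, int c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// 3-way (Dutch national flag) partition of v[lo, hi) around pivot.
// Returns {lt, gt} such that v[lo, lt) < pivot, v[lt, gt) == pivot
// and v[gt, hi) > pivot.
inline std::pair<std::size_t, std::size_t> Partition3Way(
    std::vector<int>& v, std::size_t lo, std::size_t hi, int pivot) {
  assert(lo <= hi && hi <= v.size());
  std::size_t lt = lo, i = lo, gt = hi;
  while (i < gt) {
    if (v[i] < pivot) {
      std::swap(v[lt++], v[i++]);
    } else if (v[i] > pivot) {
      std::swap(v[i], v[--gt]);
    } else {
      ++i;
    }
  }
  return {lt, gt};
}

// Sorts v[lo, hi), choosing the partitioning scheme per call.
inline void AdaptiveSortRange(std::vector<int>& v, std::size_t lo,
                              std::size_t hi, std::mt19937& rng) {
  if (hi - lo < 2) return;
  // Three randomly selected pivot candidates from the current range.
  std::uniform_int_distribution<std::size_t> pick(lo, hi - 1);
  const int a = v[pick(rng)], b = v[pick(rng)], c = v[pick(rng)];
  const int pivot = MedianOf3(a, b, c);
  if (a == b || b == c || a == c) {
    // Duplicates seen: elements equal to the pivot are placed once
    // and excluded from both recursive calls.
    const std::pair<std::size_t, std::size_t> eq =
        Partition3Way(v, lo, hi, pivot);
    AdaptiveSortRange(v, lo, eq.first, rng);
    AdaptiveSortRange(v, eq.second, hi, rng);
  } else {
    // Candidates are distinct: plain 2-way split into "< pivot" and
    // ">= pivot". Both halves are non-empty because the median has a
    // smaller candidate below it and is itself in the upper half.
    const auto first = v.begin() + static_cast<std::ptrdiff_t>(lo);
    const auto last = v.begin() + static_cast<std::ptrdiff_t>(hi);
    const auto mid =
        std::partition(first, last, [pivot](int x) { return x < pivot; });
    const std::size_t cut = static_cast<std::size_t>(mid - v.begin());
    AdaptiveSortRange(v, lo, cut, rng);
    AdaptiveSortRange(v, cut, hi, rng);
  }
}

// Convenience wrapper; the fixed default seed keeps runs repeatable.
inline void AdaptiveSort(std::vector<int>& v, unsigned seed = 42) {
  std::mt19937 rng(seed);
  AdaptiveSortRange(v, 0, v.size(), rng);
}

}  // namespace sketch
```

On a low-NDV input most calls take the 3-way branch, so each distinct value is settled in a single pass over its range. On unique data the candidates rarely collide and the cheaper 2-way split is used, which matches the motivation given in the commit message.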