Noemi Pap-Takacs has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18184 )


Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
......................................................................

IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

Based on a 3-way partitioning implementation by Kurt Deschler.
3-way quicksort performs much better on data with a large number of
duplicates, but has a small regression when the NDV (number of
distinct values) is large.
This adaptive implementation keeps the advantages of both 2-way
and 3-way quicksort. If duplicates are found during pivot selection
(among the 3 randomly selected candidates), the 3-way partitioning
function is called in SortHelper; otherwise partitioning goes 2-way.
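
For readers unfamiliar with the technique, the selection logic described
above can be sketched roughly as follows. This is an illustrative,
standalone sketch on a vector of ints, not the code in this patch: every
name in it (Partition2Way, Partition3Way, MedianIndex, AdaptiveSort) is
hypothetical, the 2-way step uses a simple Lomuto-style partition chosen
for brevity, and the candidates are sampled with replacement, so a
repeated index also takes the 3-way path (harmless for correctness).

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

using Index = std::ptrdiff_t;

// 3-way partition of v[lo, hi) around the value `pivot` (must occur in range).
// Afterwards: [lo, lt) < pivot, [lt, gt) == pivot, [gt, hi) > pivot.
std::pair<Index, Index> Partition3Way(std::vector<int>& v, Index lo, Index hi,
                                      int pivot) {
  assert(lo < hi);
  Index lt = lo, i = lo, gt = hi;
  while (i < gt) {
    if (v[i] < pivot) {
      std::swap(v[lt], v[i]);
      ++lt;
      ++i;
    } else if (v[i] > pivot) {
      --gt;
      std::swap(v[i], v[gt]);  // v[i] is now unexamined, so do not advance i
    } else {
      ++i;
    }
  }
  return {lt, gt};
}

// 2-way (Lomuto-style) partition of v[lo, hi) around v[pivot_idx].
// Returns the final position of the pivot element.
Index Partition2Way(std::vector<int>& v, Index lo, Index hi, Index pivot_idx) {
  assert(lo < hi);
  std::swap(v[pivot_idx], v[hi - 1]);  // park the pivot at the end
  const int pivot = v[hi - 1];
  Index store = lo;
  for (Index i = lo; i < hi - 1; ++i) {
    if (v[i] < pivot) {
      std::swap(v[i], v[store]);
      ++store;
    }
  }
  std::swap(v[store], v[hi - 1]);  // move the pivot to its final slot
  return store;
}

// Index of the median of three pairwise-distinct values v[a], v[b], v[c].
Index MedianIndex(const std::vector<int>& v, Index a, Index b, Index c) {
  if ((v[a] < v[b]) != (v[a] < v[c])) return a;
  if ((v[b] < v[a]) != (v[b] < v[c])) return b;
  return c;
}

// Sorts v[lo, hi): 3-way if the sampled candidates contain a duplicate,
// 2-way with a median-of-three pivot otherwise.
void AdaptiveSort(std::vector<int>& v, Index lo, Index hi, std::mt19937& rng) {
  if (hi - lo < 2) return;
  std::uniform_int_distribution<Index> pick(lo, hi - 1);
  const Index a = pick(rng), b = pick(rng), c = pick(rng);
  const bool a_is_dup = v[a] == v[b] || v[a] == v[c];
  if (a_is_dup || v[b] == v[c]) {
    const int pivot = a_is_dup ? v[a] : v[b];  // the duplicated value
    const std::pair<Index, Index> eq = Partition3Way(v, lo, hi, pivot);
    AdaptiveSort(v, lo, eq.first, rng);   // the equal run is already in place
    AdaptiveSort(v, eq.second, hi, rng);
  } else {
    const Index p = Partition2Way(v, lo, hi, MedianIndex(v, a, b, c));
    AdaptiveSort(v, lo, p, rng);
    AdaptiveSort(v, p + 1, hi, rng);
  }
}

// Convenience wrapper; the fixed seed is arbitrary.
void AdaptiveSort(std::vector<int>& v) {
  std::mt19937 rng(12345);
  AdaptiveSort(v, 0, static_cast<Index>(v.size()), rng);
}
```

Calling AdaptiveSort(values) on a std::vector<int> sorts it in place. With
few distinct values most recursion levels see a duplicate among the three
candidates and skip the whole run of pivot-equal elements, while unique
data almost always stays on the 2-way path.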

Some benchmark results:
Data: a view created from 4 tpch_parquet lineitem tables.
Setup: full sort, 1 node, 1 run, no spills (only the in-memory sort
is changed).
Sort time (from the sort node profile) during query execution, given
as the ratio original 2-way : adaptive quicksort:

select * order by l_linestatus, NDV=2:
  1 : 0.67
select l_shipmode order by l_shipmode, NDV=7:
  1 : 0.42
select * order by l_shipmode, NDV=7:
  1 : 0.57
large NDV, unique data:
  no significant difference

Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
---
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
3 files changed, 174 insertions(+), 43 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/7
--
To view, visit http://gerrit.cloudera.org:8080/18184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 7
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
