Hello Kurt Deschler, Zoltan Borok-Nagy, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/18184

to look at the new patch set (#10).

Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter
......................................................................

IMPALA-10961: Implementing adaptive 3-way quicksort in sorter

Based on a 3-way partitioning implementation by Kurt Deschler.
3-way quicksort performs much better on data with a large number of
duplicates, but has a small regression when the NDV is large.
This adaptive implementation keeps the advantages of both 2-way
and 3-way quicksort. If duplicates are found during pivot selection
(among the 3 randomly selected candidates), the 3-way partitioning
function is called in SortHelper; otherwise, partitioning goes 2-way.
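
The selection logic described above can be sketched roughly as follows.
This is a minimal standalone illustration on a vector of ints, not the
code in this patch: the names AdaptiveSort, AdaptiveSortRange and
MedianOfThree are hypothetical, and the real sorter works on tuples
through a comparator rather than on plain integers.

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Illustrative sketch only: all names here are hypothetical, not Impala's.

// Returns the median of three values; used as the pivot.
static int MedianOfThree(int a, int b, int c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Sorts v[lo..hi] (inclusive bounds) with adaptive 2-way/3-way partitioning.
static void AdaptiveSortRange(std::vector<int>& v, int lo, int hi,
                              std::mt19937& rng) {
  if (lo >= hi) return;
  assert(lo >= 0 && hi < static_cast<int>(v.size()));

  // Pick 3 random candidates. Sampling is with replacement, so the same
  // index may be drawn twice; that simply counts as a duplicate here.
  std::uniform_int_distribution<int> pick(lo, hi);
  const int a = v[pick(rng)], b = v[pick(rng)], c = v[pick(rng)];
  const int pivot = MedianOfThree(a, b, c);
  const bool saw_duplicate = (a == b || b == c || a == c);

  if (saw_duplicate) {
    // 3-way partition: [lo, lt) < pivot, [lt, gt] == pivot, (gt, hi] > pivot.
    int lt = lo, i = lo, gt = hi;
    while (i <= gt) {
      if (v[i] < pivot) {
        std::swap(v[lt++], v[i++]);
      } else if (v[i] > pivot) {
        std::swap(v[i], v[gt--]);
      } else {
        ++i;
      }
    }
    // Elements equal to the pivot are already in place and are skipped.
    AdaptiveSortRange(v, lo, lt - 1, rng);
    AdaptiveSortRange(v, gt + 1, hi, rng);
  } else {
    // 2-way (Hoare-style) partition around the pivot value.
    int i = lo, j = hi;
    while (i <= j) {
      while (v[i] < pivot) ++i;
      while (v[j] > pivot) --j;
      if (i <= j) std::swap(v[i++], v[j--]);
    }
    AdaptiveSortRange(v, lo, j, rng);
    AdaptiveSortRange(v, i, hi, rng);
  }
}

// Entry point; the fixed seed only makes the sketch reproducible.
void AdaptiveSort(std::vector<int>& v) {
  std::mt19937 rng(12345);
  AdaptiveSortRange(v, 0, static_cast<int>(v.size()) - 1, rng);
}
```

Calling AdaptiveSort on a vector sorts it in ascending order. The point
of the 3-way branch is that all elements equal to the pivot are excluded
from both recursive calls, which is where the gain on low-NDV input
comes from.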

Some benchmark results, measured on a view created from 4 tpch_parquet
lineitem tables: full sort, 1 node, 1 run, no spills (only the in-memory
sort is changed). The table shows the sort time of the adaptive
implementation during query execution relative to the original
implementation (taken from the sort node profile):


+----------------------------------------------+----------------+--------------------+
|                     Test                     | Original 2-way | Adaptive Quicksort |
+----------------------------------------------+----------------+--------------------+
| select * order by l_linestatus, NDV=2        |              1 |               0.67 |
| select l_shipmode order by l_shipmode, NDV=7 |              1 |               0.42 |
| select * order by l_shipmode, NDV=7          |              1 |               0.57 |
| large NDV, unique data                       |              1 |                  1 | (no difference)
+----------------------------------------------+----------------+--------------------+

Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
---
M be/src/runtime/sorter-internal.h
M be/src/runtime/sorter-ir.cc
M be/src/runtime/sorter.cc
M be/src/util/tuple-row-compare.h
4 files changed, 184 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/84/18184/10
--
To view, visit http://gerrit.cloudera.org:8080/18184
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202
Gerrit-Change-Number: 18184
Gerrit-PatchSet: 10
Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
