Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/18184 )
Change subject: IMPALA-10961: Implementing adaptive 3-way quicksort in sorter ...................................................................... Patch Set 7: (6 comments) I'm really excited about this change as it will greatly improve partitioned INSERTs. The change looks really good, I've only found a few naming issues. http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@9 PS7, Line 9: 3way nit: 3-way http://gerrit.cloudera.org:8080/#/c/18184/7//COMMIT_MSG@23 PS7, Line 23: select * order by l_linestatus, NDV=2: : original 2-way vs adaptive quicksort - 1 : 0.67 : select l_shipmode order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.42 : select * order by l_shipmode, NDV=7: : original 2-way vs adaptive quicksort - 1 : 0.57 : large NDV, unique data: no significant difference Using https://ozh.github.io/ascii-tables/ (for these benchmark tables it doesn't matter if the line widths are longer) +----------------------------------------------+----------------+--------------------+ | Test | Original 2-way | Adaptive Quicksort | +----------------------------------------------+----------------+--------------------+ | select * order by l_linestatus, NDV=2: | 1 | 0.67 | | select l_shipmode order by l_shipmode, NDV=7 | 1 | 0.42 | | select * order by l_shipmode, NDV=7 | 1 | 0.57 | | large NDV, unique data | 1 | 1 | (no difference) +----------------------------------------------+----------------+--------------------+ http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h File be/src/runtime/sorter-internal.h: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-internal.h@533 PS7, Line 533: hasEquals nit: we name variables with underscores instead of camel case in the backend. I.e. it should be 'has_equals' http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc File be/src/runtime/sorter-ir.cc: http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@157 PS7, Line 157: reinterpret_cast<TupleRow*>(&temp_tuple) nit: Since it's being used at multiple places, maybe create a variable 'temp_tuple_row' for it. And how about calling 'temp_tuple' as 'pivot_tuple' and 'temp_tuple_row' as 'pivot_tuple_row'? http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@260 PS7, Line 260: hasEquals nit: should be 'has_equals' http://gerrit.cloudera.org:8080/#/c/18184/7/be/src/runtime/sorter-ir.cc@320 PS7, Line 320: lt Maybe rename it to 't1_cmp_t2'? -- To view, visit http://gerrit.cloudera.org:8080/18184 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I81e7b36a04a43de3b83e6aeee49ca0943f0bf202 Gerrit-Change-Number: 18184 Gerrit-PatchSet: 7 Gerrit-Owner: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com> Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com> Gerrit-Comment-Date: Tue, 08 Feb 2022 19:34:58 +0000 Gerrit-HasComments: Yes