Hello Daniel Becker, Gabor Kaszab, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/21435

to look at the new patch set (#6).

Change subject: IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder
......................................................................

IMPALA-13088: (part 1) Improve build batch processing of IcebergDeleteBuilder

When there are lots of delete records, the IcebergDeleteBuilder can
become a bottleneck. Since the left side of the JOIN is blocked on the
build side, any improvement we make here significantly improves
Iceberg V2 table scanning.

Improvements of this patch:
* Use a vector of vectors to collect the position delete records.
  This way we can avoid large reallocations and copies.
* Insert large ranges from the build batches into the collected
  delete records instead of inserting them one by one.

Measurements
Local measurement with 824 Million position delete records:
JOIN BUILD: ~32s -> ~14s (6s is the final sorting)

40-node cluster with 68.5 Billion position delete records:
JOIN BUILD: 4m15s -> 1m45s (1m7s is the final sorting)

Parallelization of the final sort will be added in a follow-up CR.

Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
---
M be/src/exec/iceberg-delete-builder.cc
M be/src/exec/iceberg-delete-builder.h
2 files changed, 113 insertions(+), 27 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21435/6
--
To view, visit http://gerrit.cloudera.org:8080/21435
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I14541a064a522d4780fb5f02636736259e79b9cf
Gerrit-Change-Number: 21435
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
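
Illustration for reviewers: the two improvements named in the commit message (a vector of vectors, and range-wise inserts from each build batch) can be sketched roughly as below. This is not the code in be/src/exec/iceberg-delete-builder.cc. The class name ChunkedPositions, its methods, and the fixed chunk capacity are hypothetical and only show the general technique under those assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical collector (not Impala's class): positions are stored in
// fixed-capacity chunks, so growing the collection never copies elements
// that were already collected. Only the outer vector of chunk handles can
// reallocate, and that moves the inner vectors instead of copying them.
class ChunkedPositions {
 public:
  explicit ChunkedPositions(std::size_t chunk_capacity)
      : chunk_capacity_(chunk_capacity) {
    assert(chunk_capacity_ > 0);
  }

  // Appends the range [begin, end) with one range insert per touched chunk
  // instead of one push_back per record.
  void AppendRange(const std::int64_t* begin, const std::int64_t* end) {
    while (begin != end) {
      if (chunks_.empty() || chunks_.back().size() == chunk_capacity_) {
        chunks_.emplace_back();
        chunks_.back().reserve(chunk_capacity_);
      }
      std::vector<std::int64_t>& chunk = chunks_.back();
      const std::size_t room = chunk_capacity_ - chunk.size();
      const std::size_t n =
          std::min(room, static_cast<std::size_t>(end - begin));
      chunk.insert(chunk.end(), begin, begin + n);
      begin += n;
      size_ += n;
    }
  }

  // Final step: flatten the chunks once into an exactly sized vector and
  // sort it. This sketch sorts single-threaded.
  std::vector<std::int64_t> SortedPositions() const {
    std::vector<std::int64_t> result;
    result.reserve(size_);
    for (const std::vector<std::int64_t>& chunk : chunks_) {
      result.insert(result.end(), chunk.begin(), chunk.end());
    }
    std::sort(result.begin(), result.end());
    return result;
  }

  std::size_t size() const { return size_; }
  std::size_t num_chunks() const { return chunks_.size(); }

 private:
  std::size_t chunk_capacity_;
  std::size_t size_ = 0;
  std::vector<std::vector<std::int64_t>> chunks_;
};
```

In this sketch a caller would invoke AppendRange once per build batch, passing pointers to the batch's contiguous position values, and call SortedPositions once after the last batch. A single growing std::vector would instead copy every collected element each time it reallocates, which is the cost the commit message says the patch avoids.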