Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/3990

to look at the new patch set (#16).

Change subject: Predicate evaluation pushdown
......................................................................

Predicate evaluation pushdown

The premise of this patch is to avoid excessive CPU use when evaluating
column predicates in specific cases. Dictionary blocks, for instance, can
evaluate predicates by checking whether a string's codeword matches the
predicate, rather than by doing string comparisons. This patch uses a
bitset to represent the set of codewords that match a given predicate on
dictionary-encoded columns.

Certain decoders can now evaluate a predicate without eagerly copying all
of their underlying data into a buffer first. Instead, the decoders
evaluate the predicate first and copy only when needed.

Since dictionary encoding falls back to plain encoding when a dictionary
gets too large, plain encoding also supports decoder-level evaluation.
Although plain encoding does not benefit from reduced string comparisons,
this optimization still improves scan speeds by avoiding excessive copies.

See the performance doc for a look into the performance differences for
dictionary encoding and plain encoding:
https://github.com/anjuwong/kudu/blob/pred-pushdown/docs/decoder-eval-perf.md

See the design doc for predicate-eval-pushdown for a brief overview of the
considered implementations:
https://github.com/anjuwong/kudu/blob/sorted-dict-block/docs/design-docs/predicate-eval-pushdown.md

More in-depth analysis and benchmarking will follow in an upcoming blog
post.
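
For readers unfamiliar with the technique, the codeword-bitset idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this patch: `DictBlock`, `RangePredicate`, `BuildMatchingCodewords`, and `ScanWithPushdown` are hypothetical names, and `std::vector<bool>` stands in for whatever bitset type the patch actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a dictionary-encoded block: each distinct string
// is stored once, and every row refers to it by codeword (its index in
// `dictionary`). Not Kudu's actual block layout.
struct DictBlock {
  std::vector<std::string> dictionary;
  std::vector<uint32_t> codewords;  // one entry per row
};

// Hypothetical range predicate: lower <= value < upper.
struct RangePredicate {
  std::string lower;
  std::string upper;
  bool Matches(const std::string& value) const {
    return lower <= value && value < upper;
  }
};

// Evaluate the predicate once per dictionary entry instead of once per row.
// Bit i is set iff dictionary[i] satisfies the predicate.
std::vector<bool> BuildMatchingCodewords(const DictBlock& block,
                                         const RangePredicate& pred) {
  std::vector<bool> matching(block.dictionary.size());
  for (std::size_t i = 0; i < block.dictionary.size(); ++i) {
    matching[i] = pred.Matches(block.dictionary[i]);
  }
  return matching;
}

// Scan the rows. The per-row check is a bit lookup on the codeword, and a
// string is copied into the output only when its row is selected.
std::vector<std::string> ScanWithPushdown(const DictBlock& block,
                                          const RangePredicate& pred,
                                          std::vector<bool>* selection) {
  const std::vector<bool> matching = BuildMatchingCodewords(block, pred);
  selection->assign(block.codewords.size(), false);
  std::vector<std::string> out;
  for (std::size_t row = 0; row < block.codewords.size(); ++row) {
    const uint32_t code = block.codewords[row];
    assert(code < matching.size());  // every codeword must be in the dictionary
    if (matching[code]) {
      (*selection)[row] = true;
      out.push_back(block.dictionary[code]);
    }
  }
  return out;
}
```

In this sketch the number of string comparisons depends on the dictionary size rather than the row count, which is where the CPU saving for dictionary-encoded columns would come from. A plain-encoded block has no codewords to look up, so it would still compare each value, but it could skip the copy for rows that do not match.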

Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/encoding-test.cc
A src/kudu/common/column_materialization_context.h
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/common/iterator.h
M src/kudu/common/rowblock.h
M src/kudu/common/schema.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/delta_applier.cc
M src/kudu/tablet/delta_applier.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
A src/kudu/tablet/tablet-decoder-eval-test.cc
M src/kudu/tablet/tablet-test-util.h
35 files changed, 930 insertions(+), 124 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/3990/16
--
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>