Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/3990

to look at the new patch set (#16).

Change subject: Predicate evaluation pushdown
......................................................................

Predicate evaluation pushdown

The premise of this patch is to avoid the excessive use of CPU when
evaluating column predicates in specific cases. Dictionary blocks, for
instance, can evaluate predicates by checking whether a string's
codeword matches with the predicate, rather than by doing string
comparisons.

This patch uses a bitset to represent the set of codewords that match
a given predicate on dictionary-encoded columns. Certain decoders now
have the ability to evaluate a predicate without eagerly copying all
of their underlying data into a buffer first. Rather, the decoders can
first evaluate the predicate and copy only when needed.

Since dictionary encoding relies on plain encoding when a dictionary
gets too large, plain encoding also supports decoder-level evaluation.
While lacking the benefit from reduced string comparisons, this
optimization still improves scan speeds by avoiding excessive copies.

See the performance doc for a look into the performance differences
for dictionary encoding and plain encoding:
https://github.com/anjuwong/kudu/blob/pred-pushdown/docs/decoder-eval-perf.md

See the design-doc for predicate-eval-pushdown for a brief overview of the
considered implementations:
https://github.com/anjuwong/kudu/blob/sorted-dict-block/docs/design-docs/predicate-eval-pushdown.md

More in-depth analysis and benchmarking in upcoming blog post.

Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
---
M src/kudu/cfile/binary_dict_block.cc
M src/kudu/cfile/binary_dict_block.h
M src/kudu/cfile/binary_plain_block.cc
M src/kudu/cfile/binary_plain_block.h
M src/kudu/cfile/block_encodings.h
M src/kudu/cfile/cfile-test-base.h
M src/kudu/cfile/cfile-test.cc
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_reader.h
M src/kudu/cfile/cfile_util.cc
M src/kudu/cfile/encoding-test.cc
A src/kudu/common/column_materialization_context.h
M src/kudu/common/column_predicate.cc
M src/kudu/common/column_predicate.h
M src/kudu/common/generic_iterators-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/generic_iterators.h
M src/kudu/common/iterator.h
M src/kudu/common/rowblock.h
M src/kudu/common/schema.h
M src/kudu/tablet/CMakeLists.txt
M src/kudu/tablet/cfile_set-test.cc
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/delta_applier.cc
M src/kudu/tablet/delta_applier.h
M src/kudu/tablet/delta_iterator_merger.cc
M src/kudu/tablet/delta_iterator_merger.h
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/deltamemstore.cc
M src/kudu/tablet/deltamemstore.h
A src/kudu/tablet/tablet-decoder-eval-test.cc
M src/kudu/tablet/tablet-test-util.h
35 files changed, 930 insertions(+), 124 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/3990/16
-- 
To view, visit http://gerrit.cloudera.org:8080/3990
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I31e4cce21e99f63b089d7c84410af8ed914cb576
Gerrit-PatchSet: 16
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andrew Wong <andrew.w...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com>
Gerrit-Reviewer: Andrew Wong
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>

Reply via email to