Bankim Bhavsar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15913 )

Change subject: [perf] Check range predicate first while evaluating Bloom filter predicate
......................................................................

[perf] Check range predicate first while evaluating Bloom filter predicate

Range predicates can be specified along with a Bloom filter predicate
for the same column. It's cheaper to check the value against the range
predicate and exit early if it is out of bounds than to compute the
hash and then look up the value in the Bloom filter.

This case is common when Impala pushes down a Bloom filter predicate,
as it will likely be accompanied by a min-max filter (i.e. a range
predicate) on the same column.

Tests:
Added a test case that scans 100M column values. Across iterations,
observed an improvement of 20-30% when the range predicate check
prevents the hash computation and Bloom filter lookup. No noticeable
regression was seen for the case where values are within the range
bounds.

Without perf change:
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.953s user 0.001s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.767s user 0.001s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.899s user 0.000s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.775s user 0.000s sys 0.001s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.983s user 0.000s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.832s user 0.001s sys 0.000s

With perf change:
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.725s user 0.001s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.847s user 0.000s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.664s user 0.000s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.794s user 0.001s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when no rows expected: real 0.706s user 0.001s sys 0.000s
Time spent TestKuduBloomFilterPredicateBenchmark: Counting rows when range predicate doesn't prune: real 0.774s user 0.000s sys 0.000s

Change-Id: I8451d6ddfe1fbdf307b3e9f2cc23a8d06e655ba3
---
M src/kudu/client/predicate-test.cc
M src/kudu/common/column_predicate.h
2 files changed, 69 insertions(+), 42 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/13/15913/1
--
To view, visit http://gerrit.cloudera.org:8080/15913
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8451d6ddfe1fbdf307b3e9f2cc23a8d06e655ba3
Gerrit-Change-Number: 15913
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ban...@cloudera.com>
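
For readers unfamiliar with the idea in the commit message, the evaluation order can be illustrated with a small standalone sketch. This is not the code from src/kudu/common/column_predicate.h: the names ToyBloomFilter and ToyBloomRangePredicate, the hash function, and the choice of an inclusive lower bound and exclusive upper bound are all assumptions made for illustration only. The sketch simply shows a cheap bounds comparison running before the hash computation and Bloom filter lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy Bloom filter over int64 values (hypothetical; not Kudu's implementation).
class ToyBloomFilter {
 public:
  explicit ToyBloomFilter(size_t num_bits) : bits_(num_bits, false), lookups_(0) {
    assert(num_bits > 0);
  }

  void Insert(int64_t v) {
    const uint64_t h = Hash(v);
    bits_[h % bits_.size()] = true;
    bits_[(h >> 32) % bits_.size()] = true;
  }

  // May return false positives, never false negatives.
  // Every call pays for a hash computation and is counted.
  bool MayContain(int64_t v) const {
    ++lookups_;
    const uint64_t h = Hash(v);
    return bits_[h % bits_.size()] && bits_[(h >> 32) % bits_.size()];
  }

  size_t lookups() const { return lookups_; }

 private:
  // splitmix64-style mixer, chosen only to keep the sketch self-contained.
  static uint64_t Hash(int64_t v) {
    uint64_t x = static_cast<uint64_t>(v) + 0x9e3779b97f4a7c15ULL;
    x = (x ^ (x >> 30)) * 0xbf58476d1ce4e5b9ULL;
    x = (x ^ (x >> 27)) * 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
  }

  std::vector<bool> bits_;
  mutable size_t lookups_;
};

// Bloom filter predicate with optional range bounds on the same column.
// Assumption: lower bound is inclusive, upper bound is exclusive.
struct ToyBloomRangePredicate {
  bool has_lower;
  int64_t lower;
  bool has_upper;
  int64_t upper;
  const ToyBloomFilter* bloom;

  bool Evaluate(int64_t v) const {
    // Cheap comparisons first: exit early if the value is out of bounds.
    if (has_lower && v < lower) return false;
    if (has_upper && v >= upper) return false;
    // Only in-range values reach the hash computation and filter lookup.
    return bloom->MayContain(v);
  }
};

// Counts values that pass the predicate, similar in spirit to a scan.
size_t CountMatches(const ToyBloomRangePredicate& pred,
                    const std::vector<int64_t>& values) {
  size_t n = 0;
  for (size_t i = 0; i < values.size(); ++i) {
    if (pred.Evaluate(values[i])) ++n;
  }
  return n;
}
```

In this sketch, a scan over values that all fall outside the bounds never touches the filter, which corresponds to the "no rows expected" benchmark case. When the range does not prune anything, each value costs up to two extra comparisons before the lookup.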