Hello Tidy Bot, Alexey Serbin, Ashwani Raina, Yingchun Lai, Yifan Zhang, Kudu 
Jenkins, Abhishek Chennaka, KeDeng, Wang Xixu,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19794

to look at the new patch set (#6).

Change subject: [cpp-client] KUDU-3455 Reduce space complexity and speed up 
hash partition pruning for in-list predicate
......................................................................

[cpp-client] KUDU-3455 Reduce space complexity and speed up hash partition 
pruning for in-list predicate

This patch follows up on https://gerrit.cloudera.org/c/19568/. As noted
in that patch, the logic for pruning hash partitions for in-list
predicates in the Kudu C++ client also has high space complexity and is
slow. The old algorithm must keep all intermediate objects, because a
hash can be computed only once an object is complete.

This patch fixes these problems by introducing a recursive algorithm,
the same as the one used for the Java client in
https://gerrit.cloudera.org/c/19568/.
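
To illustrate the general idea of recursing over in-list values while
keeping only one partial key alive, here is a minimal sketch. It is not
the code in partition_pruner.cc: ToyHash, Walk and SelectBuckets are
hypothetical names, the key is simplified to a vector of int64 values,
and the early exit once every bucket is selected is an assumption made
for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the real hash function (FNV-1a style mixing).
inline uint64_t ToyHash(const std::vector<int64_t>& key) {
  uint64_t h = 1469598103934665603ULL;
  for (int64_t v : key) {
    h ^= static_cast<uint64_t>(v);
    h *= 1099511628211ULL;
  }
  return h;
}

// Depth-first walk over in_lists[0] x in_lists[1] x ... using one shared
// key buffer. Only the recursion stack (one frame per column) and that
// single buffer are alive at any time; no list of partial keys is built.
// Returns true once every bucket has been selected, so callers can stop.
inline bool Walk(const std::vector<std::vector<int64_t>>& in_lists,
                 size_t col,
                 std::vector<int64_t>* key,
                 std::vector<bool>* selected,
                 size_t* remaining) {
  if (col == in_lists.size()) {
    // The key is complete only here, so this is the first point at which
    // it can be hashed.
    const size_t bucket = ToyHash(*key) % selected->size();
    if (!(*selected)[bucket]) {
      (*selected)[bucket] = true;
      --*remaining;
    }
    return *remaining == 0;
  }
  for (int64_t v : in_lists[col]) {
    (*key)[col] = v;  // overwrite in place instead of copying the key
    if (Walk(in_lists, col + 1, key, selected, remaining)) {
      return true;
    }
  }
  return false;
}

// Returns, for each hash bucket, whether some in-list combination maps to it.
inline std::vector<bool> SelectBuckets(
    const std::vector<std::vector<int64_t>>& in_lists, size_t num_buckets) {
  assert(num_buckets > 0);
  std::vector<bool> selected(num_buckets, false);
  std::vector<int64_t> key(in_lists.size(), 0);
  size_t remaining = num_buckets;
  Walk(in_lists, 0, &key, &selected, &remaining);
  return selected;
}
```

In this sketch the extra memory is the key buffer plus one stack frame
per column, regardless of how many combinations the in-lists produce.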

Compared with the Java client, the C++ client is less likely to hit an
OOM condition because it does not keep as many intermediate results.
Even so, the optimization is worthwhile. The benefit depends on the
in-list length and the number of primary key columns: the longer the
in-lists, the larger the gain. For example, in
PartitionPrunerTest::TestMultiColumnInListHashPruningManyValues,
with 10 key columns and kMaxInListLength=50, the old algorithm's memory
cost may reach 600MB, while the new algorithm's memory cost is
negligible (it needs only one object and a few stack frames for
context). The new algorithm is also much faster; some sample results:

combination_count: 5554006920000, old cost: 428238us, new cost: 713us, speedup: 600.6x
combination_count: 89083783664568, old cost: 2764924us, new cost: 1145us, speedup: 2414.7x
combination_count: 27194091724800, old cost: 1610475us, new cost: 1151us, speedup: 1399.2x
combination_count: 7116622216704, old cost: 34544289us, new cost: 375us, speedup: 92118.1x
combination_count: 37570734489600, old cost: 1733205us, new cost: 901us, speedup: 1923.6x

Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
---
M src/kudu/common/partition_pruner-test.cc
M src/kudu/common/partition_pruner.cc
M src/kudu/common/partition_pruner.h
3 files changed, 236 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/94/19794/6
--
To view, visit http://gerrit.cloudera.org:8080/19794
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie4bea5c10b4ac2c62b85625fe9d2a33ceb4fb2e9
Gerrit-Change-Number: 19794
Gerrit-PatchSet: 6
Gerrit-Owner: Yuqi Du <shenxingwuy...@gmail.com>
Gerrit-Reviewer: Abhishek Chennaka <achenn...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <ale...@apache.org>
Gerrit-Reviewer: Ashwani Raina <ara...@cloudera.com>
Gerrit-Reviewer: KeDeng <kdeng...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang <chinazhangyi...@163.com>
Gerrit-Reviewer: Yingchun Lai <laiyingc...@apache.org>
Gerrit-Reviewer: Yuqi Du <shenxingwuy...@gmail.com>
