Dan Burkert has posted comments on this change.

Change subject: KUDU-1643 Prune hash partitions based on IN-list predicates
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5176/7/src/kudu/common/partition_pruner.cc
File src/kudu/common/partition_pruner.cc:

Line 165:     hash_values->insert(hash);
add an early return here so that the logic below can be outside the else block.


Line 181:       const KeyEncoder<string>& encoder = 
GetKeyEncoder<string>(column.type_info());
I think this can be hoisted outside of the for loop.


Line 194:         return;
This means the has component is unconstrained, right?  So does a full set and 
an empty set mean unconstrained?


Line 295:   vector<vector<uint32_t>> hash_buckets;
It could be more efficient to make this a vector<vector<bool>>, with the inner 
vector being a bitset.  That would eliminate the need for copying between an 
unordered_set and the vector.  the vector<bool> would need to be created with 
as many false values as hash buckets


Line 300:     SearchHashOfColumnCombination(partition_schema,
Given how much more expensive the hash partition search might be in terms of # 
of serialized key components, I think it would be prudent to check up front 
that all of the predicates are equality or in list before proceeding.


Line 309:     hash_buckets.push_back(hash_bucket);
this can be more efficient as

    hash_buckets.emplace_back(std::move(hash_bucket));


Line 344:     if (!hash_buckets[hash_idx].empty()) {
given a `vector<bool>` representation, this if/else clause could be combined if 
the unconstrained case is represented as a full set.


-- 
To view, visit http://gerrit.cloudera.org:8080/5176
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I578ac3c5e6770b13b0bebe01127d7d8263aceb36
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Haijie Hong <honghai...@gmail.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

Reply via email to