Dan Burkert has posted comments on this change. Change subject: KUDU-1643 Prune hash partitions based on IN-list predicates ......................................................................
Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/5176/7/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: Line 165: hash_values->insert(hash); add an early return here so that the logic below can be outside the else block. Line 181: const KeyEncoder<string>& encoder = GetKeyEncoder<string>(column.type_info()); I think this can be hoisted outside of the for loop. Line 194: return; This means the has component is unconstrained, right? So does a full set and an empty set mean unconstrained? Line 295: vector<vector<uint32_t>> hash_buckets; It could be more efficient to make this a vector<vector<bool>>, with the inner vector being a bitset. That would eliminate the need for copying between an unordered_set and the vector. the vector<bool> would need to be created with as many false values as hash buckets Line 300: SearchHashOfColumnCombination(partition_schema, Given how much more expensive the hash partition search might be in terms of # of serialized key components, I think it would be prudent to check up front that all of the predicates are equality or in list before proceeding. Line 309: hash_buckets.push_back(hash_bucket); this can be more efficient as hash_buckets.emplace_back(std::move(hash_bucket)); Line 344: if (!hash_buckets[hash_idx].empty()) { given a `vector<bool>` representation, this if/else clause could be combined if the unconstrained case is represented as a full set. -- To view, visit http://gerrit.cloudera.org:8080/5176 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I578ac3c5e6770b13b0bebe01127d7d8263aceb36 Gerrit-PatchSet: 7 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Haijie Hong <honghai...@gmail.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-HasComments: Yes