Todd Lipcon has posted comments on this change. Change subject: KUDU-1165: Implement partition pruning for C++ client and server ......................................................................
Patch Set 10: (19 comments) http://gerrit.cloudera.org:8080/#/c/2413/10/src/kudu/client/client.cc File src/kudu/client/client.cc: Line 1063: if (!data_->partition_pruner_.HasMorePartitions()) { can you just make this a || for the above if? http://gerrit.cloudera.org:8080/#/c/2413/10/src/kudu/common/CMakeLists.txt File src/kudu/common/CMakeLists.txt: Line 89: #ADD_KUDU_TEST(partition_pruner-test) hrm? http://gerrit.cloudera.org:8080/#/c/2413/10/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: Line 54: size_t expected_pruned_tablets) { maybe it's just me, but as I was reviewing this file, I found I was able to intuitively think "how many tablets should this query have to access?" but the assertions were all "how many should it _not_ have to access?". I kept having to subtract to validate that the assertion was what I expected. For example, if a query fully specifies the hash key, I can intuitively think "oh, this only needs one tablet." but it's hard to intuitively know "there should be 29 that got pruned" without looking up at the partition schema. Would it be really painful to ask you to make this method be CheckRemainingPartitions or something, and change the assertions to be looking at how many partitions remain? Line 343: Check(vector<ColumnPredicate> {}, 0); curious (still learning c++11): why do you need to list the type here for this initializer list? Doesn't it infer it from the parameter? http://gerrit.cloudera.org:8080/#/c/2413/10/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: Line 72: if (scan_spec.lower_bound_key() != nullptr || scan_spec.exclusive_upper_bound_key() != nullptr) { invert this condition to return early Line 130: Arena arena(max<size_t>(16, schema.key_byte_size()), 4096); maybe add Arena::kMinimumChunkSize so this isn't a magic number 16? Line 147: const ScanSpec& scan_spec) { is there a way we can CHECK(scan_spec.optimized()) or something to enforce the precondition that you mention in the comment for this method? Line 174: // Construct the list of hash buckets, or none if the hash bucket is not this could do with a big header above it something like: //---------------------- // Handle pruning partitions based on any hash distribution rule which is fully specified by an equality predicate or something (to separate it from the above range-pruning stuff) Line 179: int constrained_index = -1; hrm, not entirely following the purpose of this. This is just the index of the last element of hash_buckets which is not boost::none? can you compute it in a second loop down below instead of merging it into this one to make the code a little easier to follow? Line 186: for (int i = 0; i < hash_bucket_schema.column_ids.size(); i++) { can you rename this or the other (or both) loop index to avoid shadowing 'i'? Line 208: if (!range_lower_bound.empty() && !range_upper_bound.empty()) { I don't entirely understand what this is doing either, should add a comment Line 258: for (auto range = ranges.rbegin(); range != ranges.rend(); range++) { can you refactor this into another method like RemovePartitionKeyRange below? Line 260: scan_spec.exclusive_upper_bound_partition_key() < get<1>(*range)) { can you invert this and break early to avoid the extra nesting? Line 291: void PartitionPruner::RemovePartitionKeyRange(const string& bound) { clarify bound as upper or lower bound Line 301: if (bound > get<0>(*range)) { same http://gerrit.cloudera.org:8080/#/c/2413/10/src/kudu/common/partition_pruner.h File src/kudu/common/partition_pruner.h: Line 33: class Schema; given the above forward decls, do we need the #includes above? Line 35: class PartitionPruner { could do with a little class-doc Line 51: void RemovePartitionKeyRange(const std::string& upper_bound); I think this method is named somewhat unintuitively Line 64: }; DISALLOW_COPY_AND_ASSIGN -- To view, visit http://gerrit.cloudera.org:8080/2413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9f4fc14b44af63a17b34cd9c2a9134127b1afe4d Gerrit-PatchSet: 10 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
