Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17879 )
Change subject: WIP KUDU-2671: Follow up pruning patch. ...................................................................... Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/17879/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/17879/5//COMMIT_MSG@9 PS5, Line 9: This patch modifies how partition key ranges are removed in : PartitionPruner::RemovePartitionKeyRange(). : : Currently, full scans using KuduScanner with no p > Could you add the rationale explaining why it makes sense doing so? Were t After some more examination, the entire set does not need to be reverse sorted, only each individual set of partition key ranges. http://gerrit.cloudera.org:8080/#/c/17879/5//COMMIT_MSG@16 PS5, Line 16: the range. Further : testing is required to s > Could you clarify in the commit message what you meant by 'incorrect'? Doe I can expand on it but as the next sentence explains, all tablets with the same hash component are returned first, regardless of range. Ultimately, the order in which tablets are scanned doesn't have an impact on the final result as long as the right tablets are being scanned. That being said, as we've discussed offline, I'll try to find a test case where the order proves to be problematic. http://gerrit.cloudera.org:8080/#/c/17879/5//COMMIT_MSG@25 PS5, Line 25: > IIRC, INT8_MIN has always meant 'no boundary', no? If so, then simply rena This variable isn't actually meant to be set, it just allows one to call CheckScanWithColumnPredicate() while only setting one of the greater/less predicates. I modified the logic but the downside with this approach is one cannot test a scan while setting the predicate to the filler variable. It still doesn't look clean to me so I would appreciate it if you had any tips. http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scan_token-test.cc File src/kudu/client/scan_token-test.cc: http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scan_token-test.cc@143 PS5, Line 143: for (auto* t : tokens) { > It seems 'token_count' isn't used anywhere but in the debug output which is Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scan_token-test.cc@151 PS5, Line 151: count += batch.NumRows(); > nit: remove this Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scan_token-test.cc@156 PS5, Line 156: } > ditto Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scan_token-test.cc@225 PS5, Line 225: vector<scoped_refptr<TabletInfo>> all_tablets_info; : { > Maybe, simply return the number of rows via an output parameter? The funct Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scanner-internal.cc File src/kudu/client/scanner-internal.cc: http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/client/scanner-internal.cc@518 PS5, Line 518: // Check if the meta cache returned a tablet covering a partition key range past > nit: remove this Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner-test.cc File src/kudu/common/partition_pruner-test.cc: http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner-test.cc@1174 PS5, Line 1174: // B = 1 > nit: not sure these extra lines add extra readability :) Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner.h File src/kudu/common/partition_pruner.h: http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner.h@109 PS5, Line 109: : // Evaluated to true when the pruner is > Even if the names of these member fields are self-explanatory, it would be Done http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner.cc File src/kudu/common/partition_pruner.cc: http://gerrit.cloudera.org:8080/#/c/17879/5/src/kudu/common/partition_pruner.cc@429 PS5, Line 429: scan_contains_predicates_ > Is it possible to rely on scan_spec.predicates() emptiness instead? If not Good catch. Changed it to rely on scan_spec.predicates() instead. -- To view, visit http://gerrit.cloudera.org:8080/17879 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3a1bf5344c0ef856072d3ed102714dce5ba21060 Gerrit-Change-Number: 17879 Gerrit-PatchSet: 6 Gerrit-Owner: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Abhishek Chennaka <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mahesh Reddy <[email protected]> Gerrit-Reviewer: Tidy Bot (241) Gerrit-Comment-Date: Thu, 02 Jun 2022 07:26:26 +0000 Gerrit-HasComments: Yes
