Todd Lipcon has posted comments on this change. Change subject: Add ColumnPredicate class ......................................................................
Patch Set 2: (25 comments) http://gerrit.cloudera.org:8080/#/c/2137/2/src/kudu/common/column_predicate-test.cc File src/kudu/common/column_predicate-test.cc: Line 30: void test_merge(const ColumnSchema& column, - style: TestMerge - please add a short doc - why do you have to pass 'column' here when it's already passed to the predicate ctors? Line 48: ASSERT_EQ(expected.predicate_type(), type); since the user passes in expected, why do you also need to pass in 'type'? can't you just use expected.predicate_type() in the assertion down below? Line 58: // [--------) can you add 'AND' at the end of this line? Line 60: // and a '=" or "transforms to:" here? (same in the examples below) Line 225: // Test that the range constructor handles equality and disjoint ranges. I think you mean 'degenerate' ranges, not disjoint? http://gerrit.cloudera.org:8080/#/c/2137/2/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: Line 95: predicate_type_ = PredicateType::None; nit: indent Line 112: default: LOG(FATAL) << "unknown predicate type"; is it better to leave this out and put the LOG(FATAL) after the switch? that way I think you get a compilation error instead? Line 117: CHECK(predicate_type_ == PredicateType::Range); CHECK_EQ (or does that fail because of inability to stringify an enum class?) Line 125: remove blank line (or maybe put the blank line above the 'case' line if you want some visual separation) Line 128: && (lower_ == nullptr || column_.type_info()->Compare(lower_, other.lower_) < 0)) { nit: usually put && on previous line Line 138: // Check if the resulting range is disjoint or an equality. I think 'empty' is better than 'disjoint' (disjoint to me implies two separate intervals, rather than an interval with negative width). Or say check whether the original ranges were disjoint. Can you refactor this code that's shared with the Range() factory function into a new function like 'Simplify()'? Line 154: || column_.type_info()->Compare(upper_, other.lower_) <= 0) { nit: || on prev line Line 274: return strings::Substitute("(`$0` BETWEEN $1 AND $2)", this isn't quite right since BETWEEN is an inclusive range Line 288: CHECK(column_.Equals(other.column_, false)); seems fishy to CHECK here instead of just returning false http://gerrit.cloudera.org:8080/#/c/2137/2/src/kudu/common/column_predicate.h File src/kudu/common/column_predicate.h: Line 50: // so it's lifetime must be managed to make sure it does not reference invalid nit: its Line 59: static ColumnPredicate Equality(ColumnSchema column, const void* value); sort of odd that this class is partially managing memory -- it copies the 'column', but doesn't make defensive copies of the values. Seems like, if it's perf-sensitive, we should probably try to copy nothing, and if it's not, we should copy everything? Line 62: // exclusive upper bound. clarify if you can pass NULLs to represent open-ended intervals I think it's also worth noting that this method actually performs 'simplification' of degenerate ranges (both simplify-to-equality, and simplify-to-none) Line 71: // arena must outlive the returned predicate. should clarify what the arena is used for Line 73: // If a normalized column predicate can not be created, then none will be nit: 'cannot' Line 96: // Data is not copied from the other predicate, so it's data must continue "its" Line 99: // If the predicates are disjoint, then an IllegalArgument status is returned. not true anymore Line 115: void Evaluate(const ColumnBlock& block, SelectionVector *sel) const; style nit: * goes with the type Line 165: // The exclusive upper bound value if this is a Range predicate. what about open-ended intervals? Line 167: }; should you DISALLOW_COPY_AND_ASSIGN? if not, then why do you need SetToNone? (why not just assign a new None?) Line 169: // Compares predicates according to selectivity. should note that in this context, a higher 'selectivity' means an estimation that the predicate will _match_ a higher number of rows (not _remove_ a higher number). this terminology always trips people up. -- To view, visit http://gerrit.cloudera.org:8080/2137 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I72cac35d7a168f96c2ee52afd284489e99755e3f Gerrit-PatchSet: 2 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Dan Burkert <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Dan Burkert <[email protected]> Gerrit-Reviewer: David Ribeiro Alves <[email protected]> Gerrit-Reviewer: Jean-Daniel Cryans Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
