Alexey Serbin has posted comments on this change. Change subject: KUDU-1363: Add IN-list predicate type ......................................................................
Patch Set 12: (10 comments) http://gerrit.cloudera.org:8080/#/c/2986/12/src/kudu/client/predicate-test.cc File src/kudu/client/predicate-test.cc: Line 546: } consider adding a test cases for 1. value IN (NULL) 2. value IN (NULL, <some_value>) http://gerrit.cloudera.org:8080/#/c/2986/12/src/kudu/client/scan_predicate.cc File src/kudu/client/scan_predicate.cc: Line 122: vector<const void*> vals_list; vals_list.reserve(vals_.size()) ? http://gerrit.cloudera.org:8080/#/c/2986/12/src/kudu/common/column_predicate.cc File src/kudu/common/column_predicate.cc: PS12, Line 334: values_.erase(std::remove_if(values_.begin(), values_.end(), : [&other] (const void* v){ : return !other.CheckValueInRange(v); : an optimization opportunity if we expect IN() lists to be large. Get the upper and lower bounds of range and then found corresponding boundaries in the values_. If the range does not cover the list, copy the values in the range into a temporary array and swap with values_. Line 352: case PredicateType::IsNotNull: return; What if it was IN (NULL) predicate? PS12, Line 501: } else if (predicate_type_ == PredicateType::Equality) { : return column_.type_info()->Compare(lower_, other.lower_) == 0; : } else if (predicate_type_ == PredicateType::Range) { : return (lower_ == other.lower_ || : (lower_ != nullptr && other.lower_ != nullptr && : column_.type_info()->Compare(lower_, other.lower_) == 0)) && : (upper_ == other.upper_ || : (upper_ != nullptr && other.upper_ != nullptr && : column_.type_info()->Compare(upper_, other.upper_) == 0)); : } else if (predicate_type_ == PredicateType::InList) { : if (values_.size() != other.values_.size()) return false; : for (int i = 0; i < values_.size(); i++) { : if (column_.type_info()->Compare(values_[i], other.values_[i]) != 0) return false; : } : } May be, it's time to start using the switch() here instead? PS12, Line 537: other Is this by-copy capture? Consider replacing with by-reference capture to avoid copying. PS12, Line 532: void ColumnPredicate::MergeInLists(const ColumnPredicate& other) { : // We iterate through the values_ list and remove elements that dont : // exist in the other ColumnPredicate's list : // Each list vector is sorted and contains only unique values. : values_.erase(std::remove_if(values_.begin(), values_.end(), : [this, other](const void *v) { : return !other.CheckValueInList(v); : This looks strange to me: given the name of the method, I would expect it to _merge_ lists, i.e. add the values from other.values_ which are not present in this->values_ Consider renaming this method to something like 'IntersectInLists'. http://gerrit.cloudera.org:8080/#/c/2986/12/src/kudu/common/key_util.cc File src/kudu/common/key_util.cc: PS12, Line 154: if (predicate->predicate_type() == PredicateType::Equality) { : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_lower(), size); : pushed_predicates++; : final_predicate = predicate; : } else if (predicate->predicate_type() == PredicateType::Range) { : if (predicate->raw_upper() != nullptr) { : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_upper(), size); : pushed_predicates++; : final_predicate = predicate; : } : // After the first column with a range constraint we stop pushing : // constraints into the upper bound. Instead, we push minimum values : // to the remaining columns (below), which is the maximally tight : // constraint. : break; : } else if (predicate->predicate_type() == PredicateType::InList) { : // Since the InList predicate is a sorted vector of values, the last : // value would provide an Inclusive Upper Bound that can be pushed. : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_values().back(), size); : pushed_predicates++; : final_predicate = predicate; : } else { : LOG(FATAL) << "unexpected predicate type can not be pushed into key"; : } : } nit: consider converting this into switch() PS12, Line 215: if (predicate->predicate_type() == PredicateType::Equality) { : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_lower(), size); : pushed_predicates++; : } else if (predicate->predicate_type() == PredicateType::Range) { : if (predicate->raw_lower() != nullptr) { : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_lower(), size); : pushed_predicates++; : } else { : break; : } : } else if (predicate->predicate_type() == PredicateType::InList) { : // Since the InList predicate is a sorted vector of values, the first : // value would provide an lower bound that can be pushed. : memcpy(row->mutable_cell_ptr(*col_idx_it), predicate->raw_values().front(), size); : pushed_predicates++; : } else { : LOG(FATAL) << "unexpected predicate type can not be pushed into key"; : } : } nit: consider converting this into switch() http://gerrit.cloudera.org:8080/#/c/2986/12/src/kudu/common/wire_protocol.cc File src/kudu/common/wire_protocol.cc: PS12, Line 395: string pb_value; Why is it necessary here? -- To view, visit http://gerrit.cloudera.org:8080/2986 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I986cb13097f1cf4af2b752f58c4c38f412a6a598 Gerrit-PatchSet: 12 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Sameer Abhyankar <abhyan...@gmail.com> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Dan Burkert <d...@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Sameer Abhyankar <abhyan...@gmail.com> Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-HasComments: Yes