Todd Lipcon has posted comments on this change.

Change subject: Integrate ColumnPredicate into client and server
......................................................................


Patch Set 3:

(16 comments)

still have quite a bit more of this to look at, but figured i'd post these 
thoughts before heading home

http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/client/client.cc
File src/kudu/client/client.cc:

Line 1050:   // Find the first tablet.
comment doesn't seem quite accurate


Line 1054:     VLOG(1) << "Short circuting scan " << ToString();
typo


Line 1114:   if (data_->short_circuit_) {
why is this necessary? shouldn't next_req_.scanner_id() be empty in the case 
that we didn't need to scan at all? maybe better to just move the 
CHECK(data_->proxy_) down into the 'if' below


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/client/scan_predicate-internal.h
File src/kudu/client/scan_predicate-internal.h:

Line 79:   ComparisonPredicateData* Clone() const override {
it seems like there is a potential bug lurking here: AddToScanSpec std::moves 
col_ into the ScanSpec, but Clone() tries to copy col_. Is there a restriction 
that Clone() cannot be called after AddToScanSpec? Can we document that and add 
a DCHECK?


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/client/scan_predicate.cc
File src/kudu/client/scan_predicate.cc:

Line 73:       auto pred = ColumnPredicate::InclusiveRange(move(col_), nullptr, 
val_void, arena);
can you expand out this auto type? wasn't obvious from context to me what it 
is. (boost::optional, right? the *pred made me think it was some kind of 
pointer)


Line 74:       if (pred) {
I think it's worth a note explaining that InclusiveRange will return none if 
the range actually spans the whole value space


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/client/scanner-internal.cc
File src/kudu/client/scanner-internal.cc:

Line 64:     pool_(),
no need for these empty initializers


Line 100: void ColumnPredicateIntoPB(const ColumnPredicate& predicate,
style: add space above


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/common/generic_iterators.h
File src/kudu/common/generic_iterators.h:

Line 170:   std::vector<std::tuple<int32_t, ColumnPredicate>> predicates_;
maybe colidx_to_predicate_ or something?


Line 173:   std::vector<size_t> non_predicates_;
I think better to rename this to non_predicate_col_indexes or somesuch -- we've 
been bitten too many times in the past by mixing up index + id


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/common/scan_spec.cc
File src/kudu/common/scan_spec.cc:

Line 57: bool ScanSpec::CanShortCircuit() {
hrm, possible thought: would it make more sense to have the 'Optimize()' step 
evaluate whether it can be short-circuited, and if so, remove all the 
predicates and set a 'short_circuited_' flag or somesuch? maybe not.


Line 106:       preds.push_back(predicate.second.ToString());
nit: indent


Line 116:   // Don't bother if we can already short circuit the scan. This also 
let's us
typo: lets


http://gerrit.cloudera.org:8080/#/c/2138/3/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

Line 213:     optional bytes upper = 2;
let's name this exclusive_upper or excl_upper, or even 'xupper' if you want to 
use that abbreviation consistently. Otherwise I think it's too easy to get 
confused (given we still have to handle the deprecated [inclusive] code path 
for a while, etc)


Line 243:   // DEPRECATED: use column_predicates field.
in other places, we've renamed the field to DEPRECATED_foo -- that way we make 
sure that we've caught all the cases that were using it, and it's clear in the 
code that it's a deprecation compat path.


Line 249:   repeated ColumnPredicatePB column_predicates = 13;
This protocol change will allow old clients to continue to talk to new servers, 
but if new clients send a predicate to an old server, it'll be silently 
ignored. That seems a little problematic.

I think it's OK in beta to not support this compatibility yet, but we should 
have some safeguard so it fails fast instead of silently ignoring all the 
predicates. For example, we could add a bitfield to the response like 
'supported_features', and then the client can check that the response has a bit 
set like COLUMN_PREDICATE_PB?

We could also do this more generically in the RPC system... worth thinking 
about a little. I pinged the #grpc IRC channel to see what they normally 
consider best practices for changes like this.


-- 
To view, visit http://gerrit.cloudera.org:8080/2138
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ife6852680b7f59fddee688e5702c1a70944f7622
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Jean-Daniel Cryans
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to