Todd Lipcon has posted comments on this change.

Change subject: KUDU-2065, KUDU-2011: Release sidecars on cancellation or 
timeout
......................................................................


Patch Set 1:

(28 comments)

http://gerrit.cloudera.org:8080/#/c/7599/1//COMMIT_MSG
Commit Message:

PS1, Line 9: call times out
worth specifying "times out after transmission of the request has begun but not 
finished" or something


Line 17: the caller's point of view for resource management.
mind linking to the Impala JIRA where it was discovered this lifecycle was 
causing trouble?


PS1, Line 20: in-flight
not in-flight but mid-transmission. We still don't support a preemptive "abort" 
command for an RPC which has already been fully transmitted, right?


PS1, Line 22: flight
nit: again I think "mid-transmission" is clearer since we usually call an RPC 
"in-flight" while it has been fully sent and awaiting a response


http://gerrit.cloudera.org:8080/#/c/7599/1/docs/design-docs/rpc.md
File docs/design-docs/rpc.md:

PS1, Line 274: | RPC Footer protobuf length (variable encoding) | --- Exists 
only in client requests.
             : +------------------------------------------------+
             : | RPC Footer protobuf                            | --- Exists 
only in client requests.
             : +
This being a variable-length protobuf is a little bit risky, in the sense that 
you are modifying the value here after you've started transmitting the message 
(i.e. after computing 'total message length'). So, you're requiring that any 
value modification you do doesn't change the serialized length of the message?

It seems like a safe bet that booleans are always going to be one byte, but we 
should make sure we have a clear note somewhere about this -- perhaps in the 
protobuf definition something that says "any fields which are computed during 
the process of sending the body of the message must be fixed length protobuf 
types"?


PS1, Line 366: cancelled before it completes
should clarify "mid-tranmission" here


Line 387: ```
can you add some note here that says this capture does _not_ include the footer?


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/connection.cc
File src/kudu/rpc/connection.cc:

Line 240:   DCHECK(car->call);
can you DCHECK that it's the reactor thread here just for documentation value?


Line 574:     VLOG(1) << "inbound transfer aborted " << s.ToString();
I think this should probably be VLOG(2) and need a lot more context to be 
useful (or just remove)


PS1, Line 649:         CallAwaitingResponse* car = 
FindOrDie(awaiting_response_, transfer->call_id());
             :         if (!car->call) {
             :           // If the call has already timed out or has already 
been cancelled, the 'call'
             :           // field would be set to NULL. In that case, don't 
bother sending it.
             :           DCHECK(transfer->TransferAborted());
             :           send_buffer = false;
             :         } else {
             :           DCHECK(!transfer->TransferAborted());
             :           // If this is the start of the transfer, then check if 
the server has the
             :           // required RPC flags. We have to wait until just 
before the transfer in
             :           // order to ensure that the negotiation has taken 
place, so that the flags
             :           // are available.
             :           const set<RpcFeatureFlag>& required_features =
             :               car->call->required_rpc_features();
             :           if (!includes(remote_features_.begin(), 
remote_features_.end(),
             :                         required_features.begin(), 
required_features.end())) {
             :             Status s =
             :                 Status::NotSupported("server does not support 
the required RPC features");
             :             transfer->Abort(s);
             :             car->call->SetFailed(s, negotiation_complete_ ?
             :                                  Phase::REMOTE_CALL : 
Phase::CONNECTION_NEGOTIATION);
             :             // Test cancellation when 'call_' is in 
'FINISHED_ERROR' state.
             :             MaybeInjectCancellation(car->call);
             :             car->call.reset();
             :             send_buffer = false;
             :           } else {
             :             car->call->SetSending();
             :             // Test cancellation when 'call_' is in 'SENDING' 
state.
             :             MaybeInjectCancellation(car->call);
             :           }
do you think we could refactor out these ~30 lines into a new function? This 
flow inside this while loop is getting a bit hard to follow with the 'bool 
send_buffer' thing, and I think I could parse it a bit better if this were all 
something like 'PrepareToSendCall(car)' returning an enum or somesuch


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/connection.h
File src/kudu/rpc/connection.h:

Line 198:   bool RemoteHasFooter() const {
I think a better name is something like 'RemoteSupportsFooters' or something


Line 199:     return remote_features_.find(HAS_FOOTER) != 
remote_features_.end();
use ContainsKey from map-util.h


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/inbound_call.h
File src/kudu/rpc/inbound_call.h:

PS1, Line 228:  This contains a flag which
             :   // indicates if the call was aborted and should be ignored.
no need to repeat this here since it's in the PB definition (or should be)


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/outbound_call.cc
File src/kudu/rpc/outbound_call.cc:

Line 90:     required_rpc_features_.insert(RpcFeatureFlag::HAS_FOOTER);
this doesn't seem right... we should be negotiating this early during the 
connection establishment


PS1, Line 143:   footer_.set_aborted(true);
             :   serialization::SerializeFooter(footer_, &footer_buf_);
can you grab the length before and after this and CHECK that they are the same? 
otherwise we could have a very hard to understand bug in the future if someone 
decides to futz with the footer PB fields


Line 438:   return state_ == ON_OUTBOUND_QUEUE || state_ == SENDING;
I'm surprised to see ON_OUTBOUND_QUEUE here actually. maybe a better name would 
help


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/outbound_call.h
File src/kudu/rpc/outbound_call.h:

PS1, Line 166:   // The footer is also updated to indicate the call has been 
aborted.
             :   // Returns the number of slices in the re-serialized call.
I find it surprising that this method has this side effect, based on the name. 
Is it straight-forward to put this elsewhere?


PS1, Line 206: IsInFlight
how about 'IsBeingSent' or 'TransmissionInProgress' or somesuch?


PS1, Line 316:  It contains a flag which indicates
             :   // whether the RPC was cancelled when being sent. In which 
case, the server side
             :   // should just ignore this RPC request.
don't think this is necessary here


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/rpc-test.cc
File src/kudu/rpc/rpc-test.cc:

Line 994:     
controller.set_timeout(MonoDelta::FromMicroseconds(rand.Uniform64(i * 30 + 1)));
> warning: either cast from 'int' to 'uint64_t' (aka 'unsigned long') is inef
maybe 30L here would help


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/rpc_header.proto
File src/kudu/rpc/rpc_header.proto:

PS1, Line 96:   // The footer contains a flag which indicates whether the 
incoming request has been
            :   // aborted and should be ignored.
I think these two lines are redundant with just looking at the RequestFooter 
struct below. Maybe just replace with "See RequestFooter below.'


PS1, Line 98: HAS_FOOTER
let's be more specific with this name like 'REQUEST_FOOTERS' in case we decide 
to add RESPONSE_FOOTERS later


Line 273: message RequestFooter {
I know we aren't very consistent here in this file, but can you rename this to 
RequestFooterPB? We've been following that convention more reliably in recent 
years


PS1, Line 274: required
- why make this required? what if some day we want to deprecate it?
- can you add a comment here about how this is used?


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/serialization.cc
File src/kudu/rpc/serialization.cc:

PS1, Line 113: CHECK_EQ
seems safe enough to be a DCHECK here since this is really just an invariant of 
the code


http://gerrit.cloudera.org:8080/#/c/7599/1/src/kudu/rpc/transfer.cc
File src/kudu/rpc/transfer.cc:

PS1, Line 68: dummy_buffer
style: rename to g_dummy_buffer since it's a global.

I think it's also worth a note explaining why it's not const (to avoid wasting 
space in the linked executable text section, right?)

What do you think about using a function with __attribute__((constructor)) to 
actually initialize this memory to a repeating pattern like 
'ABORTED-DURING-TRANSFER-ABORTED-DURING-TRANSFER-...'? That way if we have some 
bug (or we're looking at a wireshark trace) we will be less confused vs 
000000s. We have a handy 'OverwriteWithPattern' function in 
util/memory/memory.h for this.


Line 194:   if (!TransferFinished()) {
style: we usually prefer "early return" vs nesting the whole function. ie if 
(TransferFinished()) return;


Line 213:   for (n_iovecs = 0; n_iovecs < IO_VEC_SIZE && idx < 
n_payload_slices_; ++n_iovecs) {
this loop is certainly tricky... do you have confidence that it's all covered 
by tests? Does setting dummy_data to be smaller increase the coverage in tests?


-- 
To view, visit http://gerrit.cloudera.org:8080/7599
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I5c8e294090279649082eebc4f6cfb6fe858bbbfc
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-HasComments: Yes

Reply via email to