Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3977: TransmitData() should not block
......................................................................


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-mgr.cc
File be/src/runtime/data-stream-mgr.cc:

PS1, Line 53: STREAM_EXPIRATION_TIME_MS
> Do you think we still need the stream cache? In the case where a receiver h
I've thought about this for a while, and I agree the cache isn't strictly 
necessary. However, there are a couple of cases I can think of where it may 
come in handy:

1. The obvious one where cancellation takes a while to reach the sender. We can 
avoid extra RPCs here.
2. When the receiver fragment instance has hit a limit and deregistered its 
receiver, but the sender fragment keeps attempting to send row batches. The 
DataStreamSender has no notion of a limit; it just sends everything it gets.

My suggestion is that we leave it as is for now; I can try to get some numbers 
on how useful it is once this patch is in, and make a decision accordingly.
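
To make the second case concrete, here is a rough sketch of the kind of 
closed-stream cache lookup I have in mind. Every name and field below is a 
hypothetical stand-in for illustration, not the actual DataStreamMgr code:

// Hypothetical sketch: a cache of recently closed streams, so a sender whose
// receiver has already been torn down finds out on the first attempt instead
// of retrying until the cancellation (or the limit) propagates back to it.
#include <chrono>
#include <cstdint>
#include <map>
#include <mutex>
#include <tuple>

// Made-up key type; the real code keys streams differently.
struct StreamKey {
  int64_t fragment_instance_id;
  int dest_node_id;
};

bool operator<(const StreamKey& a, const StreamKey& b) {
  return std::tie(a.fragment_instance_id, a.dest_node_id) <
         std::tie(b.fragment_instance_id, b.dest_node_id);
}

class ClosedStreamCache {
 public:
  // Called when a receiver deregisters (e.g. it hit its limit).
  void MarkClosed(const StreamKey& key) {
    std::lock_guard<std::mutex> l(lock_);
    closed_[key] = std::chrono::steady_clock::now();
  }

  // Called when a TransmitData() arrives: if the stream was closed recently,
  // reply immediately so the sender drops its batches without more RPCs.
  bool WasRecentlyClosed(const StreamKey& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto it = closed_.find(key);
    if (it == closed_.end()) return false;
    return std::chrono::steady_clock::now() - it->second <
           std::chrono::milliseconds(expiration_ms_);
  }

 private:
  std::mutex lock_;
  std::map<StreamKey, std::chrono::steady_clock::time_point> closed_;
  int64_t expiration_ms_ = 300000;  // plays the role of STREAM_EXPIRATION_TIME_MS
};

The point is just that a sender whose receiver has already gone away can find 
out from a single lookup instead of retrying until cancellation reaches it.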


PS1, Line 99: if (already_unregistered != nul
> Can we just require now that the caller takes the lock?
Done


http://gerrit.cloudera.org:8080/#/c/5491/1/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

PS1, Line 209: params.protocol_version 
> Let's move this inside the loop so it's freshly initialized on each iterati
TTransmitDataResult has only one field, i.e. Status.

Also, 'res' is needed outside the loop (at L224), which is why I initialized it 
outside.


PS1, Line 217: conds(FLAGS_datas
> Thinking about this a bit more - that might still be worthwhile for the ini
Just to clarify, do you mean that we need to add a timed wait in AddBatch() 
somewhere around here, and return an EAGAIN if we've waited too long?
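
If so, the shape I'm imagining is roughly the following. This is a deliberately 
simplified, hypothetical sketch (the class, fields and result type are 
stand-ins, not the actual AddBatch() signature):

// Hypothetical, simplified sketch of a timed wait in the receiver's
// AddBatch() path: wait a bounded amount of time for queue space, and if
// none frees up, hand back an EAGAIN-style result so the sender retries
// instead of blocking the RPC thread.
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <mutex>

enum class AddBatchResult { ADDED, AGAIN_LATER };

class ReceiverQueueSketch {
 public:
  AddBatchResult AddBatch(int batch, int64_t wait_ms) {
    std::unique_lock<std::mutex> l(lock_);
    // Give the consumer a bounded window to drain the queue.
    // (A consumer would notify_one() on space_available_ after popping.)
    bool has_space = space_available_.wait_for(
        l, std::chrono::milliseconds(wait_ms),
        [this] { return queue_.size() < capacity_; });
    if (!has_space) return AddBatchResult::AGAIN_LATER;  // maps to EAGAIN
    queue_.push_back(batch);
    return AddBatchResult::ADDED;
  }

 private:
  std::mutex lock_;
  std::condition_variable space_available_;
  std::deque<int> queue_;  // stands in for the row batch queue
  const size_t capacity_ = 10;
};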


PS1, Line 217: conds(FLAGS_datas
> Since a row batch can potentially be many megabytes large, I wonder if it m
I like the idea, but isn't that optimizing for a worst-case scenario, given 
that latency is usually the bottleneck rather than throughput? Also, the 
payload is already thrift-serialized by this point, so we wouldn't be spending 
any extra time on serialization.

My worry is that, after we move to KuduRPC, having a probe RPC for every 
TransmitData() call might add a permanent latency cost to the overall query 
time for a lot of queries, since each RPC is queued on both the sender and the 
receiver side.
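
For reference, the pattern I understand you're suggesting is roughly the 
sketch below. Every name here is a made-up stand-in, not an existing API; it's 
only meant to show where the extra round trip comes from:

// Hypothetical, self-contained sketch of the probe-then-send pattern being
// discussed.
#include <string>

// Pretend RPC layer.
static bool ProbeReceiverReady() { return true; }                        // tiny, cheap RPC
static bool TransmitSerializedBatch(const std::string&) { return true; } // large payload

bool SendBatchWithProbe(const std::string& serialized_batch) {
  // Round trip #1: the probe. Paid on every call, even when the receiver
  // has plenty of room, and queued on both sides once we're on KuduRPC.
  if (!ProbeReceiverReady()) return false;  // caller backs off and retries
  // Round trip #2: only now ship the (already thrift-serialized) row batch.
  return TransmitSerializedBatch(serialized_batch);
}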


Line 224:           address_, &rpc_status_);
> Is cancellation detected during this loop? That's part of the point of movi
It wasn't before; I've now added a check so we return if the query is cancelled.
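
Roughly speaking, the check looks like the sketch below. This is a simplified, 
hypothetical rendering (stub names, pretend RPC), not the actual patch code:

// Simplified, hypothetical sketch of the cancellation check inside the
// sender's retry loop.
#include <atomic>
#include <chrono>
#include <thread>

struct QueryStateStub { std::atomic<bool> is_cancelled{false}; };

// Pretend RPC attempt: returns true once the receiver accepts the batch.
static bool TryTransmitOnce() { return true; }

// Returns false if we gave up because the query was cancelled.
bool RetryUntilAcceptedOrCancelled(QueryStateStub* state) {
  while (true) {
    if (state->is_cancelled.load()) return false;  // stop retrying on cancel
    if (TryTransmitOnce()) return true;
    // Fixed interval between attempts (see the back-off discussion below).
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
  }
}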


PS1, Line 225:   if (!rpc_status_.ok()) return;
> Not very informative without the fragment ID / destination address etc.
Done


PS1, Line 226: 
> rather than have the sleep period be a function of the total timeout, it ma
Done


PS1, Line 226: 
> Does it make sense to add some exponential back-off here, let's say up to a
That sounds like a good idea, but I would rather wait for some numbers showing 
that a fixed interval between RPCs actually has a negative effect (if it does 
at all). Otherwise it feels a bit like over-optimizing for a case we don't yet 
know hurts.
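
For comparison, this is the difference between the two retry policies as I 
understand it; the values and names below are purely illustrative:

// Purely illustrative comparison of the two retry policies. The patch sleeps
// for a fixed interval between attempts; the suggestion is to back off
// exponentially up to a cap.
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <thread>

void SleepFixedInterval(int /*attempt*/) {
  // Every retry waits the same amount, regardless of how long we've waited.
  std::this_thread::sleep_for(std::chrono::milliseconds(100));
}

void SleepWithBackoff(int attempt) {
  // 100ms, 200ms, 400ms, ... capped at 2s; the specific values are made up.
  int64_t wait_ms = std::min<int64_t>(100LL << std::min(attempt, 5), 2000);
  std::this_thread::sleep_for(std::chrono::milliseconds(wait_ms));
}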


Line 228:       rpc_status_ = DoTransmitDataRpc(&client, params, &res);
> I think it's a bit confusing to have this method return EAGAIN upon failure
Done


http://gerrit.cloudera.org:8080/#/c/5491/1/common/thrift/generate_error_codes.py
File common/thrift/generate_error_codes.py:

Line 233:   ("DATASTREAM_SENDER_TIMEOUT", 72, "Sender timed out waiting for 
receiver fragment "
> Should we mark this as deprecated here?
In the next patch set I went back to using this, so we still need it.


PS1, Line 314: DATASTREAM_RECEIVER_EAGAIN
> Would it make sense to name this more general, e.g. "RPC_TRY_AGAIN" so we c
This is set only by the DataStreamMgr, which is technically a layer above the 
RPC stack, and that will be even more true after we move to KuduRPC. So I think 
it's best not to have generic RPC error codes in Impala, since KuduRPC will 
have its own and we will need to stick to those instead.


Line 315: )
> nit: single line?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5491
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I47f9d6c1de95a72ab73471dcd2a3118ef13e7db0
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-HasComments: Yes
