Todd Lipcon has posted comments on this change.

Change subject: rpc: use earliest-deadline-first RPC scheduling and rejection
......................................................................


Patch Set 3:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/2641/3//COMMIT_MSG
Commit Message:

Line 38: scheduling for the queue. Because the scan RPC retries retain the 
deadline
> What of other operations? Do we faithfully retain the 'operation' deadline 
> What of other operations? Do we faithfully retain the 'operation' deadline 
> across all RPCs? I think we put some effort into doing so, I just don't know 
> how successful we were. I'm asking because I wonder how RPCs that keep 
> resetting their deadline would be affected by this policy change.

I think we did a decent job here. But even if they do reset the deadline each 
time, I don't think it's so bad. Keep in mind that this basically only causes 
potential trouble in overload scenarios -- and in overload scenarios, we 
already had the bad behavior described here where a request that got rejected 
once was likely to just keep getting rejected over and over.

> And that means a smaller number of discrete time points with which to 
> determine fairness. I don't know if that's problematic or not, but it's worth 
> thinking about and perhaps discussing.

Right, but keep in mind that in the case that the timeouts are all the same (eg 
all 5 seconds) then this basically degenerates to "earliest arrival first", 
which is back to what we already had prior to this patch.


http://gerrit.cloudera.org:8080/#/c/2641/3/src/kudu/rpc/rpc_stub-test.cc
File src/kudu/rpc/rpc_stub-test.cc:

Line 430:   ASSERT_GT(min, avg / 2)
> Have you tried running this test in a 'slow' environment, such as with ASAN
Only the precommit runs, which haven't failed


http://gerrit.cloudera.org:8080/#/c/2641/3/src/kudu/rpc/service_pool.cc
File src/kudu/rpc/service_pool.cc:

Line 116:   LOG(WARNING) << err_msg;
> Don't think this would be excessive in production?
Well, it was already logging this, I just ended up moving the logging here (vs 
before it would return a bad Status and then log elsewhere)


Line 119:   DLOG(INFO) << err_msg << " Contents of service queue:\n"
> Not necessarily in scope, but I've found this raw dumping to be really exce
agreed it's too noisy in debug builds, but didn't want to lump it in.


http://gerrit.cloudera.org:8080/#/c/2641/3/src/kudu/rpc/service_queue.h
File src/kudu/rpc/service_queue.h:

> Lots of similarities and differences with BlockingQueue. I think there are 
Yea, I originally started that, and then by the time I had to change from 
std::queue to std::multiset it started getting "weird" enough that I thought it 
was clearer to just duplicate the code. Trying to follow the general guideline 
around not over-using templates.


Line 25: #include <vector>
> Unused. Can you double check the others?
Done


Line 37: // Blocking queue used for passing inbound RPC calls to the service 
handler
> How painful would it be for ServiceQueue to remain a templated class so as 
I think I'd rather defer that -- I'm afraid that this will eventually start to 
grow other features (eg prioritization based on users or service classes) and 
we'll probably have a better idea of how to templatize at that point. Plus I 
think that the current test coverage is pretty good already without a more 
targeted test.


Line 38: // pool.
> Nit: re-wrap so this word isn't on its own here?
Done


Line 44: // can evict any call that does not have a deadline. This incentivizes 
clients to
> Makes sense. In practice, I think the only deadline-less calls we have are 
Done


Line 73:         auto it = queue_.begin();
> Does this also mean that calls are retrieved in deadline order as opposed t
Yes, that's right. Clarified in the class doc.


Line 101:     if (queue_.size() >= max_queue_size_) {
> Is it actually possible for queue_.size() to exceed max_queue_size_? Or is 
was just being defensive. i'll add a DCHECK


-- 
To view, visit http://gerrit.cloudera.org:8080/2641
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I423ce5d8c54f61aeab4909393bbcac3516fe94c6
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Adar Dembo <[email protected]>
Gerrit-Reviewer: Anonymous Coward #80
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-HasComments: Yes

Reply via email to