Todd Lipcon has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/2696
Change subject: rpc: use earliest-deadline-first RPC scheduling and rejection
......................................................................

rpc: use earliest-deadline-first RPC scheduling and rejection

This changes the behavior of the RPC service queue to use
earliest-deadline-first scheduling. It addresses an issue I noticed when
testing Impala on a box with 32 cores:

- Impala spins up 96 clients which all operate in a loop scanning local
  tablets. The "think time" between Scan RPCs is very small, since the
  scanner threads just push the requests onto an Impala-side queue and do
  no processing of their own.

- With the default settings, we have 20 RPC handlers and a queue length
  of 50, for 70 server-side slots in total. This causes the remaining 26
  threads to be rejected with TOO_BUSY errors on their first Scan() RPCs.

- The unlucky threads back off by sleeping for a short period. Meanwhile,
  every time one of the lucky threads gets a response, it sends a new RPC
  and occupies the queue slot that was just freed. Because there are
  exactly 70 "lucky" threads, 70 slots on the server side, and no "think
  time", the queue is full almost all the time.

- When one of the "unlucky" threads wakes up from its backoff sleep, it
  is extremely unlikely to find an empty queue slot, and thus just gets
  rejected again.

The result of this behavior is extreme unfairness: the threads that got
lucky at the beginning successfully process lots of scan requests, while
the ones that got unlucky at the beginning get rejected over and over
again until they eventually time out.

The approach taken by this patch is to do earliest-deadline-first (EDF)
scheduling for the queue. Because a retried Scan RPC retains the deadline
of the original attempt after backing off, it has an earlier deadline
than a newly arrived scan request, and thus takes priority.
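To illustrate the idea (this is a hypothetical sketch, not Kudu's actual
ServiceQueue implementation; the `EdfQueue`, `Call`, `Put`, and `Take` names
are invented for this example): a bounded queue ordered by client deadline
can both serve the most urgent call first and, when full, reject or displace
the least urgent one, which is what lets a backed-off retry carrying its
original deadline jump ahead of a fresh request.

```cpp
#include <cstdint>
#include <iterator>
#include <optional>
#include <set>
#include <string>
#include <utility>

// A pending inbound call; earlier deadline == more urgent.
struct Call {
  int64_t deadline_us;
  std::string name;
};

// Bounded earliest-deadline-first queue (illustrative sketch only).
class EdfQueue {
 public:
  explicit EdfQueue(size_t capacity) : capacity_(capacity) {}

  // Admits the call if there is room, or if it is more urgent than the
  // least-urgent queued call (which is then evicted into *evicted).
  // Returns false if the new call itself is the least urgent: rejected.
  bool Put(Call call, std::optional<Call>* evicted) {
    if (queue_.size() < capacity_) {
      queue_.insert(std::move(call));
      return true;
    }
    auto last = std::prev(queue_.end());  // latest deadline in the queue
    if (call.deadline_us >= last->deadline_us) {
      return false;  // new arrival is least urgent: reject it
    }
    *evicted = *last;  // displace the least-urgent queued call
    queue_.erase(last);
    queue_.insert(std::move(call));
    return true;
  }

  // Handler threads always take the earliest-deadline call first.
  std::optional<Call> Take() {
    if (queue_.empty()) return std::nullopt;
    Call c = *queue_.begin();
    queue_.erase(queue_.begin());
    return c;
  }

 private:
  struct ByDeadline {
    bool operator()(const Call& a, const Call& b) const {
      return a.deadline_us < b.deadline_us;
    }
  };
  size_t capacity_;
  std::multiset<Call, ByDeadline> queue_;
};
```

Under this scheme a retry that kept its original (early) deadline displaces
a newly arrived call from a full queue, rather than being bounced by it.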
The patch includes a simple functional test which spawns a bunch of
threads that act somewhat like the Impala scenario above and measures the
number of successful RPCs each is able to complete in a 5-second period.
Without the patch, I got:

  I0328 20:09:16.566520  1461 rpc_stub-test.cc:399] 1 1 0 1 10 17 6 1 12 12 17 10 8 7 12 9 16 15

In other words, some threads were able to complete tens of RPCs while
other threads were unable to complete any in the same time period. With
the patch, the distribution was very even:

  I0328 20:08:21.608039  1250 rpc_stub-test.cc:399] 9 9 9 8 9 9 9 9 9 9 9 9 9 9 9 9 9

In testing on a cluster, this solved the frequent Impala query failures
seen when running 5 concurrent TPCH Q6 queries.

Change-Id: I423ce5d8c54f61aeab4909393bbcac3516fe94c6
Reviewed-on: http://gerrit.cloudera.org:8080/2641
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins
(cherry picked from commit d79c1cf8020b1e577daecd2880ca7713af20c4b7)
---
M src/kudu/rpc/inbound_call.cc
M src/kudu/rpc/inbound_call.h
M src/kudu/rpc/rpc-test-base.h
M src/kudu/rpc/rpc_stub-test.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
A src/kudu/rpc/service_queue.h
7 files changed, 325 insertions(+), 21 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/96/2696/1

--
To view, visit http://gerrit.cloudera.org:8080/2696
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I423ce5d8c54f61aeab4909393bbcac3516fe94c6
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: branch-0.8.x
Gerrit-Owner: Todd Lipcon <[email protected]>
