[ https://issues.apache.org/jira/browse/KUDU-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16805109#comment-16805109 ]
Grant Henke commented on KUDU-1395:
-----------------------------------

FWIW the Java client retries keepAlive requests (KUDU-2710).

> Scanner KeepAlive requests can get starved on an overloaded server
> ------------------------------------------------------------------
>
>                 Key: KUDU-1395
>                 URL: https://issues.apache.org/jira/browse/KUDU-1395
>             Project: Kudu
>          Issue Type: Bug
>          Components: impala, rpc, tserver
>    Affects Versions: 0.8.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Major
>              Labels: backup
>
> As of 0.8.0, the RPC system schedules RPCs on an earliest-deadline-first
> basis, rejecting those with later deadlines. This works well for RPCs which
> are retried on SERVER_TOO_BUSY errors, since the retries maintain the
> original deadline and thus get higher and higher priority as they get closer
> to timing out.
> We don't, however, do any retries on scanner KeepAlive RPCs. So, if a
> keepalive RPC arrives at a heavily overloaded tserver, it will likely get
> rejected, and won't retry. This means that Impala queries or other long scans
> that rely on KeepAlives will likely fail on overloaded clusters since the
> KeepAlive never gets through.
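
For illustration only, here is a minimal Python sketch of the behaviour described in the issue. It is not Kudu code: `EdfQueue`, `ServerTooBusy`, and `call_with_deadline` are hypothetical names, and the bounded-queue model of "rejecting those with later deadlines" is an assumption. The sketch shows why an RPC that is retried with its original deadline eventually outranks newer work, while a single unretried KeepAlive simply gets dropped.

```python
import heapq
import time


class ServerTooBusy(Exception):
    """Hypothetical stand-in for a SERVER_TOO_BUSY rejection."""


class EdfQueue:
    """Toy bounded queue: keeps the earliest deadlines, rejects the latest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []  # max-heap on deadline, implemented by negating it

    def offer(self, deadline, name):
        """Enqueue an RPC; return the name of the rejected RPC, or None."""
        heapq.heappush(self._heap, (-deadline, name))
        if len(self._heap) > self.capacity:
            # Over capacity: evict the entry with the latest deadline.
            _, rejected = heapq.heappop(self._heap)
            return rejected
        return None


def call_with_deadline(send_rpc, timeout_s, backoff_s=0.01):
    """Retry send_rpc(deadline) on ServerTooBusy without extending the deadline.

    Returns (result, attempts). Raises TimeoutError once another retry
    could not be sent before the original deadline.
    """
    deadline = time.monotonic() + timeout_s  # fixed once, reused by every retry
    attempts = 0
    while True:
        attempts += 1
        try:
            return send_rpc(deadline), attempts
        except ServerTooBusy:
            if time.monotonic() + backoff_s >= deadline:
                raise TimeoutError(f"gave up after {attempts} attempts")
            time.sleep(backoff_s)
```

In this model a freshly issued KeepAlive usually carries the latest deadline in the queue, so it is the first to be rejected. Wrapping it in something like `call_with_deadline` gives it the same aging behaviour as other retried RPCs.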