[ https://issues.apache.org/jira/browse/HADOOP-10597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ming Ma updated HADOOP-10597:
-----------------------------
    Attachment: HADOOP-10597-4.patch

Thanks, [~ste...@apache.org]! Here is the new patch with your suggestions.

Regarding the serialization of {{RetryAction}} via the {{RetriableException}} message string, I agree it is not necessarily the best approach. Here we need to serialize {{RetryAction}} and have the RPC server send it back to the RPC client. Possible options that I know of:

* The current RPC header structure {{RpcHeaderProtos}} includes an exception message field; thus it is convenient to use the {{RetriableException}} message.
* We can consider adding an optional {{RetryAction}} field to the RPC header {{RpcHeaderProtos}}. That requires more changes.

> Evaluate if we can have RPC client back off when server is under heavy load
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-10597
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10597
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Ming Ma
>            Assignee: Steve Loughran
>         Attachments: HADOOP-10597-2.patch, HADOOP-10597-3.patch,
> HADOOP-10597-4.patch, HADOOP-10597.patch, MoreRPCClientBackoffEvaluation.pdf,
> RPCClientBackoffDesignAndEvaluation.pdf
>
>
> Currently, if an application hits the NN too hard, RPC requests will be in a
> blocking state, assuming the OS doesn't run out of connections. Alternatively,
> RPC or the NN can throw some well-defined exception back to the client based on
> certain policies when it is under heavy load; the client will understand such
> an exception and do exponential back off, as another implementation of
> RetryInvocationHandler.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
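
To make the first option above concrete (carrying the retry hint in the exception message string), here is a minimal, hypothetical Java sketch. It is not the code in the attached patch and does not use Hadoop's real {{RetryAction}} or {{RetriableException}} classes; the class name {{BackoffSketch}}, the {{RETRY_AFTER_MS=}} message prefix, and all method names are assumptions made up for illustration. It shows a server-side encoder, a client-side decoder, and a capped exponential back off that the client falls back on when no hint is present.

```java
import java.util.OptionalLong;

// Hypothetical sketch: none of these names come from Hadoop or the patch.
final class BackoffSketch {
    // Assumed marker for an exception message that carries a retry hint.
    static final String PREFIX = "RETRY_AFTER_MS=";

    private BackoffSketch() {}

    // Server side: encode a suggested delay into an exception message string.
    static String encodeRetryHint(long delayMillis) {
        return PREFIX + delayMillis;
    }

    // Client side: parse the hint; empty if the message carries none
    // or the value is malformed or negative.
    static OptionalLong decodeRetryHint(String message) {
        if (message == null || !message.startsWith(PREFIX)) {
            return OptionalLong.empty();
        }
        try {
            long value = Long.parseLong(message.substring(PREFIX.length()).trim());
            return value >= 0 ? OptionalLong.of(value) : OptionalLong.empty();
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }

    // Capped exponential back off: the delay starts at baseMillis and
    // doubles once per attempt, never exceeding maxMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        if (baseMillis <= 0 || maxMillis <= 0) {
            return 0;
        }
        long delay = Math.min(baseMillis, maxMillis);
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            // Compare against maxMillis / 2 so the doubling cannot overflow.
            delay = (delay > maxMillis / 2) ? maxMillis : delay * 2;
        }
        return delay;
    }

    // Prefer the server's hint when present; otherwise back off locally.
    static long nextDelayMillis(String exceptionMessage, int attempt,
                                long baseMillis, long maxMillis) {
        OptionalLong hint = decodeRetryHint(exceptionMessage);
        return hint.isPresent()
            ? Math.min(hint.getAsLong(), maxMillis)
            : backoffMillis(attempt, baseMillis, maxMillis);
    }
}
```

The trade-off is the one described above: a string convention like this needs no change to the RPC header, but the client has to parse free-form text, whereas a dedicated optional header field would be typed at the cost of a protocol change.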