RPC client should deal with the IP address changes
--------------------------------------------------

                 Key: HADOOP-7472
                 URL: https://issues.apache.org/jira/browse/HADOOP-7472
             Project: Hadoop Common
          Issue Type: Improvement
          Components: ipc
    Affects Versions: 0.20.205.0
            Reporter: Kihwal Lee
            Assignee: Kihwal Lee
            Priority: Minor


The current RPC client implementation and the client-side callers assume that the hostname-to-address mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress object above/outside RPC, and the reconnect logic in the RPC Connection implementation also trusts the resolved address that was passed down.

If the NN suffers a failure that requires migration, it may be restarted on a different node with a different IP address. In this case, even if the name-address mapping is updated in DNS, the cluster is stuck trying the old address until the whole cluster is restarted.

The RPC client side should detect this situation and exit or try to recover.

Updating ConnectionId within the Client implementation may get the system working for the moment, but there is always a risk of the cached address:port becoming connectable again unintentionally. The real solution will be notifying the upper layers of the address change so that they can re-resolve and retry, or re-architecting the system as discussed in HDFS-34.

For the 0.20 lines, some type of compromise may be acceptable. For example, raise a custom exception so that some well-defined, high-impact upper layers can re-resolve and retry, while the others will have to restart. For TRUNK, the HA work will most likely determine what needs to be done, so this Jira won't cover the solutions for TRUNK.
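
To make the 0.20 compromise above concrete, here is a minimal sketch of the detection step: re-resolve the host name behind the cached InetSocketAddress and raise a custom exception if the IP has changed. This is not the Hadoop RPC code. The names AddressChangeCheck, AddressChangedException, hasChanged and checkAddress are hypothetical and chosen only for illustration; only the java.net calls are real.

```java
import java.io.IOException;
import java.net.InetSocketAddress;

public class AddressChangeCheck {

    /** Hypothetical exception type; not part of the Hadoop RPC API. */
    public static class AddressChangedException extends IOException {
        private final InetSocketAddress oldAddress;
        private final InetSocketAddress newAddress;

        public AddressChangedException(InetSocketAddress oldAddress,
                                       InetSocketAddress newAddress) {
            super("Address of " + oldAddress.getHostName() + " changed: "
                    + oldAddress.getAddress() + " -> " + newAddress.getAddress());
            this.oldAddress = oldAddress;
            this.newAddress = newAddress;
        }

        public InetSocketAddress getOldAddress() { return oldAddress; }
        public InetSocketAddress getNewAddress() { return newAddress; }
    }

    /**
     * Compares the cached address with a freshly resolved one.
     * If either side is unresolved (for example, DNS is temporarily
     * unreachable), there is no evidence of a move, so report "unchanged".
     */
    public static boolean hasChanged(InetSocketAddress cached,
                                     InetSocketAddress fresh) {
        if (cached.isUnresolved() || fresh.isUnresolved()) {
            return false;
        }
        // InetAddress.equals() compares the raw IP bytes only.
        return !cached.getAddress().equals(fresh.getAddress());
    }

    /**
     * Re-resolves the host name of the cached address and throws if the
     * IP differs. A reconnect path could call this after a connect failure.
     */
    public static void checkAddress(InetSocketAddress cached)
            throws AddressChangedException {
        // This constructor attempts a new lookup of the host name.
        InetSocketAddress fresh =
                new InetSocketAddress(cached.getHostName(), cached.getPort());
        if (hasChanged(cached, fresh)) {
            throw new AddressChangedException(cached, fresh);
        }
    }
}
```

An upper layer that knows how to recover could catch AddressChangedException, rebuild its own InetSocketAddress from getNewAddress() and retry, while the other callers let it propagate and restart. Note that the JVM caches successful lookups (see the networkaddress.cache.ttl security property), so a check like this may keep seeing the old IP for some time after DNS has been updated.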