Mate Szalay-Beko created ZOOKEEPER-3705:
-------------------------------------------

             Summary: Filtering unreachable hosts without using ICMP
                 Key: ZOOKEEPER-3705
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3705
             Project: ZooKeeper
          Issue Type: Improvement
    Affects Versions: 3.6.0
            Reporter: Mate Szalay-Beko
            Assignee: Mate Szalay-Beko


This is a follow-up ticket for ZOOKEEPER-3698, which was a quick fix to make the 
multi-address feature (introduced in ZOOKEEPER-3188) work on macOS when ICMP 
throttling is enabled.

The whole purpose of the multi-address feature is to always try to use an 
address that works. The current implementation (in the case of leader 
election) always filters the address list using {{InetAddress.isReachable()}} 
calls to find out which server addresses are working. These calls result in ICMP 
echo requests (or TCP connections to port 7 (Echo) of the destination host), 
depending on the native implementation (see the [Oracle 
docs|https://docs.oracle.com/javase/7/docs/api/java/net/InetAddress.html#isReachable(int)]).

So if {{InetAddress.isReachable()}} cannot reach the host, the current 
multi-address feature will not be able to treat the given address as a working 
one. Right now it cannot distinguish between a broken network link (when the 
whole node is unreachable) and disabled ICMP (when only ICMP and port 7 are 
blocked by the firewall of the destination host, while the ZooKeeper ports are 
still usable).
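
To illustrate the distinction, here is a minimal sketch of a check that probes the actual TCP port instead of relying on ICMP. It is not existing ZooKeeper code: the class {{TcpProbe}} and the method {{canConnect}} are hypothetical names, and a local listener on an ephemeral loopback port stands in for an election port.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical helper for illustration only; not part of ZooKeeper.
class TcpProbe {

    /**
     * Returns true if a TCP connection to the given address can be
     * established within timeoutMs. Unlike InetAddress.isReachable(),
     * this does not depend on ICMP or on port 7 being open.
     */
    static boolean canConnect(InetSocketAddress address, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(address, timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, unresolved host, no route, ...
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetSocketAddress address;
        // Stand-in for an election port: a listener on an ephemeral loopback port.
        try (ServerSocket listener =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            address = new InetSocketAddress(
                listener.getInetAddress(), listener.getLocalPort());
            System.out.println("listener open:   " + canConnect(address, 1000));
        }
        // The listener is closed now, so the same address should be rejected.
        System.out.println("listener closed: " + canConnect(address, 1000));
    }
}
```

A check like this only says that something accepts connections on the port. It does not prove that the peer is a healthy ZooKeeper server, which is what the {{ruok}}-style idea below would address.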

A few ideas on how to handle this better: 
 * One way to improve this could be to implement something like the {{ruok}} 
4LW command for the server ports: a simple request-response message that 
only shows that the server is alive and listening on the given election / quorum 
port. Then we could use that instead of the ICMP calls.
 * Another way could be to implement something like what the Learner is doing 
right now (if I remember correctly, it basically starts connecting to all 
known quorum ports in parallel, then keeps the connection that is established 
first). However, it might be trickier in the case of the leader election 
protocol...
 * Another way would be to simply try to establish a connection to the election 
addresses one by one, and move on to the next one if the call fails. It would be 
slower, but we wouldn't rely on {{InetAddress.isReachable()}}.
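
The last idea (trying the election addresses one by one) could look roughly like the sketch below. This is only an illustration under my own assumptions, not the existing implementation: {{SequentialConnector}} and {{connectToFirstAvailable}} are hypothetical names, and the timeout handling is simplified.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical sketch for illustration only; not part of ZooKeeper.
class SequentialConnector {

    /**
     * Tries the candidate addresses in order and returns the first socket
     * that connects. Worst case this takes candidates.size() * timeoutMs.
     */
    static Optional<Socket> connectToFirstAvailable(
            List<InetSocketAddress> candidates, int timeoutMs) {
        for (InetSocketAddress candidate : candidates) {
            Socket socket = new Socket();
            try {
                socket.connect(candidate, timeoutMs);
                return Optional.of(socket); // caller owns and closes the socket
            } catch (IOException e) {
                // This address did not work; clean up and try the next one.
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // nothing more to do for a socket that never connected
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket listener =
                 new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            // An unresolved name stands in for an address that cannot be used.
            InetSocketAddress broken =
                InetSocketAddress.createUnresolved("unreachable.invalid", 3888);
            InetSocketAddress working = new InetSocketAddress(
                listener.getInetAddress(), listener.getLocalPort());

            Optional<Socket> result =
                connectToFirstAvailable(Arrays.asList(broken, working), 1000);
            if (result.isPresent()) {
                System.out.println("connected to " + result.get().getRemoteSocketAddress());
                result.get().close();
            } else {
                System.out.println("no address was connectable");
            }
        }
    }
}
```

The cost of this approach is visible in the loop: every dead address in front of a working one adds up to one full timeout to the connection time, which is the slowness mentioned above.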

A few challenges we also need to consider:
 * It can be tricky to detect when the current election address becomes 
unavailable. This is another edge case where we currently use 
{{InetAddress.isReachable()}} (this is why we call 
{{SendWorker.asyncValidateIfSocketIsStillReachable()}}).
 * We also need to take backward compatibility of the leader election 
protocol into consideration during rolling upgrades.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
