[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287397#comment-13287397
 ] 

Marshall McMullen commented on ZOOKEEPER-1476:
----------------------------------------------

We've experienced the same problem, where a reverse name lookup prevents 
ZooKeeper leader election from ever completing. In our case it failed on 
Linux with IPv4, not IPv6. As it turns out, a lot of code in the ZooKeeper 
server calls getHostName, which does a reverse DNS lookup. I've patched the 
code in question to use getHostString instead, which does not do a reverse 
name lookup. A lookup does still happen eventually, but it goes through 
getByName, which does a normal forward DNS lookup only if necessary (that 
is, if the host is not already an IP address). 
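
To illustrate the difference described above, here is a minimal sketch (not 
the actual patch). The class name, helper name, and port 2181 are only 
illustrative assumptions; the only JDK calls used are 
InetAddress.getByAddress and InetSocketAddress.getHostString, the latter 
being public as of Java 7.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; not part of ZooKeeper.
class HostStringDemo {
    // Builds a socket address from raw IP bytes and returns its host part
    // via getHostString(), which does not trigger a reverse lookup.
    static String hostStringFor(byte[] rawIp, int port) {
        try {
            // getByAddress attaches no host name to the address.
            InetAddress ip = InetAddress.getByAddress(rawIp);
            InetSocketAddress addr = new InetSocketAddress(ip, port);
            return addr.getHostString();
        } catch (UnknownHostException e) {
            // Only thrown when rawIp is not 4 or 16 bytes long.
            throw new IllegalArgumentException("bad IP length", e);
        }
    }

    public static void main(String[] args) {
        // Prints the literal 127.0.0.1; no resolver is consulted.
        System.out.println(hostStringFor(new byte[] {127, 0, 0, 1}, 2181));
        // By contrast, calling getHostName() on the same address may block
        // on a reverse DNS lookup.
    }
}
```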

I'm happy to upload the patch we use, but I can only vouch for it compiling 
properly on OpenJDK 7. The method I had to use (getHostString) was, wrongly 
in my view, private in OpenJDK 6 and only made public in OpenJDK 7. I don't 
know whether that method is public or private in Sun, IBM, or any other 
flavor of Java.
                
> ipv6 reverse dns related timeouts on OSX connecting to localhost
> ----------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1476
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1476
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Jilles van Gurp
>            Priority: Minor
>
> We observed a weird, intermittent issue when creating zookeeper client 
> connections on osx. Sometimes it would work and sometimes it would fail. It 
> was also intermittently very slow. It turns out both issues have the same 
> cause.
> My hosts file on osx (an unmodified default one) lists three entries for 
> localhost:
> 127.0.0.1     localhost
> ::1             localhost 
> fe80::1%lo0   localhost
> We saw zookeeper sometimes trying to connect to fe80:0:0:0:0:0:0:1%1, which 
> is not listed (about one time in four; it seems to round robin over the 
> addresses). 
> Whenever that happens, the connection sometimes works and sometimes fails. 
> In both cases it's very slow. Reason: the reverse lookup for 
> fe80:0:0:0:0:0:0:1%1 can't be resolved using the hosts file, so it falls 
> back to actually using the dns. Sometimes that works, but other times it 
> fails or times out after about 5 seconds. Platform-specific dns settings 
> probably hide this problem on linux. 
> As a workaround, we preresolve localhost now: 
> Inet4Address.getByName("localhost"). This always resolves to 127.0.0.1 on my 
> machine and works fast.
> This fixes the issue for us. We're not sure where the fe80:0:0:0:0:0:0:1%1 
> address comes from though. I don't recall having this issue with other server 
> side software so this might be a mix of platform setup, osx specific 
> defaults, and zookeeper behavior.
> I've seen one ticket that relates to ipv6 in zookeeper that might be related: 
> ZOOKEEPER-667. Perhaps the workaround for that ticket introduced this 
> problem? 
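
The pre-resolve workaround described in the quoted report could be sketched 
roughly as follows. This is only an illustration under assumptions: the 
class and method names are hypothetical, port 2181 is just an example, and 
the sketch assumes the name resolves to an IPv4 address (an IPv6 result 
would need different formatting). It uses InetAddress.getByName, which is 
the same static method the report reaches through Inet4Address.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper; not part of ZooKeeper.
class PreResolve {
    // Resolves the host once, up front, and returns "ip:port" so that later
    // connection attempts use a fixed IP literal instead of a host name.
    static String connectString(String host, int port) {
        try {
            // For an IP literal this only validates the format; for a name
            // it performs a normal forward lookup.
            InetAddress resolved = InetAddress.getByName(host);
            return resolved.getHostAddress() + ":" + port;
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The result depends on the local resolver configuration.
        System.out.println(connectString("localhost", 2181));
    }
}
```

The resulting string would then be handed to the client in place of 
"localhost", so the address is chosen once instead of on every attempt.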

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
