[
https://issues.apache.org/jira/browse/ZOOKEEPER-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287397#comment-13287397
]
Marshall McMullen commented on ZOOKEEPER-1476:
----------------------------------------------
We've experienced the identical problem: a reverse name lookup prevents
ZooKeeper leader election from ever completing successfully. In our case it
was failing on Linux with IPv4, not IPv6. As it turns out, a lot of code in
the ZooKeeper server calls getHostName(), which does a reverse DNS lookup.
I've patched the code in question to use getHostString() instead, which does
not do a reverse name lookup. A lookup is still performed eventually, but it
uses getByName() to do a normal (forward) DNS lookup, and only if necessary
(that is, if the host is not already an IP address).
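
To illustrate the difference described above, here is a minimal sketch. It
assumes the methods in question are those of java.net.InetSocketAddress; the
class name HostStringSketch and the port 2181 are only illustrative, and this
is not the patch itself.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

// Hypothetical helper class, for illustration only (not part of ZooKeeper).
class HostStringSketch {

    // getHostString() returns the hostname if one was supplied, otherwise the
    // textual IP address. Unlike getHostName(), it never does a reverse lookup.
    static String hostWithoutReverseLookup(InetSocketAddress addr) {
        return addr.getHostString();
    }

    // If the socket address already carries an InetAddress, use it as-is.
    // Otherwise do a forward lookup with getByName(); for an IP literal this
    // only parses the text and does not query a name service.
    static InetAddress resolve(InetSocketAddress addr) {
        if (!addr.isUnresolved()) {
            return addr.getAddress();
        }
        try {
            return InetAddress.getByName(addr.getHostString());
        } catch (UnknownHostException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress byLiteral = new InetSocketAddress("127.0.0.1", 2181);
        System.out.println(hostWithoutReverseLookup(byLiteral));

        InetSocketAddress unresolved =
                InetSocketAddress.createUnresolved("127.0.0.1", 2181);
        System.out.println(resolve(unresolved).getHostAddress());
    }
}
```

Calling getHostName() on byLiteral instead would ask the resolver for a name
for 127.0.0.1, which is the kind of reverse lookup that stalls when the
address cannot be mapped back to a name quickly.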
I'm happy to upload the patch we use, but I can only vouch for it compiling
properly on OpenJDK 7. The method I had to use (getHostString()) was wrongly
private in OpenJDK 6 and was made public in OpenJDK 7. I don't know whether
that method is public or private in Sun, IBM, or any other flavor of Java.
> ipv6 reverse dns related timeouts on OSX connecting to localhost
> ----------------------------------------------------------------
>
> Key: ZOOKEEPER-1476
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1476
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Jilles van Gurp
> Priority: Minor
>
> We observed a weird, intermittent issue when creating ZooKeeper client
> connections on OS X. Sometimes connecting works and sometimes it fails, and
> it is also randomly very slow. It turns out both issues have the same cause.
> My hosts file on OS X (an unmodified default one) lists three entries for
> localhost:
> 127.0.0.1 localhost
> ::1 localhost
> fe80::1%lo0 localhost
> We saw ZooKeeper sometimes trying to connect to fe80:0:0:0:0:0:0:1%1, which
> is not listed (about one time in four; it seems to round-robin over the
> addresses).
> Whenever that happens, the connection sometimes works and sometimes fails.
> In both cases it's very slow. Reason: the reverse lookup for
> fe80:0:0:0:0:0:0:1%1 can't be resolved using the hosts file, so it falls
> back to actually using DNS. Sometimes that works, but other times it fails
> or times out after about 5 seconds. Platform-specific DNS settings probably
> hide this problem on Linux.
> As a workaround, we now pre-resolve localhost:
> Inet4Address.getByName("localhost"). This always resolves to 127.0.0.1 on my
> machine and is fast.
> This fixes the issue for us. We're not sure where the fe80:0:0:0:0:0:0:1%1
> address comes from, though. I don't recall having this issue with other
> server-side software, so this might be a mix of platform setup, OS X-specific
> defaults, and ZooKeeper behavior.
> I've seen one ticket about IPv6 in ZooKeeper that might be related:
> ZOOKEEPER-667. Perhaps the workaround for that ticket introduced this
> problem?
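
The pre-resolve workaround from the quoted report could be sketched roughly as
follows. This is an assumption-laden illustration, not the reporter's code: as
far as I know, Inet4Address.getByName is just the static
InetAddress.getByName inherited by the subclass and does not by itself
restrict the result to IPv4, so the sketch swaps in InetAddress.getAllByName
and filters for an IPv4 address explicitly. The class name, helper names, and
port 2181 are hypothetical.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class, for illustration only.
class PreresolveLocalhostSketch {

    // Returns the first IPv4 address among the candidates, or null if none.
    static InetAddress firstIPv4(InetAddress[] candidates) {
        for (InetAddress candidate : candidates) {
            if (candidate instanceof Inet4Address) {
                return candidate;
            }
        }
        return null;
    }

    // Builds a numeric "ip:port" string, so whatever consumes it has no
    // hostname left to resolve.
    static String connectString(InetAddress addr, int port) {
        return addr.getHostAddress() + ":" + port;
    }

    public static void main(String[] args) throws UnknownHostException {
        // Forward lookup only; on a default setup this is answered from the
        // hosts file.
        InetAddress[] all = InetAddress.getAllByName("localhost");
        InetAddress v4 = firstIPv4(all);
        if (v4 == null) {
            System.out.println("no IPv4 address found for localhost");
            return;
        }
        System.out.println(connectString(v4, 2181));
    }
}
```

The resulting numeric string can then be passed to the client in place of
"localhost", which sidesteps the round-robin over the IPv6 entries.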