[ 
https://issues.apache.org/jira/browse/GIRAPH-882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277034#comment-14277034
 ] 

Lee Dongjin commented on GIRAPH-882:
------------------------------------

Two facts are behind this bug. First, the ZooKeeper server list is stored 
temporarily in HDFS as the name of an empty file, "zkServerList_HOST1 PORT1 
HOST2 PORT2 ...". However, HDFS limits the length of a filename: if it exceeds 
255 characters, the tail of the filename can be trimmed. Second, in order to 
display the job summary on the console, Giraph stores the ZooKeeper server list 
in the GiraphConstants.ZOOKEEPER_SERVER_PORT_COUNTER_GROUP counter group. That 
also has a length limit (128 by default), so the list can be trimmed there as 
well. (In short, the bug description is slightly inaccurate.)
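
To illustrate why a trimmed list breaks the client, here is a minimal sketch 
that cuts a connection string at a fixed length and then splits it. It is not 
Giraph code: the class and method names are hypothetical, the hosts are the 
ones from the quoted issue description, and 63 is the approximate limit 
mentioned there.

```java
import java.util.Arrays;
import java.util.List;

public class TrimmedServerList {
    /** Cuts a value to at most maxLen characters, as a length-limited field would. */
    static String trim(String value, int maxLen) {
        return value.length() <= maxLen ? value : value.substring(0, maxLen);
    }

    /** Splits a comma-separated "host:port" list into its entries. */
    static List<String> parse(String serverList) {
        return Arrays.asList(serverList.split(","));
    }

    public static void main(String[] args) {
        // Hosts taken from the quoted issue description.
        String full = "turing452.fi.callan.de:22181,"
            + "turing124.fi.callan.de:22181,"
            + "turing488.fi.callan.de:22181";
        // Assumed limit: the "~63 characters" from the issue description.
        List<String> entries = parse(trim(full, 63));
        // The last entry is no longer a valid host:port pair.
        System.out.println(entries.get(entries.size() - 1)); // prints "turin"
    }
}
```

The truncated last entry is exactly the host name that appears in the 
UnknownHostException of the quoted report.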

Needless to say, the first one is the heart of the problem. This patch fixes 
it by storing the given ZooKeeper server list in the HDFS file's contents 
instead of in the filename. I validated the patch using 5 ZooKeeper servers, 
each with a 53-character address (hostname and port). That is:

the-answer-to-life-the-universe-and-everything-1:2181
the-answer-to-life-the-universe-and-everything-2:2181
the-answer-to-life-the-universe-and-everything-3:2181
the-answer-to-life-the-universe-and-everything-4:2181
the-answer-to-life-the-universe-and-everything-5:2181

With commas, the whole ZooKeeper server list is 53 * 5 + 4 = 269 characters 
long, which exceeds 255.
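
The length check and the idea behind the fix can be sketched as follows. This 
is only an illustration, not the patch itself: it uses a local temporary file 
through java.nio as a stand-in for HDFS, and the class and method names are 
hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ServerListInFileContents {
    // Filename length limit cited in the comment above.
    static final int MAX_FILENAME_LENGTH = 255;

    /** Builds the comma-separated list of the five test servers. */
    static String buildServerList() {
        List<String> servers = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            servers.add("the-answer-to-life-the-universe-and-everything-" + i + ":2181");
        }
        return String.join(",", servers);
    }

    /** Writes the list into the file body (not the name) and reads it back. */
    static String roundTrip(String serverList) {
        try {
            // Local temp file used as a stand-in for an HDFS file (assumption).
            Path file = Files.createTempFile("zkServerList_", ".txt");
            try {
                Files.write(file, serverList.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String serverList = buildServerList();
        System.out.println(serverList.length());                       // 53 * 5 + 4 = 269
        System.out.println(serverList.length() > MAX_FILENAME_LENGTH); // true
        System.out.println(roundTrip(serverList).equals(serverList));  // true
    }
}
```

Because the list travels in the file body, its length is no longer bound by 
the filename limit.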

> List of zookeeper connection strings is trimmed by Hadoop counters.
> -------------------------------------------------------------------
>
>                 Key: GIRAPH-882
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-882
>             Project: Giraph
>          Issue Type: Bug
>          Components: zookeeper
>    Affects Versions: 1.1.0
>            Reporter: Lukas Nalezenec
>
> We are running a job with a quorum of 3 ZooKeeper servers. Each server has a 
> long name (turing452.fi.callan.de:22181). Connection strings are stored in 
> Hadoop counters (for example: 
> turing452.fi.callan.de:22181,turing124.fi.callan.de:22181,turing488.fi.callan.de:22181),
> but since the name of a counter is limited to ~63 characters, the connection 
> string is trimmed 
> (turing452.fi.callan.de:22181,turing124.fi.callan.de:22181,turin).
> 14/03/18 23:44:41 INFO zookeeper.ZooKeeper: Client environment:user.name=hadoop
> 14/03/18 23:44:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=turing452.fi.callan.de:22181,turing124.fi.callan.de:22181,turin sessionTimeout=60000
> Exception in thread "main" java.net.UnknownHostException: turin
>       at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
>       at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
>       at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
>       at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
>       at java.net.InetAddress.getAllByName(InetAddress.java:1162)
>       at java.net.InetAddress.getAllByName(InetAddress.java:1098)
>       at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
>       at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
>       at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:380)
>       at org.apache.giraph.zk.ZooKeeperExt.<init>(ZooKeeperExt.java:114)
>       at org.apache.giraph.job.JobProgressTracker.<init>(JobProgressTracker.java:69)
>       at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:255)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
