Hi All,
I have successfully run a custom Giraph job on a 46 MB graph. Now I am
trying to run the same Giraph job on a 3-node cluster with 450 MB of
data. I have turned off the ZooKeeper server on all machines. The three
data nodes are *master-hadoop, hnode, INPUN-5KPH622*. I have googled this
error a lot but could not find an exact solution for it. Is it mandatory
to run ZooKeeper on all machines for a Giraph job?
I know a little about how ZooKeeper works: by default it listens on port
2181, but here the job is trying to connect on port 22181.
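For reference, a minimal sketch of how the job could be pointed at an
externally managed ZooKeeper instead. As far as I understand, when
giraph.zkList is not set, Giraph starts its own ZooKeeper inside one of
the map tasks, and that instance uses port 22181 by default. The host and
port below (master-hadoop:2181) are an assumption for illustration only;
they presume a ZooKeeper server is actually running there, which is not
the case on my cluster right now.

```shell
# Assumption: a standalone ZooKeeper server is reachable at this address.
# Replace with the real host:port list (comma-separated for a quorum).
ZK_LIST="master-hadoop:2181"

# Extra GiraphRunner arguments: -ca passes a custom configuration value,
# and giraph.zkList names the ZooKeeper servers the job should use.
GIRAPH_ZK_ARGS="-ca giraph.zkList=${ZK_LIST}"

# Print the arguments so they can be inspected before use.
echo "${GIRAPH_ZK_ARGS}"
```

These arguments would be appended to the end of the GiraphRunner command
line.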
Here is how I run the job:
target# hadoop jar giraph-examples-1.1.0-SNAPSHOT-for-hadoop-1.2.1-jar-with-dependencies.jar \
    org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.CalculateCCWithJSON \
    -vif org.apache.giraph.examples.JsonLongTextLongTextVertexInputFormat \
    -vip /giraph/input/graphInputDatawithoutroot.txt \
    -vof org.apache.giraph.examples.ccOutputFormat \
    -op /giraph/vout2 \
    -w 3
I receive the following error on all data nodes:
684 INFO org.apache.giraph.graph.GraphTaskManager: setup: Registering health of
this worker...
2014-12-29 15:58:30,699 INFO org.apache.giraph.bsp.BspService: getJobState: Job
state already exists (/_hadoopBsp/job_201412291209_0010/_masterJobState)
2014-12-29 15:58:30,703 INFO org.apache.giraph.bsp.BspService:
getApplicationAttempt: Node
/_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir already exists!
2014-12-29 15:58:30,706 INFO org.apache.giraph.bsp.BspService:
getApplicationAttempt: Node
/_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir already exists!
2014-12-29 15:58:30,711 INFO org.apache.giraph.worker.BspServiceWorker:
registerHealth: Created my health node for attempt=0, superstep=-1 with
/_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir/0/_superstepDir/-1/_workerHealthyDir/inpun-5kph622_1
and workerInfo= Worker(hostname=inpun-5kph622, MRtaskID=1, port=30001)
2014-12-29 16:14:42,385 INFO org.apache.zookeeper.ClientCnxn: Unable to read
additional data from server sessionid 0x14a9596f1a20002, likely server has
closed socket, closing socket connection and attempting reconnect
2014-12-29 16:14:42,486 WARN org.apache.giraph.bsp.BspService: process:
Disconnected from ZooKeeper (will automatically try to recover) WatchedEvent
state:Disconnected type:None path:null
2014-12-29 16:14:43,525 INFO org.apache.zookeeper.ClientCnxn: Opening socket
connection to server hnode/192.168.2.24:22181. Will not attempt to authenticate
using SASL (unknown error)
2014-12-29 16:14:43,527 WARN org.apache.zookeeper.ClientCnxn: Session
0x14a9596f1a20002 for server null, unexpected error, closing socket connection
and attempting reconnect
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
2014-12-29 16:14:43,636 WARN org.apache.giraph.zk.ZooKeeperExt: exists:
Connection loss on attempt 0, waiting 5000 msecs before retrying.
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode =
ConnectionLoss for
/_hadoopBsp/job_201412291209_0010/_applicationAttemptsDir/0/_superstepDir/-1/_addressesAndPartitions
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1041)
    at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1069)
    at org.apache.giraph.zk.ZooKeeperExt.exists(ZooKeeperExt.java:360)
    at org.apache.giraph.worker.BspServiceWorker.startSuperstep(BspServiceWorker.java:818)
    at org.apache.giraph.worker.BspServiceWorker.setup(BspServiceWorker.java:576)
    at org.apache.giraph.graph.GraphTaskManager.execute(GraphTaskManager.java:284)
    at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:93)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Thanks,
Bipin Dalbhide
Acellere Software Pvt. Ltd.