Hi all, I'm requesting help again! I'm trying to get the
SimpleShortestPathsComputation example working, but I'm stuck again. The
job now starts and appears to run through the final step (it performs 3
supersteps), but the overall job fails.

In the master log, among other things, I see:

...
14/01/10 15:04:17 INFO master.MasterThread: setup: Took 0.87 seconds.
14/01/10 15:04:17 INFO master.MasterThread: input superstep: Took 0.708
seconds.
14/01/10 15:04:17 INFO master.MasterThread: superstep 0: Took 0.158 seconds.
14/01/10 15:04:17 INFO master.MasterThread: superstep 1: Took 0.344 seconds.
14/01/10 15:04:17 INFO master.MasterThread: superstep 2: Took 0.064 seconds.
14/01/10 15:04:17 INFO master.MasterThread: shutdown: Took 0.162 seconds.
14/01/10 15:04:17 INFO master.MasterThread: total: Took 2.31 seconds.
14/01/10 15:04:17 INFO yarn.GiraphYarnTask: Master is ready to commit final
job output data.
14/01/10 15:04:18 INFO yarn.GiraphYarnTask: Master has committed the final
job output data.
...

To me, that looks promising - like the job was successful. However, in the
WORKER_ONLY containers, I see these things:

...
14/01/10 15:04:17 INFO graph.GraphTaskManager: cleanup: Starting for
WORKER_ONLY
14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and unprocessed
event
(path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/1/_addressesAndPartitions,
type=NodeDeleted, state=SyncConnected)
14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent :
partitionExchangeChildrenChanged (at least one worker is done sending
partitions)
14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and unprocessed
event
(path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/1/_superstepFinished,
type=NodeDeleted, state=SyncConnected)
14/01/10 15:04:17 INFO netty.NettyClient: stop: reached wait threshold, 1
connections closed, releasing NettyClient.bootstrap resources now.
14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent: Job state
changed, checking to see if it needs to restart
14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state already
exists
(/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState)
14/01/10 15:04:17 INFO yarn.GiraphYarnTask: [STATUS: task-1] saveVertices:
Starting to save 2 vertices using 1 threads
14/01/10 15:04:17 INFO worker.BspServiceWorker: saveVertices: Starting to
save 2 vertices using 1 threads
14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent: Job state
changed, checking to see if it needs to restart
14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state already
exists
(/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState)
14/01/10 15:04:17 INFO bsp.BspService: getJobState: Job state path is
empty! -
/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_masterJobState
14/01/10 15:04:17 ERROR zookeeper.ClientCnxn: Error while calling watcher
java.lang.NullPointerException
        at java.io.StringReader.<init>(StringReader.java:50)
        at org.json.JSONTokener.<init>(JSONTokener.java:66)
        at org.json.JSONObject.<init>(JSONObject.java:402)
        at org.apache.giraph.bsp.BspService.getJobState(BspService.java:716)
        at org.apache.giraph.worker.BspServiceWorker.processEvent(BspServiceWorker.java:1563)
        at org.apache.giraph.bsp.BspService.process(BspService.java:1095)
        at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495)
14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and unprocessed
event
(path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_vertexInputSplitsAllReady,
type=NodeDeleted, state=SyncConnected)
14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and unprocessed
event
(path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/2/_addressesAndPartitions,
type=NodeDeleted, state=SyncConnected)
14/01/10 15:04:17 INFO worker.BspServiceWorker: processEvent :
partitionExchangeChildrenChanged (at least one worker is done sending
partitions)
14/01/10 15:04:17 WARN bsp.BspService: process: Unknown and unprocessed
event
(path=/_hadoopBsp/giraph_yarn_application_1389300168420_0024/_applicationAttemptsDir/0/_superstepDir/2/_superstepFinished,
type=NodeDeleted, state=SyncConnected)
...
14/01/10 15:04:17 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException):
No lease on
/user/spry/Shortest/_temporary/1/_temporary/attempt_1389300168420_0024_m_000001_1/part-m-00001:
File does not exist. Holder DFSClient_NONMAPREDUCE_-643344145_1 does not
have any open files.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2755)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.analyzeFileState(FSNamesystem.java:2567)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2480)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:387)
...

I apologize for the wall of error messages, but I tried to leave in at
least some of the parts that might be useful. I put the entire YARN log here:
http://tny.cz/af229738
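
For what it's worth, my reading of the NullPointerException trace above is
that getJobState gets no data back for _masterJobState (the log line just
before it says the job state path is empty), so the JSON parser ends up
building a StringReader from null. That is only my guess. The sketch below
does not use Giraph's actual code: the class EmptyJobStateSketch and the
helper describe are made up for illustration, and it only shows that plain
java.io.StringReader fails this way when handed a null string:

```java
import java.io.StringReader;

public class EmptyJobStateSketch {
    // Hypothetical helper (not part of Giraph): stands in for handing
    // possibly-null job state data to a reader before parsing it.
    static String describe(String jobStateData) {
        try {
            // StringReader dereferences its argument in the constructor,
            // so a null string fails right here.
            new StringReader(jobStateData);
            return "parsed";
        } catch (NullPointerException e) {
            return "NPE";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null)); // prints "NPE"
        System.out.println(describe("{}")); // prints "parsed"
    }
}
```

If that guess is right, the exception in the worker would be a side effect
of the job state data being gone by the time the watcher fires, but I can't
tell from the logs alone.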

Has anyone ever seen this before? This is the command I'm using to run:

hadoop jar \
  giraph-core/target/giraph-1.1.0-SNAPSHOT-for-hadoop-2.2.0-jar-with-dependencies.jar \
  org.apache.giraph.GiraphRunner \
  -Dgiraph.SplitMasterWorker=false \
  -Dgiraph.zkList="localhost:2181" \
  -Dgiraph.zkSessionMsecTimeout=600000 \
  -Dgiraph.useInputSplitLocality=false \
  org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip /user/spry/input \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op /user/spry/Shortest \
  -w 1

My setup is the same as in my earlier email, if you saw it:

I compiled Giraph with this command, and everything built successfully
except "Apache Giraph Distribution", which I don't seem to need:

mvn -Phadoop_yarn -Dhadoop.version=2.2.0 -DskipTests clean package

I am running with the following components:

Single node cluster
Giraph 1.1
Hadoop 2.2.0 (Hortonworks)
Java 1.7.0_45

Thanks in advance,
-Kristen Hardwick
