Hi all again,

After compiling version 1.1, I ran into the following bug:

https://issues.apache.org/jira/browse/GIRAPH-859

I applied the patch and disabled permissions in HDFS (I would
rather not do that, but I can accept it).

However, the example still fails when I run it as:


hadoop jar giraph-ex.jar org.apache.giraph.GiraphRunner \
  org.apache.giraph.examples.SimpleShortestPathsComputation \
  -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
  -vip tiny_graph.txt \
  -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
  -op shortestpaths -yj giraph-ex.jar -w 1


The program runs for about 10 minutes (the example graph has 5 nodes)
before failing.

The gam-stderr.log file only contains SLF4J messages, and
gam-stdout.log ends with:

Container exited with a non-zero exit code 143

2015-11-05 14:25:39,340 INFO  [AMRM Callback Handler Thread] 
yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:onContainersCompleted(605)) - After completion of 
one conatiner. current status is: completedCount :1 containersToLaunch :2 
successfulCount :0 failedCount :1
2015-11-05 14:26:13,414 INFO  [AMRM Callback Handler Thread] 
yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:onContainersCompleted(580)) - Got response from 
RM for container ask, completedCnt=1
2015-11-05 14:26:13,414 INFO  [AMRM Callback Handler Thread] 
yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:onContainersCompleted(583)) - Got container 
status for containerID=container_1446634690791_0024_01_000003, state=COMPLETE, 
exitStatus=2, diagnostics=Exception from container-launch: 
org.apache.hadoop.util.Shell$ExitCodeException: 
org.apache.hadoop.util.Shell$ExitCodeException: 
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
        at org.apache.hadoop.util.Shell.run(Shell.java:418)
        at 
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
        at 
org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:300)
        at 
org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:81)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 2

2015-11-05 14:26:13,415 INFO  [AMRM Callback Handler Thread] 
yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:onContainersCompleted(603)) - All container 
compeleted. done = true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:run(195)) - Done true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:run(207)) - Forcefully terminating executors with 
done =:true
2015-11-05 14:26:13,543 INFO  [main] yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:finish(221)) - Application completed. Stopping 
running containers
2015-11-05 14:26:13,578 INFO  [main] impl.ContainerManagementProtocolProxy 
(ContainerManagementProtocolProxy.java:mayBeCloseProxy(145)) - Closing proxy : 
computer62:59272
2015-11-05 14:26:13,579 INFO  [main] impl.ContainerManagementProtocolProxy 
(ContainerManagementProtocolProxy.java:mayBeCloseProxy(145)) - Closing proxy : 
computer66:45051
2015-11-05 14:26:13,579 INFO  [main] yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:finish(226)) - Application completed. Signalling 
finish to RM
2015-11-05 14:26:13,586 INFO  [main] impl.AMRMClientImpl 
(AMRMClientImpl.java:unregisterApplicationMaster(321)) - Waiting for 
application to be successfully unregistered.
2015-11-05 14:26:13,688 INFO  [main] yarn.GiraphApplicationMaster 
(GiraphApplicationMaster.java:main(454)) - Giraph Application Master failed. 
exiting
2015-11-05 14:26:13,688 INFO  [AMRM Callback Handler Thread] 
impl.AMRMClientAsyncImpl (AMRMClientAsyncImpl.java:run(277)) - Interrupted 
while waiting for queue
java.lang.InterruptedException
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
        at 
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at 
org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:275)


Moreover, even though the exception occurs about 1 minute after the program
starts, the job takes more than 10 minutes to finish.

Do you have any ideas?

Thanks.




-- 
Dr. Roberto Gonzalez 
Research Scientist, Networked Systems and Data Analytics Group
NEC Europe Ltd.
NEC Laboratories Europe
Kurfürsten-Anlage 36
 
D-69115 Heidelberg
 
phone +49 6221 4342 256
fax +49 6221 4342 155
e-mail: roberto.gonza...@neclab.eu
 
NEC Europe Ltd | Registered Office: Athene, Odyssey Business Park, West End Road,
London, HA4 6QE, GB | Registered in England 2832014
