I've made some progress on this issue, and after watching Eli's great presentation on Giraph and YARN I'm hoping he or someone else might have some insight. I'm now receiving:

Error: Could not find or load main class org.apache.giraph.yarn.GiraphApplicationMaster

I'm guessing this has something to do with the -yj option and ensuring the jar is distributed to all the workers. Am I looking in the right direction, or does anyone have advice on solving this issue?

Thanks.

On 13-10-14 04:55 PM, Matthew Laird wrote:
After much googling I finally pieced together how to run in pure YARN
mode. First, for any future googlers, here are the two command lines I
used to build and run Giraph head:

mvn -Dhadoop.version=2.0.3-alpha -Phadoop_yarn -DskipTests clean install

$HADOOP_HOME/bin/hadoop jar
$GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar
org.apache.giraph.GiraphRunner -Dgiraph.zkList=zookeeper1:7000
org.apache.giraph.examples.SimpleShortestPathsComputation -vif
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
-vip /in/tiny_graph.txt -vof
org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
/user/hduser/output/shortestpaths -w 4

(where the value of -Dgiraph.zkList is a comma-separated list of
ZooKeeper quorum members, given as host:port)
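
For a quorum with more than one member, the comma-separated value can be assembled as in the sketch below. The hosts zookeeper2 and zookeeper3 are hypothetical (only zookeeper1:7000 appears in the command above), and the sketch assumes every member listens on the same port.

```shell
# Hypothetical quorum members; only zookeeper1 on port 7000 comes from this thread.
ZK_HOSTS="zookeeper1 zookeeper2 zookeeper3"
ZK_PORT=7000

ZK_LIST=""
for host in $ZK_HOSTS; do
  # Append host:port, inserting a comma before every entry except the first.
  ZK_LIST="${ZK_LIST:+$ZK_LIST,}$host:$ZK_PORT"
done

# Print the option so it can be pasted into the GiraphRunner command line.
echo "-Dgiraph.zkList=$ZK_LIST"
```

The printed option replaces -Dgiraph.zkList=zookeeper1:7000 in the run command; nothing is launched by the snippet itself.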

Now on to my query: I'm looking for suggestions on where to find
debugging information. On running the job I received:

13/10/14 16:49:24 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
13/10/14 16:49:24 INFO utils.ConfigurationUtils: No edge input format
specified. Ensure your InputFormat does not require one.
13/10/14 16:49:24 INFO utils.ConfigurationUtils: No edge output format
specified. Ensure your OutputFormat does not require one.
13/10/14 16:49:24 INFO yarn.GiraphYarnClient: Final output path is:
hdfs://x007:9000/user/hduser/output/shortestpaths
13/10/14 16:49:24 INFO service.AbstractService:
Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/10/14 16:49:24 INFO service.AbstractService:
Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/10/14 16:49:24 INFO yarn.GiraphYarnClient: Defaulting per-task heap
size to 1024MB.
13/10/14 16:49:24 INFO yarn.GiraphYarnClient: Obtained new Application
ID: application_1381087048629_0004
13/10/14 16:49:24 WARN conf.Configuration: mapred.job.id is deprecated.
Instead, use mapreduce.job.id
13/10/14 16:49:24 WARN conf.Configuration: mapred.output.dir is
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/10/14 16:49:25 INFO yarn.YarnUtils: Registered file in
LocalResources: giraph-conf.xml
13/10/14 16:49:25 INFO yarn.GiraphYarnClient:
ApplicationSumbissionContext for GiraphApplicationMaster launch
container is populated.
13/10/14 16:49:25 INFO client.YarnClientImpl: Submitted application
application_1381087048629_0004 to ResourceManager at
x007/192.168.10.117:8040
13/10/14 16:49:25 INFO yarn.GiraphYarnClient: GiraphApplicationMaster
container request was submitted to ResourceManager for job: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation
13/10/14 16:49:26 INFO yarn.GiraphYarnClient: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 0.84
secs
13/10/14 16:49:26 INFO yarn.GiraphYarnClient:
appattempt_1381087048629_0004_000001, State: ACCEPTED, Containers used: 1
13/10/14 16:49:27 ERROR yarn.GiraphYarnClient: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation reports FAILED
state, diagnostics show: Application application_1381087048629_0004
failed 1 times due to AM Container for
appattempt_1381087048629_0004_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.
13/10/14 16:49:27 INFO yarn.GiraphYarnClient: Cleaning up HDFS
distributed cache directory for Giraph job.
13/10/14 16:49:27 INFO yarn.GiraphYarnClient: Completed Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation: FAILED, total
running time: 0 minutes, 1 seconds.
lairdm@x007:/opt/giraph$ $HADOOP_HOME/bin/hadoop jar
$GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-2.0.3-alpha-jar-with-dependencies.jar
org.apache.giraph.GiraphRunner -Dgiraph.zkList=zookeeper1:7000
org.apache.giraph.examples.SimpleShortestPathsComputation -vif
org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
-vip /in/tiny_graph.txt -vof
org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
/user/hduser/output/shortestpaths -w 4
13/10/14 16:52:15 WARN util.NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
13/10/14 16:52:16 INFO utils.ConfigurationUtils: No edge input format
specified. Ensure your InputFormat does not require one.
13/10/14 16:52:16 INFO utils.ConfigurationUtils: No edge output format
specified. Ensure your OutputFormat does not require one.
13/10/14 16:52:16 INFO yarn.GiraphYarnClient: Final output path is:
hdfs://x007:9000/user/hduser/output/shortestpaths
13/10/14 16:52:16 INFO service.AbstractService:
Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/10/14 16:52:16 INFO service.AbstractService:
Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/10/14 16:52:16 INFO yarn.GiraphYarnClient: Defaulting per-task heap
size to 1024MB.
13/10/14 16:52:16 INFO yarn.GiraphYarnClient: Obtained new Application
ID: application_1381087048629_0005
13/10/14 16:52:16 WARN conf.Configuration: mapred.job.id is deprecated.
Instead, use mapreduce.job.id
13/10/14 16:52:16 WARN conf.Configuration: mapred.output.dir is
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/10/14 16:52:16 INFO yarn.YarnUtils: Registered file in
LocalResources: giraph-conf.xml
13/10/14 16:52:17 INFO yarn.GiraphYarnClient:
ApplicationSumbissionContext for GiraphApplicationMaster launch
container is populated.
13/10/14 16:52:17 INFO client.YarnClientImpl: Submitted application
application_1381087048629_0005 to ResourceManager at
x007/192.168.10.117:8040
13/10/14 16:52:17 INFO yarn.GiraphYarnClient: GiraphApplicationMaster
container request was submitted to ResourceManager for job: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation
13/10/14 16:52:18 INFO yarn.GiraphYarnClient: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 0.84
secs
13/10/14 16:52:18 INFO yarn.GiraphYarnClient:
appattempt_1381087048629_0005_000001, State: ACCEPTED, Containers used: 1
13/10/14 16:52:19 ERROR yarn.GiraphYarnClient: Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation reports FAILED
state, diagnostics show: Application application_1381087048629_0005
failed 1 times due to AM Container for
appattempt_1381087048629_0005_000001 exited with exitCode: 1 due to:
.Failing this attempt.. Failing the application.
13/10/14 16:52:19 INFO yarn.GiraphYarnClient: Cleaning up HDFS
distributed cache directory for Giraph job.
13/10/14 16:52:19 INFO yarn.GiraphYarnClient: Completed Giraph:
org.apache.giraph.examples.SimpleShortestPathsComputation: FAILED, total
running time: 0 minutes, 1 seconds.

Is there a log somewhere of why it failed or a way to get more detailed
output from the AM? Getting closer, I'm doing little jumps for joy.
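
One place to look, sketched under assumptions: if log aggregation is enabled on the cluster and this Hadoop version's yarn command provides the logs subcommand, the AM container's stdout and stderr can be requested by application ID. The snippet only extracts the ID from a line of the client output above and prints the command, since running it needs a live cluster. Without aggregation, the container logs normally stay on the NodeManager host under the directories configured by yarn.nodemanager.log-dirs.

```shell
# A line copied from the client output above (joined onto one line).
LOG_LINE="13/10/14 16:52:16 INFO yarn.GiraphYarnClient: Obtained new Application ID: application_1381087048629_0005"

# Pull out the application ID (application_<cluster timestamp>_<sequence>).
APP_ID=$(printf '%s\n' "$LOG_LINE" | sed -n 's/.*\(application_[0-9]*_[0-9]*\).*/\1/p')

# Assumption: log aggregation is on and "yarn logs" exists in this Hadoop build.
# The command is printed rather than executed.
echo "yarn logs -applicationId $APP_ID"
```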

Thanks, and great work getting this package put together in the first
place!


--
Matthew Laird
Lead Software Developer, Bioinformatics
Brinkman Laboratory
Simon Fraser University, Burnaby, BC, Canada
