You are right, the docs situation is shameful, and I am as much to blame as anyone! There are some serious documentation efforts in the works, so stay tuned. For now, note that running the benchmarks uses a different CLI syntax than running applications or examples.
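A rough sketch of the two launch styles, written as a dry run that only prints the commands rather than executing them. The benchmark line reuses the jar path, ZooKeeper address, and options from the PageRankBenchmark command quoted later in this thread. The GiraphRunner line is an assumption: it only asks for usage via `-h`, on the assumption that the runner accepts that flag, because its real options differ between snapshots.

```shell
#!/bin/sh
# Dry run: builds and prints the two command lines instead of executing them.
# Jar path, ZooKeeper address, and benchmark options are copied from the
# PageRankBenchmark command quoted later in this thread; adjust for your build.
GIRAPH_JAR="./giraph-core/target/giraph-0.2-SNAPSHOT-for-hadoop-2.0.0-cdh4.1.3-jar-with-dependencies.jar"
ZK_LIST="10.1.94.104:2181"

# Benchmark style: the benchmark driver class itself is the main class.
BENCHMARK_CMD="hadoop jar $GIRAPH_JAR org.apache.giraph.benchmark.PageRankBenchmark -Dgiraph.zkList=$ZK_LIST -e 1 -s 3 -v -V 50000 -w 83"

# App/example style: GiraphRunner is the main class.
# ASSUMPTION: -h prints usage; verify against the class in your snapshot.
RUNNER_CMD="hadoop jar $GIRAPH_JAR org.apache.giraph.GiraphRunner -h"

echo "$BENCHMARK_CMD"
echo "$RUNNER_CMD"
```

Once the jar path matches your build, run the printed lines directly. Asking each driver for its usage text first is the safest way to discover its options.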
Check out GiraphRunner for apps/examples, and check out the benchmark driver classes themselves to see how they are launched: directly from "hadoop jar", without o.a.g.GiraphRunner as the main class. They also have their own CLI option conventions. Most have help messages embedded as well. So that probably hasn't helped ;)

On Wed, Feb 27, 2013 at 6:47 AM, David Boyd <db...@data-tactics-corp.com> wrote:

> Eli:
> Thanks. I was coming to that conclusion. The documentation is out
> of sync and very sketchy.
> I was basing this off of the README file in the 0.2 snapshot (line 173).
> It should not make a difference whether the cluster being submitted to
> is running locally or remotely.
>
> What is really missing is some basic documentation on exactly how to
> run examples other than the PageRankBenchmark. Depending on priorities,
> hopefully I will get some time to mess around and figure out one or two.
>
> If anyone on the list has a small example program or test they are
> willing to share, I would be most appreciative, as that would help my
> target users significantly.
>
> On 2/26/2013 9:01 PM, Eli Reisman wrote:
>
> Just to throw this out there: it has been noted (by me most recently) on
> the Giraph JIRA site that the tests aren't happy when you try to run them
> against a running cluster; they like to instantiate their own local
> resources (ZK, Hadoop single-node) for the tests. If your example jobs run
> with "hadoop jar" on the cluster, then that's what matters; you're all set.
>
> On Mon, Feb 25, 2013 at 4:03 PM, David Boyd
> <db...@data-tactics-corp.com> wrote:
>
>> Sandy:
>> Yes. Attached is the segment from the job tracker log file that
>> shows the error and stack traces.
>>
>> The maven surefire report for the test shows an assertion failure on the
>> following line from the test:
>>
>> assertTrue(job.run(true));
>>
>> Below is the surefire report stack trace:
>>
>> -------------------------------------------------------------------------------
>> Test set: org.apache.giraph.io.TestJsonBase64Format
>> -------------------------------------------------------------------------------
>> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 32.363 sec <<< FAILURE!
>> testContinue(org.apache.giraph.io.TestJsonBase64Format) Time elapsed: 32.352 sec <<< FAILURE!
>> java.lang.AssertionError:
>>     at org.junit.Assert.fail(Assert.java:91)
>>     at org.junit.Assert.assertTrue(Assert.java:43)
>>     at org.junit.Assert.assertTrue(Assert.java:54)
>>     at org.apache.giraph.io.TestJsonBase64Format.testContinue(TestJsonBase64Format.java:74)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>     at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:59)
>>     at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.executeTestSet(AbstractDirectoryTestSuite.java:120)
>>     at org.apache.maven.surefire.suite.AbstractDirectoryTestSuite.execute(AbstractDirectoryTestSuite.java:103)
>>     at org.apache.maven.surefire.Surefire.run(Surefire.java:169)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.maven.surefire.booter.SurefireBooter.runSuitesInProcess(SurefireBooter.java:350)
>>     at org.apache.maven.surefire.booter.SurefireBooter.main(SurefireBooter.java:1021)
>>
>> On 2/25/2013 6:55 PM, Sandy Ryza wrote:
>>
>> Great to hear it helped. Are you able to provide the full stack trace
>> for that exception?
>>
>> thanks,
>> Sandy
>>
>> On Mon, Feb 25, 2013 at 3:51 PM, David Boyd
>> <db...@data-tactics-corp.com> wrote:
>>
>>> Sandy:
>>> Thanks, that helps a great deal. I am now at least getting to the
>>> point that the jobs show up in the job tracker. However, they all
>>> fail on initialization with the good old:
>>>
>>> java.io.FileNotFoundException: File
>>> /tmp/hadoop-mapred/mapred/staging/hdfs/.staging/job_201302211213_0055/job.jar
>>> does not exist
>>>
>>> This tells me that maven is either not specifying that the giraph-core
>>> jar file should be used as the job jar, or I am missing
>>> something else in the setup.
>>>
>>> Attached is the job.xml file from one of the failed jobs, and below is
>>> the relevant profile out of my pom.xml.
>>> I did upgrade to CDH4.1.3 just to see if that would help.
>>> Also, I have been running all sorts of jobs (benchmarks and other
>>> tests) against this cluster for some time, so I know that the cluster
>>> works well.
>>>
>>> Again, any help is appreciated.
>>>
>>> Relevant section of pom.xml:
>>>
>>> <profile>
>>>   <id>hadoop_cdh4.1.3mr1</id>
>>>   <properties>
>>>     <hadoopmr1.version>2.0.0-mr1-cdh4.1.3</hadoopmr1.version>
>>>     <hadoop.version>2.0.0-cdh4.1.3</hadoop.version>
>>>     <munge.symbols>HADOOP_1_SECURITY,HADOOP_1_SECRET_MANAGER</munge.symbols>
>>>   </properties>
>>>   <dependencies>
>>>     <!-- sorted lexicographically -->
>>>     <dependency>
>>>       <groupId>commons-net</groupId>
>>>       <artifactId>commons-net</artifactId>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-client</artifactId>
>>>       <version>${hadoopmr1.version}</version>
>>>       <scope>provided</scope>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-common</artifactId>
>>>       <version>${hadoop.version}</version>
>>>       <scope>provided</scope>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-hdfs</artifactId>
>>>       <version>${hadoop.version}</version>
>>>       <scope>provided</scope>
>>>     </dependency>
>>>     <dependency>
>>>       <groupId>org.apache.hadoop</groupId>
>>>       <artifactId>hadoop-test</artifactId>
>>>       <version>${hadoopmr1.version}</version>
>>>       <scope>provided</scope>
>>>     </dependency>
>>>   </dependencies>
>>> </profile>
>>>
>>> On 2/25/2013 12:47 PM, Sandy Ryza wrote:
>>>
>>> Hi David,
>>>
>>> Moving this to cdh-user, as it is CDH-specific.
>>>
>>> CDH4 comes with two versions of mapreduce, MR1 and MR2. It sounds
>>> like you are building against MR2
>>> (http://blog.cloudera.com/blog/2012/10/mr2-and-yarn-briefly-explained/).
>>> Do you know whether your cluster runs MR2/YARN or MR1? If it runs MR2,
>>> you can set mapreduce.framework.name to "yarn". If it runs MR1, you
>>> can build against the MR1 jar by setting the version of your hadoop-client
>>> to 2.0.0-mr1-cdh4.1.1
>>> (https://ccp.cloudera.com/display/CDH4DOC/Managing+Hadoop+API+Dependencies+in+CDH4).
>>> Does that help?
>>>
>>> -Sandy
>>>
>>> On Mon, Feb 25, 2013 at 8:26 AM, David Boyd
>>> <db...@data-tactics-corp.com> wrote:
>>>
>>>> All:
>>>> I am trying to get the Giraph 0.2 snapshot (pulled via GIT on Friday)
>>>> to build and run with CDH4.
>>>>
>>>> I modified the pom.xml to provide a profile for my specific version
>>>> (4.1.1).
>>>> The build works (mvn -Phadoop_cdh4.1.1 clean package test) and passes
>>>> all the tests.
>>>>
>>>> If I try to do the next step and submit to my cluster with the command:
>>>> mvn -Phadoop_cdh4.1.1 test
>>>> -Dprop.mapred.job.tracker=10.1.94.53:8021 -Dgiraph.zkList=10.1.94.104:2181
>>>>
>>>> the JSON test in core fails. If I move that test out of the way, a
>>>> whole bunch of tests in examples fail. They all fail with:
>>>>
>>>>> java.io.IOException: Cannot initialize Cluster. Please check your
>>>>> configuration for mapreduce.framework.name and the correspond server
>>>>> addresses.
>>>>
>>>> I have tried passing mapreduce.framework.name as both local and
>>>> classic. I have also set those values in my mapreduce-site.xml.
>>>>
>>>> Interestingly, I can run the pagerank benchmark in code with the command:
>>>>
>>>>> hadoop jar
>>>>> ./giraph-core/target/giraph-0.2-SNAPSHOT-for-hadoop-2.0.0-cdh4.1.3-jar-with-dependencies.jar
>>>>> org.apache.giraph.benchmark.PageRankBenchmark
>>>>> -Dmapred.child.java-opts="-Xmx64g -Xms64g XX:+UseConcMarkSweepGC
>>>>> -XX:-UseGCOverheadLimit" -Dgiraph.zkList=10.1.94.104:2181 -e 1 -s 3 -v
>>>>> -V 50000 -w 83
>>>>
>>>> And it completes just fine.
>>>>
>>>> I have searched high and low for documents and examples on how to run
>>>> the example programs from something other than maven but have not
>>>> found anything.
>>>>
>>>> Any help or suggestions would be greatly appreciated.
>>>>
>>>> Thanks.
>>>>
>>>> --
>>>> ========= mailto:db...@data-tactics.com ============
>>>> David W. Boyd
>>>> Director, Engineering, Research and Development
>>>> Data Tactics Corporation
>>>> 7901 Jones Branch, Suite 240
>>>> Mclean, VA 22102
>>>> office: +1-703-506-3735, ext 308
>>>> fax: +1-703-506-6703
>>>> cell: +1-703-402-7908
>>>> ============== http://www.data-tactics.com/ ============
>>>>
>>>> The information contained in this message may be privileged
>>>> and/or confidential and protected from disclosure.
>>>> If the reader of this message is not the intended recipient
>>>> or an employee or agent responsible for delivering this message
>>>> to the intended recipient, you are hereby notified that any
>>>> dissemination, distribution or copying of this communication
>>>> is strictly prohibited. If you have received this communication
>>>> in error, please notify the sender immediately by replying to
>>>> this message and deleting the material from any computer.