I figured out the issue from the container log file: container_1426433168188_0001_01_000001/gam-stdout.log. The container was trying to use too much virtual memory (I am on an EC2 micro instance, so there is not much to work with), which caused it to exit with "exitCode: 143". Apparently YARN limits a container's virtual memory based on its physical memory, but you can disable this check by adding the following to yarn-site.xml:
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for containers.</description>
</property>

source: http://stackoverflow.com/questions/14110428/am-container-is-running-beyond-virtual-memory-limits

Everything seems to be working for me now.

On Fri, Mar 13, 2015 at 10:24 PM, Steven Harenberg <sdhar...@ncsu.edu> wrote:

> Thanks Phil, I appreciate the help. Your posts over the past couple days
> have already been quite helpful.
>
> There were a few things I was going to play with as well, perhaps it is
> some configuration issue as you mentioned earlier. I had some issues with
> EC2 today and I will look at it again tomorrow.
>
> Thanks for letting me know about your talk, it sounds interesting. I will
> try and go as long as I can get there in time.
>
> --Steve
>
> On Fri, Mar 13, 2015 at 3:37 PM, Phillip Rhodes <motley.crue....@gmail.com> wrote:
>
>> Steve:
>>
>> I'm not 100% sure what to tell you, and I don't have access to my
>> cluster right this minute. But later this evening I can log in and
>> see if I can find anything that might be useful to you.
>>
>> Also, as an FYI, I'll be doing a presentation on Giraph at the
>> Triangle Java User's Group meeting this coming Monday... if you're in
>> the area (I see you have an @ncsu.edu address), and you can come by, I
>> might be able to help you then. Part of my presentation will be
>> walking through how to setup a Giraph / YARN cluster, based on my
>> experiences over the past few days...
>>
>>
>> Phil
>>
>> This message optimized for indexing by NSA PRISM
>>
>>
>> On Fri, Mar 13, 2015 at 3:30 PM, Steven Harenberg <sdhar...@ncsu.edu>
>> wrote:
>> > Hey Phil,
>> >
>> > I have been having the exact same problems as you (I am also setting up
>> > Giraph on EC2), but this solution did not work for me.
>> >
>> > Do you recall what error you saw in resourcemanager logs? I am also looking
>> > at these logs, but nothing is standing out to me. In fact, it almost seems
>> > like the application should have successfully finished. The log stops
>> > updating and I see a lot of "COMPLETED", "RESULT=SUCCESS", "FINISHED" at the
>> > end of the log. Though, it does look like one of the containers is not
>> > transitioning to these states.
>> >
>> > Thanks,
>> > Steve
>> >
>> >
>> > On Wed, Mar 11, 2015 at 11:54 PM, Phillip Rhodes <motley.crue....@gmail.com>
>> > wrote:
>> >>
>> >> OK, this was easy enough to fix, once I understood what
>> >> was actually happening. Since I'm running on EC2 nodes on
>> >> AWS, it is not the case that any give node can talk to any other
>> >> node on any port (at least not by default). I had tried to
>> >> cherry-pick which ports to whitelist in the security group,
>> >> but I missed one or more that YARN needed for internal
>> >> communication. I discovered this when examining the
>> >> resourcemanager logs.
>> >>
>> >>
>> >> For now, instead of trying to enumerate exactly which ports
>> >> to allow, I added a rule to allow "all traffic" for address 10.0.0.0/24
>> >> and that solved this.
>> >>
>> >>
>> >> Cheers,
>> >>
>> >>
>> >> Phil
>> >>
>> >>
>> >> On Wed, Mar 11, 2015 at 1:39 PM, Phillip Rhodes
>> >> <motley.crue....@gmail.com> wrote:
>> >> > Interesting... It totally did not work for me when built using the
>> >> > hadoop_2 profile, but with the hadoop_yarn profile everything at least
>> >> > starts up. I'm pretty baffled right now... my cluster is essentially
>> >> > working, and I can run, for example, the WordCount example just fine.
>> >> > And the Giraph job starts and shows no apparent errors, but I get no
>> >> > output and it seems to run forever.
>> >> >
>> >> > It's probably some really small detail of my Hadoop configuration, or
>> >> > some environmental issue. The problem is, I don't even know where to
>> >> > start looking right now. :-(
>> >> >
>> >> >
>> >> > Phil
>> >> > This message optimized for indexing by NSA PRISM
>> >> >
>> >> >
>> >> > On Wed, Mar 11, 2015 at 3:16 AM, Martin Junghanns
>> >> > <martin.jungha...@gmx.net> wrote:
>> >> >> Hi Phillip,
>> >> >>
>> >> >> I am using Hadoop 2.5.2 with Giraph 1.1.0 and it runs fine with
>> >> >> -Phadoop2 (from scratch) and -Phadoop_yarn (after removing
>> >> >> STATIC_SASL_SYMBOL from munge.symbols in pom.xml).
>> >> >>
>> >> >> Maybe you can also try the stable Giraph
>> >> >> version and report your problem as an issue?
>> >> >>
>> >> >> Cheers,
>> >> >> Martin
>> >> >>
>> >> >> On 11.03.2015 04:03, Phillip Rhodes wrote:
>> >> >>> Giraph crew:
>> >> >>>
>> >> >>> I'm trying to run the SimpleShortestPathsComputation example using
>> >> >>> the latest Giraph code and Hadoop 2.5.2. My command line looks
>> >> >>> like this:
>> >> >>>
>> >> >>> hadoop jar
>> >> >>> /home/prhodes/giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.5.2-jar-with-dependencies.jar
>> >> >>> org.apache.giraph.GiraphRunner
>> >> >>> org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>> >> >>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>> >> >>> -vip /user/prhodes/input/tiny_graph.txt -vof
>> >> >>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>> >> >>> /user/prhodes/giraph_output/shortestpaths -w 4
>> >> >>>
>> >> >>>
>> >> >>> and the job appears to start OK. But then it starts outputing
>> >> >>> these kinds of messages, and this just continues (seemingly)
>> >> >>> forever until you ctrl+c it.
>> >> >>>
>> >> >>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 305.43 secs
>> >> >>> 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 309.44 secs
>> >> >>> 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 313.45 secs
>> >> >>> 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 317.45 secs
>> >> >>> 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>> ^C15/03/11 02:54:47 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 321.46 secs
>> >> >>> 15/03/11 02:54:47 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_000001, State: ACCEPTED, Containers used: 1
>> >> >>>
>> >> >>> Any idea what is going on here?
>> >> >>>
>> >> >>>
>> >> >>> Thanks,
>> >> >>>
>> >> >>>
>> >> >>> Phil ---
>> >> >>>
>> >> >>>
>> >> >>> This message optimized for indexing by NSA PRISM
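
A follow-up note on the virtual memory limit mentioned at the top of this thread: as far as I understand it, the ceiling is roughly the container's physical memory allocation multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1 in stock Hadoop 2.x. The sketch below is plain Python arithmetic, not a YARN API. The function names and the 1024 MB / 2400 MB figures are made up for illustration and are not taken from the logs above. It only shows why a small container can trip the check, and that raising the ratio might be a narrower alternative to turning the check off entirely.

```python
# Rough model of the NodeManager's virtual memory check.
# This is an approximation for illustration, not YARN code;
# all function names here are hypothetical.

DEFAULT_VMEM_PMEM_RATIO = 2.1  # default of yarn.nodemanager.vmem-pmem-ratio


def vmem_limit_mb(container_pmem_mb, ratio=DEFAULT_VMEM_PMEM_RATIO):
    """Approximate virtual memory ceiling for a container, in MB."""
    return container_pmem_mb * ratio


def exceeds_vmem_limit(vmem_used_mb, container_pmem_mb,
                       ratio=DEFAULT_VMEM_PMEM_RATIO):
    """True if the container's virtual memory use is over the ceiling."""
    return vmem_used_mb > vmem_limit_mb(container_pmem_mb, ratio)


if __name__ == "__main__":
    container_mb = 1024  # assumed container size, illustration only
    vmem_used_mb = 2400  # assumed JVM virtual memory footprint

    print("limit (MB):", vmem_limit_mb(container_mb))
    print("over limit at default ratio:",
          exceeds_vmem_limit(vmem_used_mb, container_mb))
    print("over limit at ratio 4.0:",
          exceeds_vmem_limit(vmem_used_mb, container_mb, ratio=4.0))
```

If raising the ratio turns out to be enough on your cluster, yarn.nodemanager.vmem-pmem-ratio would go in yarn-site.xml in the same way as the vmem-check-enabled property, and the check itself could stay on.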