Re: [SOLVED] Re: Giraph job never ends

2015-03-15 Thread Steven Harenberg
Figured out the issue via the container log file:
container_1426433168188_0001_01_01/gam-stdout.log. The container was trying
to use too much virtual memory (I am using a micro instance on EC2, so there
is not much to work with), which caused an exitCode: 143. Apparently, there is
a limit on virtual memory based on physical memory, but you can
disable this check by adding the following to yarn-site.xml:

<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
  <description>Whether virtual memory limits will be enforced for
  containers.</description>
</property>

source:
http://stackoverflow.com/questions/14110428/am-container-is-running-beyond-virtual-memory-limits

Everything seems to be working for me now.
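
For anyone hitting the same symptom, here is a minimal sketch of how one might scan a directory of container logs for the virtual memory kill message. The function name find_vmem_kills and the example log path are assumptions for illustration, and the exact wording and location of the message may vary between Hadoop versions.

```shell
# Hypothetical helper (not part of Hadoop or Giraph): list the log files under
# a directory that contain the message YARN reports when a container is killed
# for exceeding its virtual memory limit.
find_vmem_kills() {
  log_dir="$1"
  # -r: recurse into per-container subdirectories
  # -l: print only the names of matching files
  grep -rl "running beyond virtual memory limits" "$log_dir" 2>/dev/null
}

# Example usage (the path is an assumption; adjust to your NodeManager log dir):
#   find_vmem_kills "$HADOOP_HOME/logs/userlogs"
```

If this prints nothing, the hang probably has a different cause, such as the blocked ports described elsewhere in this thread.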


Re: [SOLVED] Re: Giraph job never ends

2015-03-13 Thread Steven Harenberg
Hey Phil,

I have been having the exact same problems as you (I am also setting up
Giraph on EC2), but this solution did not work for me.

Do you recall what error you saw in the resourcemanager logs? I am also looking
at these logs, but nothing stands out to me. In fact, it almost seems
like the application should have finished successfully. The log stops
updating and I see a lot of COMPLETED, RESULT=SUCCESS, FINISHED at
the end of the log. However, it does look like one of the containers is not
transitioning to these states.

Thanks,
Steve



Re: [SOLVED] Re: Giraph job never ends

2015-03-13 Thread Steven Harenberg
Thanks Phil, I appreciate the help. Your posts over the past couple days
have already been quite helpful.

There were a few things I was going to play with as well; perhaps it is
some configuration issue, as you mentioned earlier. I had some issues with
EC2 today, so I will look at it again tomorrow.

Thanks for letting me know about your talk; it sounds interesting. I will
try to go, as long as I can get there in time.

--Steve

On Fri, Mar 13, 2015 at 3:37 PM, Phillip Rhodes motley.crue@gmail.com
wrote:

 Steve:

 I'm not 100% sure what to tell you, and I don't have access to my
 cluster right this minute. But later this evening I can log in and
 see if I can find anything that might be useful to you.

 Also, as an FYI, I'll be doing a presentation on Giraph at the
 Triangle Java User's Group meeting this coming Monday... if you're in
 the area (I see you have an @ncsu.edu address), and you can come by, I
 might be able to help you then. Part of my presentation will be
 walking through how to set up a Giraph / YARN cluster, based on my
 experiences over the past few days...


 Phil

 This message optimized for indexing by NSA PRISM



[SOLVED] Re: Giraph job never ends

2015-03-11 Thread Phillip Rhodes
OK, this was easy enough to fix once I understood what
was actually happening. Since I'm running on EC2 nodes on
AWS, it is not the case that any given node can talk to any other
node on any port (at least not by default). I had tried to
cherry-pick which ports to whitelist in the security group,
but I missed one or more that YARN needed for internal
communication. I discovered this when examining the
resourcemanager logs.


For now, instead of trying to enumerate exactly which ports
to allow, I added a rule to allow all traffic from the address range
10.0.0.0/24, and that solved the problem.
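
A rule like this could be sketched with the AWS CLI roughly as follows. The helper only prints the command so it can be reviewed before running. The function name allow_all_from_cidr and the security group ID are placeholders, not values from this thread, and the command assumes a VPC security group, where IpProtocol=-1 means all protocols and ports.

```shell
# Hypothetical helper: print (do not run) an AWS CLI command that adds an
# ingress rule allowing all traffic from a CIDR block to a security group.
allow_all_from_cidr() {
  group_id="$1"   # security group to modify (placeholder in the call below)
  cidr="$2"       # source address range, e.g. the cluster's private subnet
  printf "aws ec2 authorize-security-group-ingress --group-id %s --ip-permissions 'IpProtocol=-1,IpRanges=[{CidrIp=%s}]'\n" \
    "$group_id" "$cidr"
}

# Print the command for the subnet mentioned above, using a made-up group ID.
allow_all_from_cidr sg-0123456789abcdef0 10.0.0.0/24
```

Opening all traffic within the subnet is the quick route taken here; enumerating only the ports YARN and Giraph need would be tighter but takes more work to get right.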


Cheers,


Phil


On Wed, Mar 11, 2015 at 1:39 PM, Phillip Rhodes
motley.crue@gmail.com wrote:
 Interesting... It totally did not work for me when built using the
 hadoop_2 profile, but with the hadoop_yarn profile everything at least
 starts up.  I'm pretty baffled right now... my cluster is essentially
 working, and I can run, for example, the WordCount example just fine.
 And the Giraph job starts and shows no apparent errors, but I get no
 output and it seems to run forever.

 It's probably some really small detail of my Hadoop configuration, or
 some environmental issue.  The problem is, I don't even know where to
 start looking right now.  :-(


 Phil
 This message optimized for indexing by NSA PRISM


 On Wed, Mar 11, 2015 at 3:16 AM, Martin Junghanns
 martin.jungha...@gmx.net wrote:
 Hi Phillip,

 I am using Hadoop 2.5.2 with Giraph 1.1.0 and it runs fine with
 -Phadoop2 (from scratch) and -Phadoop_yarn (after removing
 STATIC_SASL_SYMBOL from munge.symbols in pom.xml).

 Maybe you can also try the stable Giraph
 version and report your problem as an issue?

 Cheers,
 Martin

 On 11.03.2015 04:03, Phillip Rhodes wrote:
 Giraph crew:

 I'm trying to run the SimpleShortestPathsComputation example using
 the latest Giraph code and Hadoop 2.5.2.  My command line looks
 like this:

 hadoop jar /home/prhodes/giraph/giraph-examples/target/giraph-examples-1.2.0-SNAPSHOT-for-hadoop-2.5.2-jar-with-dependencies.jar \
   org.apache.giraph.GiraphRunner \
   org.apache.giraph.examples.SimpleShortestPathsComputation \
   -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat \
   -vip /user/prhodes/input/tiny_graph.txt \
   -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat \
   -op /user/prhodes/giraph_output/shortestpaths \
   -w 4


 and the job appears to start OK. But then it starts outputting
 these kinds of messages, and this just continues (seemingly)
 forever until you ctrl+c it.

 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 305.43 secs
 15/03/11 02:54:31 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_01, State: ACCEPTED, Containers used: 1
 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 309.44 secs
 15/03/11 02:54:35 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_01, State: ACCEPTED, Containers used: 1
 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 313.45 secs
 15/03/11 02:54:39 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_01, State: ACCEPTED, Containers used: 1
 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 317.45 secs
 15/03/11 02:54:43 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_01, State: ACCEPTED, Containers used: 1
 ^C15/03/11 02:54:47 INFO yarn.GiraphYarnClient: Giraph: org.apache.giraph.examples.SimpleShortestPathsComputation, Elapsed: 321.46 secs
 15/03/11 02:54:47 INFO yarn.GiraphYarnClient: appattempt_1426041786848_0002_01, State: ACCEPTED, Containers used: 1

 Any idea what is going on here?


 Thanks,


 Phil ---


 This message optimized for indexing by NSA PRISM