Hi Eli,

Please ignore my previous response. So are you saying that for a large graph, a job on MR1 runs with one master plus multiple workers, but on YARN it would run with one Application Master, one Master, and multiple Workers?
Thanks
Sundi

On Sun, Mar 2, 2014 at 10:14 PM, Jyotirmoy Sundi <sundi...@gmail.com> wrote:

> Hmmm,
> I am running in cloudera manager. The number of application master, master
> and worker seems as per the config stats.
>
> I get the following response in master mapper while running on a small
> graph
> MASTER_ONLY checkWorkers: Only found 2 responses of 3 needed to start
> superstep -1
>
> When I go to the mapper running master, I get the following log:
>
> INFO org.apache.giraph.master.BspServiceMaster: logMissingWorkersOnSuperstep:
> No response from partition 3 (could be master)
>
> Any idea what configuration issue it might be?
>
> Thanks
>
> Sundi
>
> On Sun, Mar 2, 2014 at 4:56 PM, Eli Reisman <apache.mail...@gmail.com> wrote:
>
>> This looks like YARN cluster is misconfigured. Alternately, you need to
>> configure it to allow a few more worker tasks. Giraph on YARN at minimum
>> needs one Application Master, one Master, and one Worker (so 3 YARN
>> containers) I have a feeling this could be the issue.
>>
>> On Sat, Mar 1, 2014 at 9:18 PM, Jyotirmoy Sundi <sundi...@gmail.com> wrote:
>>
>>> Hi Folks,
>>>
>>> The job was working properly in MR1 without any issue. I am trying to
>>> run a simple CC sample Giraph job on YARN. I have attached the stacktrace
>>> and a few errors. Any pointers will be really helpful for the below errors.
>>>
>>> *1. BspServiceMaster (YARN profile) is FAILING this task, throwing
>>> exception to end job run.*
>>>
>>> *2. java.lang.IllegalStateException: Not enough healthy workers to create
>>> input splits*
>>>
>>> *StackTrace:*
>>>
>>> 2014-03-02 04:53:24,646 INFO org.apache.giraph.master.BspServiceMaster:
>>> logMissingWorkersOnSuperstep: No response from partition 2 (could be master)
>>> 2014-03-02 04:53:24,646 ERROR org.apache.giraph.master.BspServiceMaster:
>>> checkWorkers: Did not receive enough processes in time (only 1 of 2
>>> required) after waiting 600000msecs). This occurs if you do not have
>>> enough map tasks available simultaneously on your Hadoop instance to
>>> fulfill the number of requested workers.
>>> 2014-03-02 04:53:24,649 INFO org.apache.giraph.master.BspServiceMaster:
>>> setJobState:
>>> {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on
>>> superstep -1
>>> 2014-03-02 04:53:24,653 FATAL org.apache.giraph.master.BspServiceMaster:
>>> failJob: Killing job job_201402281650_0019
>>> 2014-03-02 04:53:24,654 FATAL org.apache.giraph.master.BspServiceMaster:
>>> failJob: exception java.lang.IllegalStateException: Not enough healthy
>>> workers to create input splits
>>> 2014-03-02 04:53:24,654 ERROR org.apache.giraph.master.MasterThread:
>>> masterThread: Master algorithm failed with RuntimeException
>>> java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this
>>> task, throwing exception to end job run.
>>> at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
>>> at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
>>> at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
>>> at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
>>> at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
>>> Caused by: java.lang.IllegalStateException: Not enough healthy workers to
>>> create input splits
>>> ... 4 more
>>> 2014-03-02 04:53:24,656 FATAL org.apache.giraph.graph.GraphMapper:
>>> uncaughtException: OverrideExceptionHandler on thread
>>> org.apache.giraph.master.MasterThread, msg = java.lang.RuntimeException:
>>> BspServiceMaster (YARN profile) is FAILING this task, throwing exception to
>>> end job run., exiting...
>>> java.lang.IllegalStateException: java.lang.RuntimeException:
>>> BspServiceMaster (YARN profile) is FAILING this task, throwing exception to
>>> end job run.
>>> at org.apache.giraph.master.MasterThread.run(MasterThread.java:181)
>>> Caused by: java.lang.RuntimeException: BspServiceMaster (YARN profile) is
>>> FAILING this task, throwing exception to end job run.
>>> at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
>>> at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
>>> at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
>>> at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
>>> at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
>>> Caused by: java.lang.IllegalStateException: Not enough healthy workers to
>>> create input splits
>>> ... 4 more
>>>
>>> ------------------------------
>>>
>>
>
> --
> Best Regards,
> Jyotirmoy Sundi

--
Best Regards,
Jyotirmoy Sundi
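
For reference, the container arithmetic discussed in this thread can be sketched roughly as below. This is only an illustration, not Giraph or YARN code: the function names are made up, the "workers + 2" rule generalizes Eli's minimum of one Application Master, one Master, and one Worker, and the capacity estimate assumes memory is the only limit (a real scheduler also considers vcores, queues, and a differently sized Application Master).

```python
def required_yarn_containers(workers: int) -> int:
    """Containers one Giraph job would need on YARN, per the thread:
    1 Application Master + 1 Master + N Workers (assumed rule)."""
    if workers < 1:
        raise ValueError("a Giraph job needs at least one worker")
    return 1 + 1 + workers


def required_mr1_map_slots(workers: int) -> int:
    """Map slots needed on MR1, per the thread: 1 Master + N Workers."""
    if workers < 1:
        raise ValueError("a Giraph job needs at least one worker")
    return 1 + workers


def max_concurrent_containers(nodes: int, node_memory_mb: int,
                              container_mb: int) -> int:
    """Rough upper bound on simultaneous containers, memory only.
    All three inputs are hypothetical cluster settings."""
    if container_mb <= 0:
        raise ValueError("container size must be positive")
    return nodes * (node_memory_mb // container_mb)


def can_start(workers: int, nodes: int, node_memory_mb: int,
              container_mb: int) -> bool:
    """True if the cluster could hold every container at the same time,
    which the master requires before superstep -1 can begin."""
    capacity = max_concurrent_containers(nodes, node_memory_mb, container_mb)
    return required_yarn_containers(workers) <= capacity


if __name__ == "__main__":
    # One node with room for two 2 GB containers cannot host AM + Master + 1 Worker.
    print(can_start(workers=1, nodes=1, node_memory_mb=4096, container_mb=2048))
    # A second identical node gives four slots, which is enough for three.
    print(can_start(workers=1, nodes=2, node_memory_mb=4096, container_mb=2048))
```

If the first check fails on a real cluster, the symptom would look like the "Only found 2 responses of 3 needed" message above: the master waits for a worker that never got a container.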