Hmm, I am running this through Cloudera Manager. The numbers of Application Masters, masters, and workers all appear to match the config stats.
I get the following response in master mapper while running on a small graph:

MASTER_ONLY checkWorkers: Only found 2 responses of 3 needed to start superstep -1

When I go to the mapper running master, I get the following log:

INFO org.apache.giraph.master.BspServiceMaster: logMissingWorkersOnSuperstep: No response from partition 3 (could be master)

Any idea what configuration issue it might be?

Thanks
Sundi

On Sun, Mar 2, 2014 at 4:56 PM, Eli Reisman <apache.mail...@gmail.com> wrote:

> This looks like YARN cluster is misconfigured. Alternately, you need to
> configure it to allow a few more worker tasks. Giraph on YARN at minimum
> needs one Application Master, one Master, and one Worker (so 3 YARN
> containers) I have a feeling this could be the issue.
>
>
> On Sat, Mar 1, 2014 at 9:18 PM, Jyotirmoy Sundi <sundi...@gmail.com> wrote:
>
>> Hi Folks,
>>
>> The job was working properly in MR1 without any issue. I am trying to
>> run a simple CC sample Giraph job on YARN. I have attached the stacktrace
>> and a few errors. Any pointers will be really helpful for the below errors.
>>
>> *1. BspServiceMaster (YARN profile) is FAILING this task, throwing exception
>> to end job run.*
>>
>> *2. java.lang.IllegalStateException: Not enough healthy workers to create
>> input splits*
>>
>> *StackTrace:*
>>
>> 2014-03-02 04:53:24,646 INFO org.apache.giraph.master.BspServiceMaster: logMissingWorkersOnSuperstep: No response from partition 2 (could be master)
>> 2014-03-02 04:53:24,646 ERROR org.apache.giraph.master.BspServiceMaster: checkWorkers: Did not receive enough processes in time (only 1 of 2 required) after waiting 600000msecs). This occurs if you do not have enough map tasks available simultaneously on your Hadoop instance to fulfill the number of requested workers.
>> 2014-03-02 04:53:24,649 INFO org.apache.giraph.master.BspServiceMaster: setJobState: {"_stateKey":"FAILED","_applicationAttemptKey":-1,"_superstepKey":-1} on superstep -1
>> 2014-03-02 04:53:24,653 FATAL org.apache.giraph.master.BspServiceMaster: failJob: Killing job job_201402281650_0019
>> 2014-03-02 04:53:24,654 FATAL org.apache.giraph.master.BspServiceMaster: failJob: exception java.lang.IllegalStateException: Not enough healthy workers to create input splits
>> 2014-03-02 04:53:24,654 ERROR org.apache.giraph.master.MasterThread: masterThread: Master algorithm failed with RuntimeException
>> java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
>>     at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
>>     at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
>>     at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
>>     at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
>>     at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
>> Caused by: java.lang.IllegalStateException: Not enough healthy workers to create input splits
>>     ... 4 more
>> 2014-03-02 04:53:24,656 FATAL org.apache.giraph.graph.GraphMapper: uncaughtException: OverrideExceptionHandler on thread org.apache.giraph.master.MasterThread, msg = java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run., exiting...
>> java.lang.IllegalStateException: java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
>>     at org.apache.giraph.master.MasterThread.run(MasterThread.java:181)
>> Caused by: java.lang.RuntimeException: BspServiceMaster (YARN profile) is FAILING this task, throwing exception to end job run.
>>     at org.apache.giraph.master.BspServiceMaster.failJob(BspServiceMaster.java:349)
>>     at org.apache.giraph.master.BspServiceMaster.setJobStateFailed(BspServiceMaster.java:297)
>>     at org.apache.giraph.master.BspServiceMaster.createInputSplits(BspServiceMaster.java:616)
>>     at org.apache.giraph.master.BspServiceMaster.createVertexInputSplits(BspServiceMaster.java:692)
>>     at org.apache.giraph.master.MasterThread.run(MasterThread.java:100)
>> Caused by: java.lang.IllegalStateException: Not enough healthy workers to create input splits
>>     ... 4 more
>>
>> ------------------------------
>>
>

--
Best Regards,
Jyotirmoy Sundi