Re: Giraph Job "Task attempt_* failed to report status" Problem

2012-08-23 Thread Amani Alonazi
Yes, I ran the minimum spanning tree and it failed again. I also increased the ZooKeeper counter, and it failed again. The log files state that an "org.apache.zookeeper.KeeperException$ConnectionLossException" occurred before the job was killed. If it's a memory problem, can I increase the memory limit per e
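For anyone hitting the same thing: the per-worker memory is just the heap of the Hadoop map task that hosts the Giraph worker. A minimal sketch, assuming classic Hadoop 1.x (MRv1) property names; the class name and the 4 GB value are illustrative, not a recommendation:

    import org.apache.hadoop.conf.Configuration;

    /**
     * Sketch: raise the heap available to each Giraph worker, which runs
     * inside a Hadoop map task. Property names assume MRv1; YARN-based
     * Hadoop uses different keys.
     */
    public class WorkerMemoryConfig {
      public static Configuration withLargerWorkerHeap() {
        Configuration conf = new Configuration();
        // Per-task JVM options; 4096m is an illustrative value.
        conf.set("mapred.child.java.opts", "-Xmx4096m");
        return conf;
      }
    }

If the job is launched through a Tool/ToolRunner-style runner, the same property can usually be passed on the command line as -Dmapred.child.java.opts=-Xmx4096m.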

Re: Giraph Job "Task attempt_* failed to report status" Problem

2012-08-23 Thread Vishal Patel
As I said, failures at specific supersteps *might* happen, but it's not necessarily the case. Did you run the minimum spanning tree job again? Did it finish successfully? On a different note, what do you mean by "submitted a job of 90 supersteps"? I don't think you can specify the number of supersteps -- that
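For context on the superstep count: a common pattern (used by Giraph's bundled PageRank example) is to bound the run inside the computation itself, by having every vertex vote to halt once getSuperstep() reaches a fixed limit. A minimal sketch against the Giraph 1.x BasicComputation API (the 2012-era API differs); MAX_SUPERSTEPS is an illustrative constant standing in for the 90 supersteps mentioned in this thread:

    import java.io.IOException;
    import org.apache.giraph.graph.BasicComputation;
    import org.apache.giraph.graph.Vertex;
    import org.apache.hadoop.io.DoubleWritable;
    import org.apache.hadoop.io.LongWritable;

    /**
     * Sketch of a PageRank-style computation that stops after a fixed number
     * of supersteps by having every vertex vote to halt.
     */
    public class BoundedPageRank extends
        BasicComputation<LongWritable, DoubleWritable, DoubleWritable, DoubleWritable> {

      private static final int MAX_SUPERSTEPS = 90;  // illustrative limit

      @Override
      public void compute(Vertex<LongWritable, DoubleWritable, DoubleWritable> vertex,
          Iterable<DoubleWritable> messages) throws IOException {
        if (getSuperstep() >= 1) {
          // Combine incoming rank contributions.
          double sum = 0;
          for (DoubleWritable message : messages) {
            sum += message.get();
          }
          vertex.setValue(new DoubleWritable(
              0.15 / getTotalNumVertices() + 0.85 * sum));
        }
        if (getSuperstep() < MAX_SUPERSTEPS) {
          // Spread this vertex's rank across its out-edges for the next superstep.
          sendMessageToAllEdges(vertex,
              new DoubleWritable(vertex.getValue().get() / vertex.getNumEdges()));
        } else {
          // Once every vertex has halted and no messages are in flight, the job ends.
          vertex.voteToHalt();
        }
      }
    }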

Re: Giraph Job "Task attempt_* failed to report status" Problem

2012-08-23 Thread Amani Alonazi
Thank you, Vishal. But I submitted a PageRank job with 90 supersteps, 20 workers, 4,000,000 vertices, and 30 edges per vertex. The job completed successfully. I'm really confused. On Wed, Aug 22, 2012 at 7:33 PM, Vishal Patel wrote: > After several supersteps, sometimes a worker thread dies (say it

Re: Giraph Job "Task attempt_* failed to report status" Problem

2012-08-22 Thread Vishal Patel
After several supersteps, sometimes a worker thread dies (say it ran out of memory). ZooKeeper waits for ~10 minutes (600 seconds), then decides that the worker is not responsive and fails the entire job. At this point, if you have a checkpoint saved, it will resume from there; otherwise you have to st
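On the checkpoint part, a minimal sketch of turning on periodic checkpointing so a killed job can restart from the last checkpoint instead of superstep 0. The property names are the ones used by recent Giraph releases and may differ slightly in older versions; the values are illustrative:

    import org.apache.hadoop.conf.Configuration;

    /**
     * Sketch: enable periodic Giraph checkpoints.
     */
    public class CheckpointConfig {
      public static Configuration withCheckpoints() {
        Configuration conf = new Configuration();
        // Write a checkpoint every 5 supersteps (0 disables checkpointing).
        conf.setInt("giraph.checkpointFrequency", 5);
        // HDFS directory where checkpoints are stored.
        conf.set("giraph.checkpointDirectory", "_bsp/_checkpoints");
        return conf;
      }
    }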

Giraph Job "Task attempt_* failed to report status" Problem

2012-08-21 Thread Amani Alonazi
Hi all, I'm running a minimum spanning tree compute function on a Hadoop cluster (20 machines). After a certain superstep (e.g. superstep 47 for a graph of 4,194,304 vertices and 181,566,970 edges), the execution time increases dramatically. This is not the only problem; the job has been killed with "Task
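For reference, the "failed to report status" kill quoted here comes from Hadoop's task timeout, which defaults to 600,000 ms (600 seconds). A minimal sketch of raising it for jobs with very long supersteps, assuming classic MRv1 property names; the class name and the 30-minute value are illustrative:

    import org.apache.hadoop.conf.Configuration;

    /**
     * Sketch: give long-running supersteps more time before the framework
     * kills the task for not reporting status. MRv1 key shown; newer Hadoop
     * uses mapreduce.task.timeout.
     */
    public class TaskTimeoutConfig {
      public static Configuration withLongerTaskTimeout() {
        Configuration conf = new Configuration();
        conf.setInt("mapred.task.timeout", 1800 * 1000);  // 30 minutes, in milliseconds
        return conf;
      }
    }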