Yes, I ran the minimum spanning tree job and it failed again. I also
increased the ZooKeeper counter, but it still fails. The log files state
that an "org.apache.zookeeper.KeeperException$ConnectionLossException"
occurred before the job was killed. If it's a memory problem, can I
increase the memory limit per e
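(As a sketch of the kind of setting being asked about here: on Hadoop MRv1, Giraph workers run as map tasks, so the per-worker heap is typically raised through the task JVM options. The property name and the 4g value below are assumptions for illustration, not values confirmed in this thread.)

```shell
# Hedged sketch: raise the JVM heap for each map task (i.e. each Giraph
# worker on MRv1). Property names vary between Hadoop versions.
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner \
    -D mapred.child.java.opts=-Xmx4g \
    org.apache.giraph.examples.SimpleShortestPathsComputation
# (other application-specific arguments omitted)
```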
As I said, failures on specific supersteps *might* happen, but it's not
necessarily the case.
Did you run the minimum spanning tree job again? Did it finish
successfully?
On a different note, what do you mean by "submitted a job of 90
supersteps"? I don't think you can specify the number of supersteps-- that
Thank you Vishal.
But I submitted a PageRank job with 90 supersteps, 20 workers, 4,000,000
vertices, and 30 edges per vertex. The job completed successfully. I'm
really confused.
On Wed, Aug 22, 2012 at 7:33 PM, Vishal Patel wrote:
After several supersteps, sometimes a worker thread dies (say it ran out of
memory). ZooKeeper waits for ~10 minutes (600 seconds) and then decides that
the worker is not responsive and fails the entire job. At this point, if you
have a checkpoint saved it will resume from there; otherwise you have to
start over from the beginning.
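A minimal sketch of the two knobs this message describes, assuming GiraphRunner's `-ca` (custom argument) flag is available; the parameter values and the example computation class are illustrative, not taken from this thread:

```shell
# Save a checkpoint every 5 supersteps so a failed job can resume from
# the last checkpoint, and lengthen the ZooKeeper session timeout
# (milliseconds) so a slow worker is not declared dead prematurely.
hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner \
    org.apache.giraph.examples.SimpleShortestPathsComputation \
    -ca giraph.checkpointFrequency=5 \
    -ca giraph.zkSessionMsecTimeout=600000
# (other application-specific arguments omitted)
```

Checkpointing trades some per-superstep I/O overhead for the ability to restart a long job partway through instead of from superstep 0.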
Hi all,
I'm running a minimum spanning tree compute function on a Hadoop cluster (20
machines). After certain supersteps (e.g. superstep 47 for a graph of
4,194,304 vertices and 181,566,970 edges), the execution time increases
dramatically. That is not the only problem: the job has also been killed "Task