Hi folks, I am implementing an algorithm that needs to run thousands of supersteps.
After some supersteps have run, a worker goes into a failed state. I have enabled the checkpoint feature, but when the worker fails, the computation starts again from superstep 0, so the program takes a long time. Is there any way to restart the job from the last checkpoint? Is there a command-line option for this? Any ideas are appreciated. Thanks, Jyoti