1) If a BSPJob has 10 supersteps and a task fails at superstep 5, does the job need to be run again? Is HAMA-503 the solution? Is the state of the job stored in HDFS between supersteps?
2) What other fault-tolerance features are implemented in Hama?
3) What is checkpointing in Hama?

Praveen
