Internally, the framework checkpoints the messages transferred among BSP tasks during the BSP synchronization period. If users want to checkpoint anything else, they should use the HDFS APIs directly.

On Mon, Feb 29, 2016 at 11:15 PM, Behroz Sikander <[email protected]> wrote:
> Ok. So, Hama does support FT but it is not thoroughly tested.
>
> Btw, how can a user checkpoint, or does Hama do that internally? Is there any
> method exposed using BSPPeer?
>
> Regards,
> Behroz
>
> On Mon, Feb 29, 2016 at 2:03 PM, Edward J. Yoon <[email protected]>
> wrote:
>
>> If I remember correctly, the framework first changes the job status to
>> "recovering", and then simply restarts all the tasks from the
>> last checkpoint. It works well, but I only tested simple jobs (no
>> input/output) on my cluster (see also HAMA-973).
>>
>> To write a fully FT application from the user side, every state in a BSP
>> program needs to be written to disk. So, some people discussed and
>> introduced a new Superstep API that provides a more abstract interface,
>> like Pregel.
>>
>>
>> On Mon, Feb 29, 2016 at 8:09 PM, Behroz Sikander <[email protected]>
>> wrote:
>> > Hi,
>> > Just a quick question: is Hama fault tolerant? What happens if a Hama
>> > task fails?
>> >
>> > Regards,
>> > Behroz
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>>

--
Best Regards, Edward J. Yoon
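
For the user-side checkpointing mentioned at the top of this message, here is a minimal sketch of the usual pattern: write the state to a temporary file, then rename it so a restarted task never reads a half-written checkpoint. It is not Hama code. `CheckpointSketch`, `State`, `save`, `load`, and the file names are hypothetical, and it uses `java.nio.file` on the local filesystem as a stand-in so it runs without a cluster. On HDFS the same steps would presumably go through Hadoop's `FileSystem` API (`create` and `rename`) instead.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Hama: persists extra user state per superstep.
final class CheckpointSketch {

    // Example of state a task might keep beyond the messages Hama checkpoints itself.
    static final class State {
        final long superstep;
        final double[] values;

        State(long superstep, double[] values) {
            this.superstep = superstep;
            this.values = values;
        }
    }

    // Write the state to a temporary file, then rename it over the last checkpoint.
    static void save(Path dir, long superstep, double[] values) throws IOException {
        Path tmp = dir.resolve("state.tmp");
        Path done = dir.resolve("state.ckpt");
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(tmp))) {
            out.writeLong(superstep);
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        }
        // Only a completely written file ever appears under the final name.
        Files.move(tmp, done, StandardCopyOption.REPLACE_EXISTING);
    }

    // Read the last checkpoint, or return null if none has been written yet.
    static State load(Path dir) throws IOException {
        Path done = dir.resolve("state.ckpt");
        if (!Files.exists(done)) {
            return null;
        }
        try (DataInputStream in = new DataInputStream(Files.newInputStream(done))) {
            long superstep = in.readLong();
            double[] values = new double[in.readInt()];
            for (int i = 0; i < values.length; i++) {
                values[i] = in.readDouble();
            }
            return new State(superstep, values);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("hama-ckpt-sketch");
        save(dir, 3L, new double[] {0.5, 1.5});
        State s = load(dir);
        System.out.println("restored superstep " + s.superstep
                + " with " + s.values.length + " values");
    }
}
```

A task would call `save` at the end of each superstep and `load` once at startup, resuming from the stored superstep when a checkpoint exists.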
