Hi,

I'm quite interested in how Spark's fault tolerance works, and I'd
like to ask a question here.

According to the paper, there are two kinds of dependencies: wide and
narrow. My understanding is that if the operations I use are all
"narrow", then when one machine crashes, the system just needs to
recover the lost RDD partitions from the most recent checkpoint.
However, if all transformations are "wide" (e.g., when computing
PageRank), then when one node crashes, all other nodes need to roll
back to the most recent checkpoint. Is my understanding correct?
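For concreteness, here is a minimal Scala sketch of the kind of job I
have in mind (the word-count pipeline and the checkpoint directory are
just hypothetical examples, not from the paper):

  import org.apache.spark.{SparkConf, SparkContext}

  object DependencyDemo {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(
        new SparkConf().setAppName("deps").setMaster("local[*]"))
      sc.setCheckpointDir("/tmp/spark-checkpoints") // example path

      val lines = sc.parallelize(Seq("a b", "b c", "a c"))

      // Narrow dependencies: each output partition depends on a
      // single parent partition, so a lost partition can be rebuilt
      // from just that parent.
      val pairs = lines.flatMap(_.split(" ")).map(w => (w, 1))

      // Wide dependency: reduceByKey shuffles, so each output
      // partition depends on all parent partitions; losing one may
      // force recomputing many parents.
      val counts = pairs.reduceByKey(_ + _)

      // Checkpointing writes the RDD to stable storage and truncates
      // its lineage, bounding recovery cost for long chains of wide
      // dependencies.
      counts.checkpoint()
      counts.count() // action that materializes (and checkpoints) it

      sc.stop()
    }
  }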

Thanks!

Best Regards,
Fan
