Hello, community,

I noticed a surprising code region in Spark: DAGScheduler -->
handleTaskCompletion --> case FetchFailed --> failed += mapStage. (I am on
branch-0.8.)

That is to say, if a task fails to fetch its input from a shuffle
dependency, then both that task's stage and its parent map stage will be
re-executed.
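To check my reading, here is a toy Python sketch of that bookkeeping (the class and function names are mine for illustration, not Spark's actual API): on a FetchFailed event, both the stage whose task failed and its parent map stage end up in the `failed` set awaiting resubmission.

```python
# Toy model of the branch-0.8 FetchFailed handling: on a fetch failure,
# both the failed stage and its parent map stage are marked for
# resubmission (mirroring `failed += failedStage` / `failed += mapStage`).

class Stage:
    def __init__(self, stage_id, parent=None):
        self.stage_id = stage_id
        self.parent = parent  # the shuffle map stage this stage reads from

def handle_fetch_failed(failed_stage, map_stage, failed):
    """Add both stages to the set of stages that must be re-run."""
    failed.add(failed_stage)
    failed.add(map_stage)
    return failed

map_stage = Stage(0)                # parent (shuffle map) stage
reduce_stage = Stage(1, map_stage)  # stage whose task hit FetchFailed

failed = set()
handle_fetch_failed(reduce_stage, reduce_stage.parent, failed)
print(sorted(s.stage_id for s in failed))  # [0, 1] -- both stages re-run
```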

If my understanding is correct, then why does Spark need to materialize
the intermediate shuffle data at all? This looks like a "restart"
fault-tolerance mechanism.

Thanks for your help,
dachuan.

-- 
Dachuan Huang
Cellphone: 614-390-7234
2015 Neil Avenue
Ohio State University
Columbus, Ohio
U.S.A.
43210
