Stage vs. StageInfo

Mark Hamstra Tue, 23 Jul 2013 16:24:44 -0700

I'm currently working in Spark's DAGScheduler and the related UI code, and
I'm wondering why StageInfos exist as something distinct from Stages.  We
go through some bookkeeping to make sure that we can get from a Stage to
its StageInfo, which in turn is just a pairing of the Stage with a
collection of (TaskInfo, TaskMetrics) pairs.  Why not avoid the bookkeeping
and put that collection of (TaskInfo, TaskMetrics) pairs right in the
Stage itself?  That is, change the Stage class directly to hold the
collection, instead of augmenting stages indirectly through the
(potentially error-prone) mechanics of maintaining an association between
each Stage and a separate StageInfo.
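
To make the comparison concrete, here is a rough sketch of the two shapes.
The classes are simplified, hypothetical stand-ins rather than Spark's
actual definitions: the fields on TaskInfo and TaskMetrics, the
SeparateBookkeeping object, and AugmentedStage are all invented for
illustration only.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-ins; not Spark's real class definitions.
case class TaskInfo(taskId: Long)
case class TaskMetrics(runTimeMs: Long)

// Shape 1: StageInfo pairs a Stage with its (TaskInfo, TaskMetrics) pairs,
// and a side table is needed to get from a Stage to its StageInfo.
class Stage(val id: Int)

class StageInfo(val stage: Stage) {
  val taskInfos = mutable.ArrayBuffer.empty[(TaskInfo, TaskMetrics)]
}

object SeparateBookkeeping {
  private val stageToInfo = mutable.HashMap.empty[Stage, StageInfo]

  def register(stage: Stage): Unit = {
    stageToInfo(stage) = new StageInfo(stage)
  }

  // Throws NoSuchElementException if the stage was never registered,
  // which is the kind of mistake the extra association invites.
  def record(stage: Stage, info: TaskInfo, metrics: TaskMetrics): Unit = {
    stageToInfo(stage).taskInfos += ((info, metrics))
  }

  def tasksFor(stage: Stage): Seq[(TaskInfo, TaskMetrics)] =
    stageToInfo(stage).taskInfos.toSeq
}

// Shape 2: the collection lives directly on the stage, so there is no
// association to maintain.
class AugmentedStage(val id: Int) {
  val taskInfos = mutable.ArrayBuffer.empty[(TaskInfo, TaskMetrics)]
}
```

In the second shape, recording a finished task is just an append to
stage.taskInfos, with no lookup that can miss.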


Or am I missing something?
