Are you guys sure this is a bug? In the task scheduler, we keep two identifiers for each task: the "index", which uniquely identifies the computation+partition, and the "taskId", which is unique across all tasks for that Spark context (see https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L439). If multiple attempts of one task are run, they will have the same index but different taskIds. Historically, we have used "taskId" and "taskAttemptId" interchangeably (a convention that arose from Mesos, which uses similar naming).
This was complicated when Mr. Xin added the "attempt" field to TaskInfo, which we show in the UI. This field uniquely identifies attempts for a particular task, but is not unique across different task indexes (it always starts at 0 for a given task). I'm guessing the right fix is to rename Task.taskAttemptId to Task.taskId to resolve this inconsistency -- does that sound right to you Reynold?

-Kay

On Mon, Oct 20, 2014 at 1:29 PM, Patrick Wendell <pwend...@gmail.com> wrote:
> There is a deeper issue here, which is that AFAIK we don't even store a
> notion of attempt inside of Spark; we just use a new taskId with the
> same index.
>
> On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai <huaiyin....@gmail.com> wrote:
> > Yeah, seems we need to pass the attempt id to executors through
> > TaskDescription. I have created
> > https://issues.apache.org/jira/browse/SPARK-4014.
> >
> > On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin <r...@databricks.com> wrote:
> >
> >> I also ran into this earlier. It is a bug. Do you want to file a jira?
> >>
> >> I think part of the problem is that we don't actually have the attempt id
> >> on the executors. If we do, that's great. If not, we'd need to propagate
> >> that over.
> >>
> >> On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai <huaiyin....@gmail.com> wrote:
> >>
> >>> Hello,
> >>>
> >>> Is there any way to get the attempt number in a closure? Seems
> >>> TaskContext.attemptId actually returns the taskId of a task (see
> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L181
> >>> and
> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Task.scala#L47).
> >>> It looks like a bug.
> >>>
> >>> Thanks,
> >>>
> >>> Yin
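To make the three identifiers discussed in this thread concrete, here is a small illustrative model (not Spark's actual code; `TinyScheduler` and all names in it are hypothetical) of the scheme Kay describes: "index" is fixed per computation+partition and shared by all attempts, "taskId" is unique across every task launched for the context, and "attempt" is a per-index counter that always starts at 0.

```python
import itertools
from dataclasses import dataclass

@dataclass
class TaskInfo:
    index: int     # identifies computation+partition; shared by all attempts
    task_id: int   # unique across all tasks for this context
    attempt: int   # per-index attempt counter; always starts at 0

class TinyScheduler:
    """Hypothetical sketch of the identifier scheme described above."""

    def __init__(self):
        self._next_task_id = itertools.count()  # context-wide counter
        self._attempts = {}                     # index -> next attempt number

    def launch(self, index):
        attempt = self._attempts.get(index, 0)
        self._attempts[index] = attempt + 1
        return TaskInfo(index, next(self._next_task_id), attempt)

sched = TinyScheduler()
first = sched.launch(index=5)  # first attempt of partition 5
retry = sched.launch(index=5)  # retry/speculative attempt of the same task
other = sched.launch(index=7)  # a different task; its attempt restarts at 0

# Same index, distinct taskIds, and attempt is NOT unique across indexes:
assert (first.index, first.attempt) == (5, 0)
assert (retry.index, retry.attempt) == (5, 1)
assert other.attempt == 0
assert len({first.task_id, retry.task_id, other.task_id}) == 3
```

This also shows why returning the context-wide taskId from TaskContext.attemptId is confusing: a caller expecting the per-index attempt number (0, 1, ...) would instead see a value that keeps growing across unrelated tasks.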