Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20082#discussion_r159033482

    --- Diff: core/src/main/scala/org/apache/spark/TaskContext.scala ---
    @@ -150,6 +150,13 @@ abstract class TaskContext extends Serializable {
        */
       def stageId(): Int

    +  /**
    +   * An ID that is unique to the stage attempt that this task belongs to. It represents how many
    +   * times the stage has been attempted. The first stage attempt is assigned stageAttemptId = 0,
    +   * and each subsequent attempt increments it by one.
    +   */
    +  def stageAttemptId(): Int

    --- End diff --

My concern is that internally we use `stageAttemptId`, and internally we refer to `TaskContext.taskAttemptId` as `taskId`. End users, however, don't know the internal code; they are more familiar with `TaskContext`. I think the naming should be consistent with the public API `TaskContext`, not with the internal code.
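For context on how the proposed API would surface to end users, here is a minimal sketch (not from the PR) of reading the stage attempt ID from inside a running task via `TaskContext.get()`; it assumes a Spark application is already running so that a `TaskContext` exists on the executors:

```scala
import org.apache.spark.{SparkConf, SparkContext, TaskContext}

object StageAttemptIdExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("stage-attempt-demo").setMaster("local[2]"))
    try {
      // Inside a task, TaskContext.get() returns the context for the current task.
      // stageAttemptId() is 0 on the first attempt of a stage and increases by one
      // each time the stage is re-attempted (e.g. after a fetch failure).
      val ids = sc.parallelize(1 to 4, numSlices = 2).map { x =>
        val tc = TaskContext.get()
        (x, tc.stageId(), tc.stageAttemptId())
      }.collect()
      ids.foreach { case (x, stage, attempt) =>
        println(s"element=$x stageId=$stage stageAttemptId=$attempt")
      }
    } finally {
      sc.stop()
    }
  }
}
```

In a healthy run with no stage retries, every task would report `stageAttemptId = 0`; a non-zero value indicates the stage was resubmitted.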