Jackey Lee created SPARK-37831:
----------------------------------

             Summary: Add task partition id in metrics
                 Key: SPARK-37831
                 URL: https://issues.apache.org/jira/browse/SPARK-37831
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 3.2.1, 3.3.0
            Reporter: Jackey Lee
There is no partition id in the current task metrics, which makes it difficult to trace stage metrics, such as stage shuffle read, especially when there are stage retries. It also makes it impossible to compare task metrics between different applications.
{code:java}
class TaskData private[spark](
    val taskId: Long,
    val index: Int,
    val attempt: Int,
    val launchTime: Date,
    val resultFetchStart: Option[Date],
    @JsonDeserialize(contentAs = classOf[JLong])
    val duration: Option[Long],
    val executorId: String,
    val host: String,
    val status: String,
    val taskLocality: String,
    val speculative: Boolean,
    val accumulatorUpdates: Seq[AccumulableInfo],
    val errorMessage: Option[String] = None,
    val taskMetrics: Option[TaskMetrics] = None,
    val executorLogs: Map[String, String],
    val schedulerDelay: Long,
    val gettingResultTime: Long)
{code}
Adding a partitionId to TaskData would not only make it easy to trace task metrics, it would also make it possible to collect metrics for the actual stage outputs, especially when a stage is retried.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
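To illustrate the idea, here is a minimal, hypothetical sketch (not the committed API): a trimmed-down TaskData carrying a partitionId field, plus a helper showing why the field helps with stage retries. The names TaskDataSketch and latestAttemptPerPartition are assumptions for illustration only; aggregating by partitionId lets retried attempts for the same partition collapse into one entry, which task index alone cannot guarantee across retries.

{code:java}
// Hypothetical sketch of TaskData extended with a partitionId field.
// Field names and positions are assumptions, not the actual Spark API.
case class TaskDataSketch(
    taskId: Long,
    index: Int,
    attempt: Int,
    partitionId: Int, // proposed addition: stable identity across stage retries
    status: String)

object MetricsByPartition {
  // Group tasks by partitionId and keep the latest attempt per partition,
  // so metrics reflect the tasks that actually produced the stage output.
  def latestAttemptPerPartition(
      tasks: Seq[TaskDataSketch]): Map[Int, TaskDataSketch] =
    tasks.groupBy(_.partitionId).map { case (pid, ts) =>
      pid -> ts.maxBy(_.attempt)
    }
}
{code}

For example, if partition 0 fails once and is retried, both attempts map to the same partitionId, and only the successful attempt is counted toward the stage's metrics.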