Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/22232

I'm not sure we can test this case, but how about the sequence below, for example?

```scala
import org.apache.spark.TaskContext

spark.range(10).selectExpr("id AS c0", "rand() AS c1").write.parquet("/tmp/t1")
val df = spark.read.parquet("/tmp/t1")
val fileScanRdd = df.repartition(1).queryExecution.executedPlan.children(0).children(0).execute()
fileScanRdd.mapPartitions { part =>
  println(s"Initial bytesRead=${TaskContext.get.taskMetrics().inputMetrics.bytesRead}")
  TaskContext.get.addTaskCompletionListener[Unit] { taskCtx =>
    // Check if the metric is correct?
    println(s"Total bytesRead=${TaskContext.get.taskMetrics().inputMetrics.bytesRead}")
  }
  part
}.collect
```