Hi All, I tried to run a simple spark program to find out the metrics collected while executing the program. What I observed is, I'm able to get TaskMetrics.inputMetrics data like records read, bytesread etc. But I do not get any metrics about the output.
I ran the below code in local mode as well as on a YARN Cluster, yet the result is the same. This is the code I used to test. val datadf = sqlcontext.read.json("schema.txt").repartition(10) datadf.distinct().write.mode(SaveMode.Append).save("resources\\data-" + Calendar.getInstance.getTimeInMillis) I have attached the listener like this sc.addSparkListener(new SparkListener() { override def onTaskEnd(taskEnd: SparkListenerTaskEnd) { val metrics = taskEnd.taskMetrics if (metrics.inputMetrics != None) { println("Metrics for the task Input recs:" + metrics.inputMetrics.get.recordsRead) inputRecords += metrics.inputMetrics.get.recordsRead } if (metrics.outputMetrics != None) { println("Metrics for the task Output recs:" + metrics.inputMetrics.get.recordsRead) outputWritten += metrics.outputMetrics.get.recordsWritten } } }) I get valid data for the input metrics. But none for output metrics. Am I missing anything here. Any pointers, much appreciated.