We collect a number of metrics from Spark sources. We recently switched
some of our tables to Iceberg and noticed that several of these metrics
are no longer populated.

The following metrics always report 0:
bytesRead: 0
recordsRead: 0
bytesWritten: 0
recordsWritten: 0
diskBytesSpilled: 0
memoryBytesSpilled: 0

(The executorRunTime, executorCpuTime, resultSize, and jvmGCTime metrics do
seem to have data.)

I see that Spark's base RDD increments the accumulators in the task metrics:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala#L277
I am trying to find the appropriate place in IcebergSource where these
metrics can be updated.
I am happy to create a GitHub issue to track this work item if needed.
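
To make the question concrete, here is a rough model of the pattern I have
in mind, written as plain Python with no Spark dependency. The names
InputMetrics and counting_reader are hypothetical and only stand in for the
per-task metrics object and the record iterator; they are not Spark or
Iceberg APIs. As far as I can tell, Spark takes the byte count from the
Hadoop file system statistics rather than from the record length, so the
len(record) below is a simplification.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class InputMetrics:
    """Toy stand-in for per-task input metrics (hypothetical, not Spark's class)."""
    bytes_read: int = 0
    records_read: int = 0


def counting_reader(records: Iterable[bytes], metrics: InputMetrics) -> Iterator[bytes]:
    """Yield each record unchanged while updating the counters as it is read."""
    for record in records:
        metrics.records_read += 1
        # Simplification: count the record's own length as the bytes read.
        metrics.bytes_read += len(record)
        yield record


# Drain the wrapped reader the way a task would consume its input.
metrics = InputMetrics()
rows = list(counting_reader([b"a,1", b"bb,22", b"ccc,333"], metrics))
print(metrics)
```

If the Iceberg reader wrapped its row iterator in a similar way, I would
expect bytesRead and recordsRead to stop reporting 0, but I have not
confirmed where in IcebergSource such a wrapper would belong.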

--
Thanks
