Hi all,

I noticed that the Spark UI shows input/output read sizes for some jobs. I am implementing a custom RDD that reads files, and I would like to report these metrics to Spark since they are available to me.
I looked through the RDD source code and a couple of different implementations, and the best I could find was some Hadoop metrics. Is there a way to simply report the number of bytes a partition has read so that Spark can display it in the UI?

Thanks,

Pedro Rodriguez
PhD Student in Large-Scale Machine Learning | CU Boulder
Systems Oriented Data Scientist
UC Berkeley AMPLab Alumni
pedrorodriguez.io | 909-353-4423
github.com/EntilZha | LinkedIn