Hi, I'm using the Databricks spark-avro library to save some DataFrames out as Avro (with Spark 1.6.1). When I do this, however, I lose the information in the Spark events about the number of records and the size of data written to HDFS for each partition, which is available if I save an RDD out as a text file.
Is this just a limitation of data frames, or is there a way of making this information available? It's really useful for performance monitoring. Thanks, Tim.