Re: How to collect Spark dataframe write metrics

2020-03-04 Thread Manjunath Shetty H
Thanks Zohar, will try that.
- Manjunath
From: Zohar Stiro
Sent: Tuesday, March 3, 2020 1:49 PM
To: Manjunath Shetty H
Cc: user
Subject: Re: How to collect Spark dataframe write metrics
Hi, to get DataFrame level write metrics you can take a look

Re: How to collect Spark dataframe write metrics

2020-03-03 Thread Zohar Stiro
Hi, to get DataFrame-level write metrics, you can take a look at the following trait: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriteStatsTracker.scala and a basic implementation example:
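Separately from that trait, which lives in Spark's internal execution package, a rough sketch using only the public SparkListener API might look like the following. It sums outputMetrics.recordsWritten over successfully finished tasks. The class name RecordsWrittenListener, the local master, and the output path /tmp/write_metrics_sketch are assumptions made for illustration and are not from this thread.

```scala
import java.util.concurrent.atomic.LongAdder

import org.apache.spark.Success
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
import org.apache.spark.sql.SparkSession

// Hypothetical listener: adds up the records written by tasks that ended successfully.
class RecordsWrittenListener extends SparkListener {
  val recordsWritten = new LongAdder

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics can be null when a task fails before reporting any metrics.
    if (taskEnd.reason == Success && taskEnd.taskMetrics != null) {
      recordsWritten.add(taskEnd.taskMetrics.outputMetrics.recordsWritten)
    }
  }
}

object WriteMetricsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-metrics-sketch")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()

    val listener = new RecordsWrittenListener
    spark.sparkContext.addSparkListener(listener)
    try {
      // Assumed sample data and output path.
      spark.range(0, 1000).toDF("id")
        .write.mode("overwrite").parquet("/tmp/write_metrics_sketch")

      // Listener events are delivered asynchronously, so this value can lag
      // behind the write that has just returned.
      println(s"records written (seen by listener so far): ${listener.recordsWritten.sum()}")
    } finally {
      spark.sparkContext.removeSparkListener(listener)
      spark.stop()
    }
  }
}
```

Because the counter covers every job that runs while the listener is registered, and may include speculative task attempts, it is better treated as an indicative figure than as an exact per-DataFrame count.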

How to collect Spark dataframe write metrics

2020-03-01 Thread Manjunath Shetty H
Hi all, Basically my use case is to validate the DataFrame row count before and after writing to HDFS. Is this even a good practice? Or should I rely on Spark for guaranteed writes? If it is a good practice to follow, then how do I get the DataFrame-level write metrics? Any pointers would
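One simple way to do the check described above, without relying on any write metrics, is to count the rows before the write and then count them again by reading the output back. The following is only a minimal sketch under stated assumptions: the helper name writeAndCount, the Parquet format, the local master, and the path /tmp/write_count_check are all hypothetical and would be replaced by the real format and HDFS location.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object WriteCountCheck {
  // Hypothetical helper: writes df as Parquet to path, reads it back,
  // and returns (rows before the write, rows found after the write).
  def writeAndCount(spark: SparkSession, df: DataFrame, path: String): (Long, Long) = {
    // Persist so the count and the write are more likely to see the same data
    // instead of recomputing the DataFrame twice.
    df.persist()
    try {
      val before = df.count()
      df.write.mode("overwrite").parquet(path)
      val after = spark.read.parquet(path).count()
      (before, after)
    } finally {
      df.unpersist()
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("write-count-check")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()
    try {
      val df = spark.range(0, 1000).toDF("id") // assumed sample data
      val (before, after) = writeAndCount(spark, df, "/tmp/write_count_check")
      require(before == after, s"row count mismatch: $before before, $after after")
      println(s"row counts match: $before")
    } finally {
      spark.stop()
    }
  }
}
```

The read-back count costs an extra scan of the written files, so whether it is worth running on every write depends on the data volume.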