Github user steveloughran commented on the issue: https://github.com/apache/spark/pull/18979 @viirya : the new data writer API will allow for a broader set of stats to be propagated back from workers. When you are working with the object stores, an useful stat to get back is throttle count & retry count as they can be the cause of why things are slow ... and if it is due to throttling, throwing more workers at the job will actually slow things down. They'd be the ones to look at first
--- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org