Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19269

Several things to discuss:

1. Since Spark can't disable speculation at runtime, there is currently not much benefit in providing an interface for a data source to disable speculation: the data source can simply check the Spark conf at the beginning and throw an exception if speculation is enabled. We can add this later via a mix-in trait.
2. The only contract Spark needs is: data written/committed by tasks must not be visible to data source readers until the job-level commit. It may be visible to others, e.g. other writing tasks, so it's possible for data sources to implement "abort the output of the other writer".
3. The `WriteCommitMessage` can include statistics (it's an empty interface), so data sources can aggregate statistics at the driver side.

cc @steveloughran @rdblue
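Point 3 could look roughly like the sketch below: the empty marker interface carries whatever payload a data source chooses, and the driver sums per-task statistics at job-level commit. The names `RowCountCommitMessage`, `rowsWritten`, and `DriverSideAggregation` are hypothetical illustrations, not part of the PR:

```java
import java.util.List;

// Marker interface, deliberately empty: each data source defines its own payload.
interface WriteCommitMessage extends java.io.Serializable {}

// Hypothetical example: a data source attaches a per-task row count to its commit message.
class RowCountCommitMessage implements WriteCommitMessage {
    final long rowsWritten;

    RowCountCommitMessage(long rowsWritten) {
        this.rowsWritten = rowsWritten;
    }
}

class DriverSideAggregation {
    // At job-level commit, the driver receives one message per successful task
    // and can aggregate statistics before the data becomes visible to readers.
    static long totalRows(List<? extends WriteCommitMessage> messages) {
        return messages.stream()
                .filter(m -> m instanceof RowCountCommitMessage)
                .mapToLong(m -> ((RowCountCommitMessage) m).rowsWritten)
                .sum();
    }
}
```

Because the interface itself stays empty, Spark never needs to understand the statistics; only the data source's driver-side commit logic interprets them.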