Github user tejasapatil commented on the issue:

https://github.com/apache/spark/pull/18975

@gatorsmile : Yes. Hive is not 100% atomic, as things can go wrong between removing the old data and renaming the staging location. But it's superior in these regards:

- Hive would output "no data" OR "complete data". Here we can have "no data" OR "incomplete data" OR "complete data". The "incomplete data" part worries me. A staging dir helps achieve the "you either see nothing OR everything" behaviour.
- The window of "you see nothing" is much bigger here compared to Hive, as the output location is cleaned up before execution.
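
To make the staging-dir point concrete, here is a minimal local-filesystem sketch of the pattern. It is not Hive's or Spark's actual code: `overwrite_via_staging` and `write_fn` are hypothetical names, and it assumes a filesystem where renaming a directory is a single cheap operation.

```python
import os
import shutil
import tempfile


def overwrite_via_staging(output_dir, write_fn):
    """Write new data into a sibling staging dir, then swap it into place.

    write_fn is a hypothetical callback that writes all output files
    into the directory it is given.
    """
    parent = os.path.dirname(os.path.abspath(output_dir))
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        # If the job fails midway, output_dir still holds the old data,
        # so readers never observe "incomplete data".
        write_fn(staging)
    except BaseException:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    # Not fully atomic: between these two calls readers see "no data".
    # The window covers only the delete and rename, not the whole job.
    shutil.rmtree(output_dir, ignore_errors=True)
    os.rename(staging, output_dir)
```

Cleaning up the output location before execution instead would stretch the "no data" window over the whole job and expose partially written files if it fails.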