[GitHub] spark issue #18975: [SPARK-4131] Support "Writing data into the filesystem f...

tejasapatil Mon, 04 Sep 2017 12:18:07 -0700

Github user tejasapatil commented on the issue:

    https://github.com/apache/spark/pull/18975
  
    @gatorsmile : Yes. Hive is not 100% atomic, as things can go wrong between 
removing the old data and renaming the staging location. But it's superior in 
these regards:
    
    - Hive would output "no data" OR "complete data". Here we can have "no 
data" OR "incomplete data" OR "complete data". The "incomplete data" part 
worries me. A staging dir helps achieve "you either see nothing OR everything" 
behaviour.
    - The window of "you see nothing" is much bigger here than in Hive, 
because the output location is cleaned up before execution.
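
    To illustrate the staging-dir pattern described above, here is a minimal 
sketch in Python against a local filesystem. It is not Hive's or Spark's 
actual code; `write_via_staging` and the `.staging-` prefix are hypothetical 
names chosen for this example, and it assumes a directory rename is cheap on 
the target filesystem.

```python
import os
import shutil
import tempfile


def write_via_staging(output_dir, files):
    """Write `files` (name -> text) into output_dir via a staging directory.

    Hypothetical helper: readers of output_dir see the old data, nothing
    (briefly, during the swap), or the complete new data, but never a
    partially written set of files.
    """
    parent = os.path.dirname(os.path.abspath(output_dir))
    # Stage next to the target so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    try:
        # All the slow work happens here, while the old output is intact.
        for name, text in files.items():
            with open(os.path.join(staging, name), "w") as f:
                f.write(text)
        # Commit step. The gap between these two calls is the only
        # "you see nothing" window, and it is not atomic.
        if os.path.exists(output_dir):
            shutil.rmtree(output_dir)
        os.rename(staging, output_dir)
    except BaseException:
        # A failure while writing leaves the old output untouched.
        shutil.rmtree(staging, ignore_errors=True)
        raise
```

    The design point is the ordering: the old output is removed only after 
every new file has been written, so a job that fails midway never exposes 
"incomplete data". Cleaning the output location before execution widens the 
empty window to the whole duration of the job.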




