GitHub user mridulm commented on the pull request:

    https://github.com/apache/spark/pull/2366#issuecomment-56293724
  
    @tdas handling (1) deterministically will bring (2) in line with what we 
currently have.
    And that should be sufficient imo.
    
    (3) was not in the context of this patch - it is a general shortcoming of Spark 
currently.
    Alleviating (3) might be complicated (not sure how much so) - but it would have 
some very interesting consequences for performance (among others).
    
    For example: this prevents us from using block persistence for checkpointing - 
there was a discussion about this in a JIRA a while back (forgot the id) ... 
resolving this, together with 3x replicated blocks, would mean we get really cheap and 
very performant checkpoints (while having fault tolerance on par with HDFS).


