Github user tgravescs commented on the issue:

    https://github.com/apache/spark/pull/22112
  
    > I'm proposing an option 3:
    > Retry all the tasks of all the succeeding stages if a stage with 
repartition/zip failed. All RDD actions should tell Spark if it's "repeatable", 
which becomes a property of the result stage. When we retry a result stage that 
has several tasks finished, if the result stage is "repeatable" (e.g. collect), 
retry it. If the result stage is not "repeatable", fail the job with the error 
message to ask users to checkpoint the RDD before repartition/zip.
    
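    To make sure I'm reading the proposal right, here is a rough sketch of the 
decision with made-up names (`ResultStageInfo`, `decide`); none of this is 
existing Spark API, it's just the behavior described above as pseudocode:
    
    ```scala
    object RetryDecisionSketch {
      // Hypothetical model of the proposed "repeatable" property of a result stage.
      final case class ResultStageInfo(isRepeatable: Boolean, finishedTasks: Int)
    
      // Decision for a result stage whose parent stage (one containing
      // repartition/zip) failed and has to be recomputed from scratch.
      def decide(stage: ResultStageInfo): String =
        if (stage.finishedTasks == 0 || stage.isRepeatable)
          "retry all tasks of the result stage"  // e.g. collect() can just rerun
        else
          "fail the job: checkpoint the RDD before repartition/zip and rerun"
    
      def main(args: Array[String]): Unit = {
        println(decide(ResultStageInfo(isRepeatable = true, finishedTasks = 3)))
        println(decide(ResultStageInfo(isRepeatable = false, finishedTasks = 3)))
      }
    }
    ```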
    how does the user then tell Spark that the result stage becomes repeatable 
because they did the checkpoint?  Add an option to the API?  Or does Spark 
automatically try to figure that out?  I'm still a bit hesitant about making it 
our long-term solution that these operations aren't resilient, but as long as 
the user can make them resilient, perhaps it's OK.
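
    For context, the checkpoint workaround that error message would point users 
at looks roughly like this today (the checkpoint dir, data size, and partition 
count are just illustrative):
    
    ```scala
    import org.apache.spark.{SparkConf, SparkContext}
    
    object CheckpointBeforeRepartition {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("checkpoint-before-repartition").setMaster("local[*]"))
        sc.setCheckpointDir("/tmp/spark-checkpoints")
    
        val rdd = sc.parallelize(1 to 1000000)
        rdd.checkpoint()
        rdd.count()  // checkpointing runs after the first action, so force one here
    
        // A retried downstream task now rereads identical partition contents,
        // so the round-robin shuffle in repartition() reproduces the same output.
        println(rdd.repartition(10).count())
        sc.stop()
      }
    }
    ```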

