Github user rezasafi commented on the issue:

    https://github.com/apache/spark/pull/19848
  
    @mridulm What I meant by "same RDD" was running the same job twice on the 
same cluster, but in different SparkContexts. So it is not literally the same 
RDD, but since each SparkContext starts RDD ids from zero, we may end up with 
the same RDD ids across executions. The jobTrackerId will be different, but I 
didn't actually check whether Hadoop derives a different file path from the 
jobTrackerId. If it does, there will not be a problem; if not, then I guess the 
commit will fail. I think this can only happen when 
spark.hadoop.validateOutputSpecs is true.
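
    For what it's worth, here is a minimal local-mode sketch of the two behaviors
in question (the object name, the local master, and the timestamp format for the
jobTrackerId are my assumptions, not taken from the PR): RDD ids resetting to
zero in each new SparkContext, and the Hadoop task attempt string that a
jobTrackerId plus a numeric id produce, which FileOutputCommitter embeds in its
temporary output paths.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, Locale}

import org.apache.hadoop.mapreduce.{JobID, TaskAttemptID, TaskID, TaskType}
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical demo object, not part of the PR under review.
object RddIdCollisionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("rdd-id-collision")

    // Each SparkContext numbers its RDDs from zero, so "the same job" run
    // twice in two different contexts yields RDDs with identical ids.
    val sc1 = new SparkContext(conf)
    val firstRunId = sc1.parallelize(1 to 10).id // 0
    sc1.stop()

    val sc2 = new SparkContext(conf)
    val secondRunId = sc2.parallelize(1 to 10).id // 0 again
    sc2.stop()
    println(s"run 1 rdd id = $firstRunId, run 2 rdd id = $secondRunId")

    // How a timestamp-style jobTrackerId and a numeric id (historically
    // rdd.id) combine into a Hadoop task attempt id. FileOutputCommitter
    // embeds this full string in its temporary paths, e.g.
    //   <output>/_temporary/0/_temporary/attempt_<jobTrackerId>_0000_m_000000_0
    // so a fresh jobTrackerId should keep the two runs' paths distinct even
    // when the numeric id repeats.
    val jobTrackerId = new SimpleDateFormat("yyyyMMddHHmmss", Locale.US).format(new Date())
    val attemptId = new TaskAttemptID(
      new TaskID(new JobID(jobTrackerId, firstRunId), TaskType.MAP, 0), 0)
    println(attemptId) // attempt_<jobTrackerId>_0000_m_000000_0
  }
}
```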


