[GitHub] spark issue #22112: [SPARK-23243][Core] Fix RDD.repartition() data correctne...

cloud-fan Mon, 20 Aug 2018 22:41:20 -0700

Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22112
  
    > always return the same result with same order when rerun..
    
    Maybe the word "idempotent" is not that accurate. Spark doesn't really care 
about the order, so the requirement is that, for the same input data set, it 
should return the same output set.
    
    As an example, `iter1.zip(iter2)` will be treated as invalid, unless we 
sort before zip.
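
A minimal sketch of why the zip case breaks this requirement, using plain Scala collections rather than Spark. The object name `ZipOrderDemo` and the sample data are hypothetical, chosen only for illustration. Zip pairs elements by position, so two runs that see the same input sets in a different order produce different output sets, while sorting both sides first makes the output depend only on the input sets.

```scala
object ZipOrderDemo {
  // Hypothetical data: the same two input sets seen in two different orders,
  // as could happen if a task is rerun and its input arrives differently.
  val left1: Seq[Int]     = Seq(3, 1, 2)
  val right1: Seq[String] = Seq("c", "a", "b")
  val left2: Seq[Int]     = Seq(1, 2, 3)        // same elements as left1
  val right2: Seq[String] = Seq("b", "c", "a")  // same elements as right1

  // Plain zip pairs by position, so the output set depends on input order.
  val zipped1: Set[(Int, String)] = left1.iterator.zip(right1.iterator).toSet
  val zipped2: Set[(Int, String)] = left2.iterator.zip(right2.iterator).toSet

  // Sorting both sides before zipping makes the pairing a function of the
  // input sets alone, so both runs agree.
  val sorted1: Set[(Int, String)] =
    left1.sorted.iterator.zip(right1.sorted.iterator).toSet
  val sorted2: Set[(Int, String)] =
    left2.sorted.iterator.zip(right2.sorted.iterator).toSet
}
```

Here `zipped1` and `zipped2` differ even though the inputs are equal as sets, whereas `sorted1` and `sorted2` are equal.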


