GitHub user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22112

> always return the same result with same order when rerun..

Maybe the word "idempotent" is not that accurate. Spark doesn't really care about the order, so the requirement is that, for the same input data set, it should return the same output set. As an example, `iter1.zip(iter2)` will be treated as invalid unless we sort before zipping.
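
To make the `zip` point concrete, here is a minimal sketch using plain Scala collections rather than Spark itself. The object `ZipOrderDemo` and the helpers `zipAsIs` and `zipSorted` are hypothetical names made up for illustration; the second arrival order stands in for what a rerun might produce.

```scala
// Hypothetical illustration with plain Scala collections; this is not Spark's API.
object ZipOrderDemo {
  // Zip the two inputs as they arrive: the pairing depends on arrival order.
  def zipAsIs(a: Seq[Int], b: Seq[String]): Set[(Int, String)] =
    a.iterator.zip(b.iterator).toSet

  // Sort both inputs first: the pairing depends only on the contents.
  def zipSorted(a: Seq[Int], b: Seq[String]): Set[(Int, String)] =
    a.sorted.iterator.zip(b.sorted.iterator).toSet

  def main(args: Array[String]): Unit = {
    val left = Seq(1, 2, 3)
    val right = Seq("a", "b", "c")
    // Same elements as `left`, in the order a rerun might happen to produce.
    val leftRerun = Seq(3, 1, 2)

    // Compared as sets, so the order of the output itself is ignored.
    println(zipAsIs(left, right) == zipAsIs(leftRerun, right))     // false
    println(zipSorted(left, right) == zipSorted(leftRerun, right)) // true
  }
}
```

Even when the results are compared as sets, zipping unsorted inputs yields a different output set once the input order changes, while sorting before the zip makes the output set depend only on the input data.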