Github user jiangxb1987 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22112#discussion_r212383406

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala ---
    @@ -305,17 +306,19 @@ object ShuffleExchangeExec {
             rdd
           }

    +      // round-robin function is order sensitive if we don't sort the input.
    +      val orderSensitiveFunc = isRoundRobin && !SQLConf.get.sortBeforeRepartition
           if (needToCopyObjectsBeforeShuffle(part)) {
    -        newRdd.mapPartitionsInternal { iter =>
    +        newRdd.mapPartitionsWithIndexInternal((_, iter) => {
    --- End diff --

    Shouldn't we mark `newRdd` as `IDEMPOTENT` if we insert a local sort (or `INDETERMINATE` if we don't), so that we don't have to mark the function as order sensitive?
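To make the order-sensitivity point concrete, here is a minimal sketch in plain Scala (no Spark dependency). The `roundRobin` helper and the object name are hypothetical, not Spark APIs. The sketch only assumes that round-robin assignment picks a partition from a row's arrival position, so the same rows arriving in a different order on a retry land in different partitions unless they are sorted first.

```scala
object RoundRobinOrderSketch {
  // Hypothetical helper, not a Spark API: the i-th row to arrive goes to
  // partition i % numPartitions, so the result depends on arrival order.
  def roundRobin[T](rows: Seq[T], numPartitions: Int): Map[Int, Seq[T]] =
    rows.zipWithIndex
      .groupBy { case (_, i) => i % numPartitions }
      .map { case (p, indexed) => p -> indexed.map(_._1) }

  def main(args: Array[String]): Unit = {
    val firstAttempt = Seq("a", "b", "c", "d")
    val retry = Seq("b", "a", "d", "c") // same rows, different arrival order

    // Without a sort, the two attempts disagree on where each row goes.
    println(roundRobin(firstAttempt, 2))
    println(roundRobin(retry, 2))

    // With a local sort first, both attempts produce the same assignment.
    println(roundRobin(firstAttempt.sorted, 2) == roundRobin(retry.sorted, 2))
  }
}
```

Under these assumptions, sorting before the round-robin step makes the output a function of the row set alone, which is the property the question above relies on when it proposes tagging the sorted RDD itself rather than the function.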