[ https://issues.apache.org/jira/browse/SPARK-23243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608167#comment-16608167 ]
Bruce Robbins commented on SPARK-23243:
---------------------------------------

Any plans to backport this to 2.2?

> Shuffle+Repartition on an RDD could lead to incorrect answers
> -------------------------------------------------------------
>
>                 Key: SPARK-23243
>                 URL: https://issues.apache.org/jira/browse/SPARK-23243
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0, 2.0.0, 2.1.0, 2.2.0, 2.3.0
>            Reporter: Jiang Xingbo
>            Assignee: Wenchen Fan
>            Priority: Blocker
>              Labels: correctness
>             Fix For: 2.3.2, 2.4.0
>
>
> RDD repartition also distributes data round-robin, which can cause incorrect
> answers on RDD workloads in the same way as described in
> https://issues.apache.org/jira/browse/SPARK-23207
> The approach that fixes DataFrame.repartition() does not apply to the RDD
> repartition issue, as discussed in
> https://github.com/apache/spark/pull/20393#issuecomment-360912451
> We track alternative solutions for this issue in this task.
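
For readers unfamiliar with the failure mode in the quoted issue, here is a minimal sketch of why round-robin distribution can give wrong answers when a task is retried. It is a plain-Python simulation, not Spark code: the `round_robin` helper, the record values, and the two-reducer setup are all hypothetical and chosen only for illustration. The assumption is that a retried task may see the same input records in a different order.

```python
def round_robin(records, num_partitions):
    """Assign records to partitions purely by arrival order (hypothetical helper)."""
    partitions = [[] for _ in range(num_partitions)]
    for i, rec in enumerate(records):
        partitions[i % num_partitions].append(rec)
    return partitions

# First attempt of a map task sees its input in one order.
first_attempt = round_robin(["a", "b", "c", "d"], 2)

# A retried attempt sees the same records in a different order
# (assumed here; for example, upstream blocks arriving in a different order).
retry_attempt = round_robin(["b", "a", "c", "d"], 2)

# Suppose reducer 0 already read the first attempt's output,
# while reducer 1 reads the retried attempt's output.
seen = first_attempt[0] + retry_attempt[1]

print(sorted(seen))
```

Because the target partition depends on a record's position rather than its content, mixing the outputs of the two attempts duplicates one record and drops another, even though each attempt on its own is a valid repartition.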