[ 
https://issues.apache.org/jira/browse/SPARK-35767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun resolved SPARK-35767.
-----------------------------------
    Fix Version/s: 3.1.3
                   3.2.0
                   3.0.3
         Assignee: Andy Grove
       Resolution: Fixed

This is resolved via https://github.com/apache/spark/pull/32920

> CoalesceExec can execute child plan twice
> -----------------------------------------
>
>                 Key: SPARK-35767
>                 URL: https://issues.apache.org/jira/browse/SPARK-35767
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.2, 3.1.2, 3.2.0
>            Reporter: Andy Grove
>            Assignee: Andy Grove
>            Priority: Minor
>             Fix For: 3.0.3, 3.2.0, 3.1.3
>
>
> CoalesceExec calls `child.execute()` in the `if` condition and discards the
> result, then calls `child.execute()` again in the `else` branch. This can
> cause a section of the plan to be executed twice.
> {code:java}
> protected override def doExecute(): RDD[InternalRow] = {
>   if (numPartitions == 1 && child.execute().getNumPartitions < 1) {
>     // Make sure we don't output an RDD with 0 partitions, when claiming that we have a
>     // `SinglePartition`.
>     new CoalesceExec.EmptyRDDWithPartitions(sparkContext, numPartitions)
>   } else {
>     child.execute().coalesce(numPartitions, shuffle = false)
>   }
> }
> {code}
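
The double execution described above can be reproduced without Spark. The sketch below is only an illustration: `FakeRdd`, `CountingChild`, `coalesceTwice` and `coalesceOnce` are hypothetical stand-ins, not Spark classes, and `coalesceOnce` shows one plausible way to avoid the second call (evaluate the child once and reuse the result). It is not claimed to be the exact change merged in the pull request linked above.

```scala
object CoalesceSketch {
  // Hypothetical stand-in for an RDD: it only tracks a partition count.
  final case class FakeRdd(numPartitions: Int) {
    // Like a non-shuffle coalesce, this can reduce but never increase partitions.
    def coalesce(n: Int): FakeRdd = FakeRdd(math.min(n, numPartitions))
  }

  // Hypothetical stand-in for a child plan that counts how often it is executed.
  final class CountingChild(partitions: Int) {
    var executions: Int = 0
    def execute(): FakeRdd = {
      executions += 1
      FakeRdd(partitions)
    }
  }

  // Shape of the reported code: execute() is called in the condition,
  // its result is discarded, and execute() is called again in the else branch.
  def coalesceTwice(child: CountingChild, numPartitions: Int): FakeRdd =
    if (numPartitions == 1 && child.execute().numPartitions < 1) {
      FakeRdd(numPartitions)
    } else {
      child.execute().coalesce(numPartitions)
    }

  // Sketch of a fix: execute the child once and reuse the result in both branches.
  def coalesceOnce(child: CountingChild, numPartitions: Int): FakeRdd = {
    val rdd = child.execute()
    if (numPartitions == 1 && rdd.numPartitions < 1) {
      FakeRdd(numPartitions)
    } else {
      rdd.coalesce(numPartitions)
    }
  }

  def main(args: Array[String]): Unit = {
    val before = new CountingChild(4)
    coalesceTwice(before, 1)
    println(s"reported shape: ${before.executions} execution(s)")

    val after = new CountingChild(4)
    coalesceOnce(after, 1)
    println(s"single-call shape: ${after.executions} execution(s)")
  }
}
```

Because `&&` short-circuits, the reported shape only calls `execute()` twice when `numPartitions == 1` and the child has at least one partition; for any other `numPartitions` the condition never reaches the first call.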



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
