[ 
https://issues.apache.org/jira/browse/SPARK-44660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750881#comment-17750881
 ] 

Chao Sun commented on SPARK-44660:
----------------------------------

In fact the check is necessary, but it seems 
{code}
postStageCreationRules(outputsColumnar = plan.supportsColumnar)
{code}

can be relaxed: if the new shuffle operator supports columnar, then maybe we 
shouldn't insert {{ColumnarToRow}} into this stage. This assumes that the 
following stage knows the shuffle output is columnar and adds the 
corresponding {{ColumnarToRow}} if necessary.
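
To make the idea concrete, here is a minimal toy sketch of the interaction. It is not Spark code: {{Plan}}, {{RowScan}}, {{Shuffle}}, {{customColumnarRule}}, {{insertTransitions}}, {{currentBehaviour}} and {{relaxedBehaviour}} are all hypothetical stand-ins, and the relaxed variant is only one possible reading of the suggestion above (derive {{outputsColumnar}} from the shuffle operator after the columnar rules have run, rather than from the original one).

```scala
object AqeShuffleCheckSketch {
  // Toy stand-ins for physical plan nodes; these are NOT Spark's real classes.
  sealed trait Plan { def supportsColumnar: Boolean }
  case object RowScan extends Plan { val supportsColumnar: Boolean = false }
  // Plays the role of a ShuffleExchangeLike; supportsColumnar = true models
  // a custom columnar shuffle operator.
  final case class Shuffle(child: Plan, supportsColumnar: Boolean) extends Plan
  final case class ColumnarToRow(child: Plan) extends Plan {
    val supportsColumnar: Boolean = false
  }

  // Hypothetical custom columnar rule: swaps a row-based shuffle for a columnar one.
  def customColumnarRule(plan: Plan): Plan = plan match {
    case Shuffle(child, _) => Shuffle(child, supportsColumnar = true)
    case other             => other
  }

  // Models transition insertion: a columnar plan gets ColumnarToRow on top
  // when the stage is expected to output rows.
  def insertTransitions(plan: Plan, outputsColumnar: Boolean): Plan =
    if (plan.supportsColumnar && !outputsColumnar) ColumnarToRow(plan) else plan

  // Models the existing check: the top operator must still be a shuffle.
  def checkStillShuffle(newPlan: Plan): Either[String, Plan] = newPlan match {
    case s: Shuffle => Right(s)
    case _ =>
      Left("Custom columnar rules cannot transform shuffle node to something else.")
  }

  // Current behaviour: outputsColumnar is taken from the ORIGINAL shuffle.
  def currentBehaviour(e: Shuffle): Either[String, Plan] =
    checkStillShuffle(
      insertTransitions(customColumnarRule(e), outputsColumnar = e.supportsColumnar))

  // Relaxed variant (assumption): outputsColumnar is taken from the operator
  // produced by the columnar rules, so no ColumnarToRow is added on top.
  def relaxedBehaviour(e: Shuffle): Either[String, Plan] = {
    val replaced = customColumnarRule(e)
    checkStillShuffle(
      insertTransitions(replaced, outputsColumnar = replaced.supportsColumnar))
  }

  def main(args: Array[String]): Unit = {
    val e = Shuffle(RowScan, supportsColumnar = false)
    println(currentBehaviour(e).isLeft)  // the check rejects the plan
    println(relaxedBehaviour(e).isRight) // the check accepts the plan
  }
}
```

In this toy model the check itself is unchanged in both variants; only the value passed as {{outputsColumnar}} differs. The relaxed variant is only valid under the assumption stated above, namely that the next stage handles the columnar shuffle output itself.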

> Relax constraint for columnar shuffle check in AQE
> --------------------------------------------------
>
>                 Key: SPARK-44660
>                 URL: https://issues.apache.org/jira/browse/SPARK-44660
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.1
>            Reporter: Chao Sun
>            Priority: Major
>
> Currently in AQE, after evaluating the columnar rules, Spark checks whether 
> the top operator of the stage is still a shuffle operator, and throws an 
> exception if it is not.
> {code}
>         val optimized = e.withNewChildren(Seq(optimizeQueryStage(e.child, isFinalStage = false)))
>         val newPlan = applyPhysicalRules(
>           optimized,
>           postStageCreationRules(outputsColumnar = plan.supportsColumnar),
>           Some((planChangeLogger, "AQE Post Stage Creation")))
>         if (e.isInstanceOf[ShuffleExchangeLike]) {
>           if (!newPlan.isInstanceOf[ShuffleExchangeLike]) {
>             throw SparkException.internalError(
>               "Custom columnar rules cannot transform shuffle node to something else.")
>           }
> {code}
> However, once a shuffle operator is transformed into a custom columnar 
> shuffle operator, {{supportsColumnar}} on the new shuffle operator returns 
> true, and therefore the columnar rules insert {{ColumnarToRow}} on top of 
> it. This means that {{newPlan}} is likely no longer a 
> {{ShuffleExchangeLike}} but a {{ColumnarToRow}}, and an exception is 
> thrown even though the use case is valid.
> This JIRA proposes to relax the check so that the above case is allowed.


