[ https://issues.apache.org/jira/browse/SPARK-44660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750881#comment-17750881 ]
Chao Sun commented on SPARK-44660:
----------------------------------

In fact the check is necessary, but it seems
{code}
postStageCreationRules(outputsColumnar = plan.supportsColumnar)
{code}
can be relaxed: if the new shuffle operator supports columnar, then maybe we shouldn't insert {{ColumnarToRow}} into this stage. This assumes the following stage knows the shuffle output is columnar and has a corresponding {{ColumnarToRow}} if necessary.

> Relax constraint for columnar shuffle check in AQE
> --------------------------------------------------
>
>                 Key: SPARK-44660
>                 URL: https://issues.apache.org/jira/browse/SPARK-44660
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.4.1
>            Reporter: Chao Sun
>            Priority: Major
>
> Currently in AQE, after evaluating the columnar rules, Spark checks whether
> the top operator of the stage is still a shuffle operator, and throws an
> exception if it is not.
> {code}
>         val optimized = e.withNewChildren(Seq(optimizeQueryStage(e.child, isFinalStage = false)))
>         val newPlan = applyPhysicalRules(
>           optimized,
>           postStageCreationRules(outputsColumnar = plan.supportsColumnar),
>           Some((planChangeLogger, "AQE Post Stage Creation")))
>         if (e.isInstanceOf[ShuffleExchangeLike]) {
>           if (!newPlan.isInstanceOf[ShuffleExchangeLike]) {
>             throw SparkException.internalError(
>               "Custom columnar rules cannot transform shuffle node to something else.")
>           }
> {code}
> However, once a shuffle operator is transformed into a custom columnar
> shuffle operator, the {{supportsColumnar}} of the new shuffle operator will
> return true, and therefore the columnar rules will insert {{ColumnarToRow}}
> on top of it. This means the {{newPlan}} is likely no longer a
> {{ShuffleExchangeLike}} but a {{ColumnarToRow}}, and an exception will be
> thrown, even though the use case is valid.
> This JIRA proposes to relax the check by allowing the above case.
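
To make the proposed relaxation concrete, here is a minimal, self-contained sketch of what a relaxed check might look like. It does not use Spark at all: {{PlanNode}}, {{ShuffleLike}}, {{RowShuffle}}, {{ColumnarShuffle}}, {{ColumnarToRowNode}}, {{OtherNode}} and {{RelaxedShuffleCheck}} are hypothetical toy stand-ins invented for illustration, and the logic is only one possible reading of "allowing the above case", not the actual change made in Spark.

```scala
// Toy stand-ins for physical plan nodes. These are NOT the real Spark classes;
// all names here are hypothetical and exist only for this illustration.
sealed trait PlanNode { def supportsColumnar: Boolean }

// Plays the role of ShuffleExchangeLike in the issue description.
trait ShuffleLike extends PlanNode

case class RowShuffle() extends ShuffleLike { val supportsColumnar = false }
case class ColumnarShuffle() extends ShuffleLike { val supportsColumnar = true }

// Plays the role of the ColumnarToRow transition inserted by the columnar rules.
case class ColumnarToRowNode(child: PlanNode) extends PlanNode {
  val supportsColumnar = false
}

// Any unrelated operator.
case class OtherNode() extends PlanNode { val supportsColumnar = false }

object RelaxedShuffleCheck {
  // Mirrors the current check: the top node must itself be a shuffle.
  def strictCheck(newPlan: PlanNode): Boolean =
    newPlan.isInstanceOf[ShuffleLike]

  // Assumed relaxation: additionally accept a columnar shuffle that has been
  // wrapped in a single columnar-to-row transition.
  def relaxedCheck(newPlan: PlanNode): Boolean = newPlan match {
    case _: ShuffleLike                        => true
    case ColumnarToRowNode(child: ShuffleLike) => child.supportsColumnar
    case _                                     => false
  }
}
```

With these toy types, {{ColumnarToRowNode(ColumnarShuffle())}} is rejected by {{strictCheck}} but accepted by {{relaxedCheck}}, while a plan whose top node is an unrelated operator is still rejected by both. The alternative raised in the comment above (not inserting {{ColumnarToRow}} in the first place) would leave the strict check unchanged and instead alter what is passed as {{outputsColumnar}}.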