Chao Sun created SPARK-44660:
--------------------------------

             Summary: Relax constraint for columnar shuffle check in AQE
                 Key: SPARK-44660
                 URL: https://issues.apache.org/jira/browse/SPARK-44660
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.4.1
            Reporter: Chao Sun


Currently in AQE, after evaluating the columnar rules, Spark checks whether the 
top operator of the stage is still a shuffle operator, and throws an exception 
if it is not.

{code}
        val optimized = e.withNewChildren(
          Seq(optimizeQueryStage(e.child, isFinalStage = false)))
        val newPlan = applyPhysicalRules(
          optimized,
          postStageCreationRules(outputsColumnar = plan.supportsColumnar),
          Some((planChangeLogger, "AQE Post Stage Creation")))
        if (e.isInstanceOf[ShuffleExchangeLike]) {
          if (!newPlan.isInstanceOf[ShuffleExchangeLike]) {
            throw SparkException.internalError(
              "Custom columnar rules cannot transform shuffle node to something else.")
          }
{code}

However, once a shuffle operator is transformed into a custom columnar shuffle 
operator, {{supportsColumnar}} on the new shuffle operator returns true, so the 
columnar rules insert a {{ColumnarToRow}} on top of it. As a result, 
{{newPlan}} is likely no longer a {{ShuffleExchangeLike}} but a 
{{ColumnarToRow}}, and an exception is thrown even though the use case is valid.

This JIRA proposes to relax the check by allowing the above case.
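
As a rough illustration only, one possible shape of the relaxed check is 
sketched below. It uses toy stand-in classes rather than the real Spark plan 
nodes; {{Plan}}, {{ShuffleLike}}, {{RowShuffle}}, {{CustomColumnarShuffle}}, 
{{Scan}} and {{ShuffleCheck}} are hypothetical names, and the exact condition 
that ends up being accepted is an assumption, not something this issue 
specifies.

{code}
// Toy stand-ins for physical plan nodes. These are NOT the real Spark classes.
sealed trait Plan { def supportsColumnar: Boolean }
trait ShuffleLike extends Plan

case class Scan(name: String) extends Plan { val supportsColumnar = false }
case class RowShuffle(child: Plan) extends ShuffleLike { val supportsColumnar = false }
// Models a custom columnar shuffle that a columnar rule substituted for RowShuffle.
case class CustomColumnarShuffle(child: Plan) extends ShuffleLike { val supportsColumnar = true }
case class ColumnarToRow(child: Plan) extends Plan { val supportsColumnar = false }

object ShuffleCheck {
  // Strict check as described above: the top node must still be a shuffle.
  def strictOk(newPlan: Plan): Boolean = newPlan.isInstanceOf[ShuffleLike]

  // Assumed relaxation: also accept a ColumnarToRow sitting directly on a shuffle.
  def relaxedOk(newPlan: Plan): Boolean = newPlan match {
    case _: ShuffleLike                => true
    case ColumnarToRow(_: ShuffleLike) => true
    case _                             => false
  }
}
{code}

With these stand-ins, {{ColumnarToRow(CustomColumnarShuffle(Scan("t")))}} fails 
{{strictOk}} but passes {{relaxedOk}}, while a {{ColumnarToRow}} over a 
non-shuffle node is still rejected by both.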
