[
https://issues.apache.org/jira/browse/SPARK-48871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun closed SPARK-48871.
---------------------------------
> Fix INVALID_NON_DETERMINISTIC_EXPRESSIONS validation in CheckAnalysis
> ----------------------------------------------------------------------
>
> Key: SPARK-48871
> URL: https://issues.apache.org/jira/browse/SPARK-48871
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.5.2, 3.4.4, 4.0.0
> Reporter: Carmen Kwan
> Assignee: Carmen Kwan
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.5.2, 4.0.0
>
>
> I encountered the following exception when attempting to use a
> non-deterministic UDF in my query.
> {code:java}
> [info] org.apache.spark.sql.catalyst.ExtendedAnalysisException: [INVALID_NON_DETERMINISTIC_EXPRESSIONS] The operator expects a deterministic expression, but the actual expression is "[some expression]".; line 2 pos 1
> [info] [some logical plan]
> [info] at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:52)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$2(CheckAnalysis.scala:761)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis0$2$adapted(CheckAnalysis.scala:182)
> [info] at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:244)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis0(CheckAnalysis.scala:182)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis0$(CheckAnalysis.scala:164)
> [info] at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis0(Analyzer.scala:188)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:160)
> [info] at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:150)
> [info] at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:188)
> [info] at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:211)
> [info] at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
> [info] at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:208)
> [info] at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:77)
> [info] at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138)
> [info] at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:219)
> [info] at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:546)
> [info] at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:219)
> [info] at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
> [info] at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:218)
> [info] at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:77)
> [info] at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:74)
> [info] at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:66)
> {code}
> The non-deterministic expression can be safely allowed for my custom
> LogicalPlan, but the checkAnalysis phase rejects it. The CheckAnalysis
> rule is too strict, so reasonable use cases of non-deterministic
> expressions are rejected as well.
> To fix this, we could add a trait that logical plans can extend to implement
> a method that decides whether the operator may contain non-deterministic
> expressions, and call this method in checkAnalysis. This delegates the
> validation to frameworks that extend Spark, so we can allowlist more
> than just the few explicitly named logical plans (e.g. `Project`, `Filter`).
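
The proposal above could be sketched roughly as follows. This is a minimal, self-contained model and not Spark's actual code: `Expr`, `Plan`, `AllowsNonDeterministicExpressions`, `CustomOperator`, and `check` are hypothetical stand-ins for the Catalyst classes and for the check in CheckAnalysis, and only `Project` and `Filter` are modeled as built-in allowlisted operators because those are the two the description names.

```scala
// Hypothetical, simplified stand-ins for Catalyst classes; not Spark's real API.
object NonDeterministicCheckSketch {
  // Minimal expression model: only the determinism flag matters here.
  final case class Expr(sql: String, deterministic: Boolean)

  // Minimal logical-plan model.
  trait Plan {
    def expressions: Seq[Expr]
  }

  // Opt-in trait (assumed name): an operator that extends it decides for
  // itself whether it may contain non-deterministic expressions.
  trait AllowsNonDeterministicExpressions { self: Plan =>
    def allowNonDeterministicExpression: Boolean
  }

  // Operators the check allowlists explicitly, as named in the description.
  final case class Project(expressions: Seq[Expr]) extends Plan
  final case class Filter(expressions: Seq[Expr]) extends Plan

  // An operator that is neither allowlisted nor opted in.
  final case class Sort(expressions: Seq[Expr]) extends Plan

  // A custom operator from a framework that extends Spark (hypothetical).
  final case class CustomOperator(expressions: Seq[Expr])
      extends Plan
      with AllowsNonDeterministicExpressions {
    val allowNonDeterministicExpression: Boolean = true
  }

  // Simplified version of the validation: consult the explicit allowlist
  // first, then delegate to the operator if it extends the opt-in trait.
  def check(plan: Plan): Either[String, Unit] = {
    val allowed = plan match {
      case _: Project | _: Filter               => true
      case p: AllowsNonDeterministicExpressions => p.allowNonDeterministicExpression
      case _                                    => false
    }
    plan.expressions.find(e => !e.deterministic) match {
      case Some(e) if !allowed =>
        Left(s"[INVALID_NON_DETERMINISTIC_EXPRESSIONS] not allowed here: ${e.sql}")
      case _ =>
        Right(())
    }
  }
}
```

With this shape, a `CustomOperator` holding a non-deterministic expression passes the check, while `Sort` with the same expression is still rejected. The explicit allowlist stays unchanged, and the decision for new operators moves to the operator itself.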