[
https://issues.apache.org/jira/browse/SPARK-53527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenchen Fan reassigned SPARK-53527:
-----------------------------------
Assignee: Szehon Ho
> Improve fallback of analyzeExistenceDefaultValue
> ------------------------------------------------
>
> Key: SPARK-53527
> URL: https://issues.apache.org/jira/browse/SPARK-53527
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 4.0.1
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.1.0
>
>
> https://issues.apache.org/jira/browse/SPARK-51119 skips analysis for
> EXISTS_DEFAULT. In most cases this works, because the EXISTS_DEFAULT column
> metadata is supposed to be stored already resolved.
>
> However, there are known bugs where it is persisted unresolved. For example,
> expressions such as current_database, current_user, and current_timestamp
> are non-deterministic and produce wrong results in EXISTS_DEFAULT, where the
> user expects the value that was resolved when the default was set.
>
> There is a fallback in https://issues.apache.org/jira/browse/SPARK-51119 that
> handles a corrupt EXISTS_DEFAULT by running full analysis, but it misses some
> cases. One such case is an expression with nested function calls.
>
> Example: EXISTS_DEFAULT contains a nested function call such as:
> {code:java}
> CONCAT(YEAR(CURRENT_DATE), LPAD(WEEKOFYEAR(CURRENT_DATE), 2, '0')){code}
>
>
> The current code `Literal.fromSQL(defaultSQL)` throws the following exception
> before reaching the fallback:
> {code:java}
> Caused by: java.lang.AssertionError: assertion failed: function arguments must be resolved.
> at scala.Predef$.assert(Predef.scala:279)
> at org.apache.spark.sql.catalyst.analysis.FunctionRegistry$.$anonfun$expressionBuilder$1(FunctionRegistry.scala:1278)
> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction(FunctionRegistry.scala:251)
> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistryBase.lookupFunction$(FunctionRegistry.scala:245)
> at org.apache.spark.sql.catalyst.analysis.SimpleFunctionRegistry.lookupFunction(FunctionRegistry.scala:317)
> at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$fromSQL$1.applyOrElse(literals.scala:325)
> at org.apache.spark.sql.catalyst.expressions.Literal$$anonfun$fromSQL$1.applyOrElse(literals.scala:317)
> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$4(TreeNode.scala:586)
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:121)
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:586)
> at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformUpWithPruning$1(TreeNode.scala:579)
> at scala.collection.immutable.List.map(List.scala:251)
> at scala.collection.immutable.List.map(List.scala:79)
> at org.apache.spark.sql.catalyst.trees.TreeNode.mapChildren(TreeNode.scala:768)
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformUpWithPruning(TreeNode.scala:579)
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:556)
> at org.apache.spark.sql.catalyst.expressions.Literal$.fromSQL(literals.scala:317)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.analyzeExistenceDefaultValue(ResolveDefaultColumnsUtil.scala:393)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.getExistenceDefaultValue(ResolveDefaultColumnsUtil.scala:529)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.$anonfun$getExistenceDefaultValues$1(ResolveDefaultColumnsUtil.scala:524)
> at scala.collection.ArrayOps$.map$extension(ArrayOps.scala:936)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.getExistenceDefaultValues(ResolveDefaultColumnsUtil.scala:524)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.$anonfun$existenceDefaultValues$2(ResolveDefaultColumnsUtil.scala:594)
> at scala.Option.getOrElse(Option.scala:201)
> at org.apache.spark.sql.catalyst.util.ResolveDefaultColumns$.existenceDefaultValues(ResolveDefaultColumnsUtil.scala:592)
> {code}
>
>
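
The failure mode described in the issue (the fast path throws before the fallback is ever consulted) can be illustrated with a small toy model. This is only a sketch of the control flow, not Spark code and not necessarily how the fix is implemented: `fastParse`, `fullAnalysis`, `resolveWithoutGuard`, and `resolveWithGuard` are hypothetical names, and counting parentheses is a crude stand-in for "nested function calls".

```java
// Toy model of the control flow only. None of these names are Spark APIs.
final class ExistsDefaultFallback {

    // Hypothetical stand-in for the cheap path (Literal.fromSQL in the report).
    // Plain literals succeed, a single flat call comes back unresolved, and
    // nested calls trip an assertion, mimicking the reported stack trace.
    static String fastParse(String sql) {
        long opens = sql.chars().filter(c -> c == '(').count();
        if (opens >= 2) {
            throw new AssertionError("assertion failed: function arguments must be resolved.");
        }
        return (opens == 0 ? "literal:" : "unresolved:") + sql;
    }

    // Hypothetical stand-in for the full-analysis fallback.
    static String fullAnalysis(String sql) {
        return "analyzed:" + sql;
    }

    // Fallback check placed after the fast path: it handles an unresolved
    // result, but is never reached when fastParse itself throws.
    static String resolveWithoutGuard(String sql) {
        String parsed = fastParse(sql);
        return parsed.startsWith("literal:") ? parsed : fullAnalysis(sql);
    }

    // Same logic, but a failure inside the fast path is also routed to the
    // fallback instead of propagating to the caller.
    static String resolveWithGuard(String sql) {
        try {
            return resolveWithoutGuard(sql);
        } catch (AssertionError | RuntimeException e) {
            return fullAnalysis(sql);
        }
    }

    public static void main(String[] args) {
        String nested = "CONCAT(YEAR(CURRENT_DATE), LPAD(WEEKOFYEAR(CURRENT_DATE), 2, '0'))";
        System.out.println(resolveWithGuard("42"));
        System.out.println(resolveWithGuard(nested));
    }
}
```

In this toy, `resolveWithoutGuard` on the nested expression from the report throws the `AssertionError`, while `resolveWithGuard` returns the fallback result. The `AssertionError` is caught explicitly because in Java it is an `Error` rather than an `Exception`; in Scala, a `NonFatal` handler would also match it.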