[ https://issues.apache.org/jira/browse/SPARK-26370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721025#comment-16721025 ]

ASF GitHub Bot commented on SPARK-26370:
----------------------------------------

ueshin opened a new pull request #23320: [SPARK-26370][SQL] Fix resolution of 
higher-order function for the same identifier.
URL: https://github.com/apache/spark/pull/23320
 
 
   ## What changes were proposed in this pull request?
   
   When using a higher-order function whose lambda variable has the same name as an 
existing column, in `Filter` or another operator that resolves expressions via 
`Analyzer.resolveExpressionBottomUp`, e.g.:
   
   ```scala
   val df = Seq(
     (Seq(1, 9, 8, 7), 1, 2),
     (Seq(5, 9, 7), 2, 2),
     (Seq.empty, 3, 2),
     (null, 4, 2)
   ).toDF("i", "x", "d")
   
   checkAnswer(df.filter("exists(i, x -> x % d == 0)"),
     Seq(Row(Seq(1, 9, 8, 7), 1, 2)))
   checkAnswer(df.select("x").filter("exists(i, x -> x % d == 0)"),
     Seq(Row(1)))
   ```
   
   the following exception happens:
   
   ```
   java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.BoundReference cannot be cast to org.apache.spark.sql.catalyst.expressions.NamedExpression
     at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
     at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
     at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
     at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
     at scala.collection.TraversableLike.map(TraversableLike.scala:237)
     at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
     at scala.collection.AbstractTraversable.map(Traversable.scala:108)
     at org.apache.spark.sql.catalyst.expressions.HigherOrderFunction.$anonfun$functionsForEval$1(higherOrderFunctions.scala:147)
     at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
     at scala.collection.immutable.List.foreach(List.scala:392)
     at scala.collection.TraversableLike.map(TraversableLike.scala:237)
     at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
     at scala.collection.immutable.List.map(List.scala:298)
     at org.apache.spark.sql.catalyst.expressions.HigherOrderFunction.functionsForEval(higherOrderFunctions.scala:145)
     at org.apache.spark.sql.catalyst.expressions.HigherOrderFunction.functionsForEval$(higherOrderFunctions.scala:145)
     at org.apache.spark.sql.catalyst.expressions.ArrayExists.functionsForEval$lzycompute(higherOrderFunctions.scala:369)
     at org.apache.spark.sql.catalyst.expressions.ArrayExists.functionsForEval(higherOrderFunctions.scala:369)
     at org.apache.spark.sql.catalyst.expressions.SimpleHigherOrderFunction.functionForEval(higherOrderFunctions.scala:176)
     at org.apache.spark.sql.catalyst.expressions.SimpleHigherOrderFunction.functionForEval$(higherOrderFunctions.scala:176)
     at org.apache.spark.sql.catalyst.expressions.ArrayExists.functionForEval(higherOrderFunctions.scala:369)
     at org.apache.spark.sql.catalyst.expressions.ArrayExists.nullSafeEval(higherOrderFunctions.scala:387)
     at org.apache.spark.sql.catalyst.expressions.SimpleHigherOrderFunction.eval(higherOrderFunctions.scala:190)
     at org.apache.spark.sql.catalyst.expressions.SimpleHigherOrderFunction.eval$(higherOrderFunctions.scala:185)
     at org.apache.spark.sql.catalyst.expressions.ArrayExists.eval(higherOrderFunctions.scala:369)
     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificPredicate.eval(Unknown Source)
     at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$3(basicPhysicalOperators.scala:216)
     at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$3$adapted(basicPhysicalOperators.scala:215)
   
   ...
   ```
   
   because the `UnresolvedAttribute`s in the `LambdaFunction` are unexpectedly 
resolved by the rule.
   
   This PR changes the resolution to use a placeholder, `UnresolvedNamedLambdaVariable`, 
for lambda variables, so that they are not unexpectedly resolved.
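   The mechanism can be sketched with a toy model (the `Expr` hierarchy and the `resolve` rule below are illustrative stand-ins, not Spark's actual Catalyst classes): a bottom-up rule that binds names to column ordinals will also rebind a lambda parameter that shadows a column, unless the parameter is represented by a node type the rule leaves untouched.
   
   ```scala
   // Miniature model of the resolution issue. Names are hypothetical:
   // Unresolved ~ UnresolvedAttribute, Bound ~ BoundReference,
   // LambdaVar ~ UnresolvedNamedLambdaVariable, Lambda ~ LambdaFunction.
   sealed trait Expr
   case class Unresolved(name: String) extends Expr
   case class Bound(ordinal: Int) extends Expr
   case class LambdaVar(name: String) extends Expr
   case class Lambda(argName: String, body: Expr) extends Expr
   
   // Naive bottom-up rule: bind every Unresolved name that matches a column.
   def resolve(e: Expr, schema: Seq[String]): Expr = e match {
     case Unresolved(n) if schema.contains(n) => Bound(schema.indexOf(n))
     case Lambda(arg, body)                   => Lambda(arg, resolve(body, schema))
     case other                               => other // Bound, LambdaVar, unmatched names
   }
   
   val schema = Seq("i", "x", "d")
   
   // Bug: modeling the lambda parameter as Unresolved("x") lets the rule
   // rebind it to the column "x", corrupting the lambda body.
   val buggy = resolve(Lambda("x", Unresolved("x")), schema)
   // buggy == Lambda("x", Bound(1)) -- the lambda's own variable was stolen
   
   // Fix: a placeholder node the rule never touches keeps the variable intact
   // until higher-order-function resolution binds it properly.
   val fixed = resolve(Lambda("x", LambdaVar("x")), schema)
   // fixed == Lambda("x", LambdaVar("x"))
   ```
   
   The placeholder works because the bottom-up rule only rewrites `Unresolved` nodes; anything else falls through unchanged, so the lambda variable survives until the dedicated higher-order-function resolution runs.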
   
   ## How was this patch tested?
   
   Added a new test and modified some existing tests.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.


> Fix resolution of higher-order function for the same identifier.
> ----------------------------------------------------------------
>
>                 Key: SPARK-26370
>                 URL: https://issues.apache.org/jira/browse/SPARK-26370
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Takuya Ueshin
>            Priority: Major
>
> (Issue description omitted here; it is identical to the example, stack trace, 
> and explanation quoted in the pull-request description above.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
