[ https://issues.apache.org/jira/browse/SPARK-38666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-38666.
---------------------------------
    Fix Version/s: 3.3.0
         Assignee: Bruce Robbins
       Resolution: Fixed

> Missing aggregate filter checks
> -------------------------------
>
>                 Key: SPARK-38666
>                 URL: https://issues.apache.org/jira/browse/SPARK-38666
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Bruce Robbins
>            Assignee: Bruce Robbins
>            Priority: Major
>             Fix For: 3.3.0
>
>
> h3. Window function in filter
> {noformat}
> select sum(a) filter (where nth_value(a, 2) over (order by b) > 1)
> from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead produces a stack overflow:
> {noformat}
> java.lang.StackOverflowError: null
>     at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$collect$1(TreeNode.scala:305) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
>     at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$collect$1$adapted(TreeNode.scala:305) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
>     at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:264) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
>     at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:265) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
>     at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:265) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
>     at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library.jar:?]
> ...
> {noformat}
> h3. Non-boolean filter expression
> {noformat}
> select sum(a) filter (where a) from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead causes a projection
> compilation error or whole-stage codegen error (depending on the datatype of
> the expression):
> {noformat}
> 22/03/26 17:19:03 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 50, Column 6: Not a boolean expression
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 50, Column 6: Not a boolean expression
>     at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12021) ~[janino-3.0.16.jar:?]
>     at org.codehaus.janino.UnitCompiler.compileBoolean2(UnitCompiler.java:4049) ~[janino-3.0.16.jar:?]
>     at org.codehaus.janino.UnitCompiler.access$6300(UnitCompiler.java:226) ~[janino-3.0.16.jar:?]
>     at org.codehaus.janino.UnitCompiler$14.visitIntegerLiteral(UnitCompiler.java:4016) ~[janino-3.0.16.jar:?]
> ...
> 22/03/26 17:19:05 WARN MutableProjection: Expr codegen error and falling back to interpreter mode
> java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 15: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 15: Not a boolean expression
>     at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306) ~[guava-14.0.1.jar:?]
>     at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293) ~[guava-14.0.1.jar:?]
>     at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-14.0.1.jar:?]
>     at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache.get(LocalCache.java:4000) ~[guava-14.0.1.jar:?]
>     at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004) ~[guava-14.0.1.jar:?]
> ...
> NULL
> Time taken: 5.397 seconds, Fetched 1 row(s)
> {noformat}
> Interestingly, it also returns a result (NULL).
> h3. Aggregate expression in filter expression
> {noformat}
> select max(b) filter (where max(a) > 1) from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead causes a projection
> compilation error or whole-stage codegen error (depending on the datatype of
> the expression being aggregated):
> {noformat}
> 22/03/26 17:26:38 ERROR TaskSetManager: Task 0 in stage 3.0 failed 1 times; aborting job
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 2) (10.0.0.106 executor driver): org.apache.spark.SparkUnsupportedOperationException: Cannot evaluate expression: max(1)
>     at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotEvaluateExpressionError(QueryExecutionErrors.scala:79)
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:344)
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:343)
>     at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.eval(interfaces.scala:99)
>     at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:593)
>     at org.apache.spark.sql.catalyst.expressions.If.eval(conditionalExpressions.scala:68)
> ...
> {noformat}
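
The three cases in the report all come down to validating the FILTER expression during analysis, before any code is generated or executed. The sketch below is a toy model in Python, not Spark's implementation: the `Expr` class, the `contains` helper, and `check_aggregate_filter` are hypothetical names invented for illustration, and the error strings are placeholders rather than Spark's actual messages. It only shows the shape of the three checks the issue title refers to.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Expr:
    """Hypothetical, minimal expression node (not a Spark class)."""
    kind: str        # e.g. "column", "literal", "compare", "aggregate", "window"
    data_type: str   # e.g. "boolean", "int", "string"
    children: List["Expr"] = field(default_factory=list)


def contains(expr: Expr, kind: str) -> bool:
    """Return True if expr or any descendant has the given kind."""
    return expr.kind == kind or any(contains(c, kind) for c in expr.children)


def check_aggregate_filter(filter_expr: Expr) -> List[str]:
    """Collect analysis-time problems with an aggregate's FILTER expression."""
    errors = []
    if contains(filter_expr, "window"):
        errors.append("FILTER expression contains a window function")
    if contains(filter_expr, "aggregate"):
        errors.append("FILTER expression contains an aggregate")
    if filter_expr.data_type != "boolean":
        errors.append("FILTER expression is not of type boolean")
    return errors


col_a = Expr("column", "int")
one = Expr("literal", "int")

# where nth_value(a, 2) over (order by b) > 1
window_filter = Expr("compare", "boolean", [Expr("window", "int", [col_a]), one])
# where a
non_boolean_filter = col_a
# where max(a) > 1
aggregate_filter = Expr("compare", "boolean", [Expr("aggregate", "int", [col_a]), one])
# where a > 1 (a valid filter, for contrast)
valid_filter = Expr("compare", "boolean", [col_a, one])

for name, f in [("window", window_filter), ("non-boolean", non_boolean_filter),
                ("aggregate", aggregate_filter), ("valid", valid_filter)]:
    print(name, check_aggregate_filter(f))
```

In this model each of the three queries from the report yields exactly one error and the plain comparison yields none, which mirrors the expectation stated above that each query "should produce an analysis error".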