[ https://issues.apache.org/jira/browse/SPARK-27255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16800130#comment-16800130 ]
Dilip Biswal commented on SPARK-27255:
--------------------------------------

[~chakravarthi] Hello, I had started working on this and hadn't seen your comment. If you haven't started, can I submit a PR for this, and could you help review it? Please let me know.

> Aggregate functions should not be allowed in WHERE
> --------------------------------------------------
>
>                 Key: SPARK-27255
>                 URL: https://issues.apache.org/jira/browse/SPARK-27255
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Mingcong Han
>            Priority: Minor
>
> Aggregate functions should not be allowed in a WHERE clause. But Spark SQL
> only throws an exception when generating code. It is supposed to throw an
> exception during parsing or analysis.
> Here is an example:
> {code:scala}
> val df = spark.sql("select * from t where sum(ta) > 0")
> df.explain(true)
> df.show()
> {code}
> Spark SQL explains it as:
> {noformat}
> == Parsed Logical Plan ==
> 'Project [*]
> +- 'Filter ('sum('ta) > 0)
>    +- 'UnresolvedRelation `t`
> == Analyzed Logical Plan ==
> ta: int, tb: int
> Project [ta#5, tb#6]
> +- Filter (sum(cast(ta#5 as bigint)) > cast(0 as bigint))
>    +- SubqueryAlias `t`
>       +- Project [ta#5, tb#6]
>          +- SubqueryAlias `as`
>             +- LocalRelation [ta#5, tb#6]
> == Optimized Logical Plan ==
> Filter (sum(cast(ta#5 as bigint)) > 0)
> +- LocalRelation [ta#5, tb#6]
> == Physical Plan ==
> *(1) Filter (sum(cast(ta#5 as bigint)) > 0)
> +- LocalTableScan [ta#5, tb#6]
> {noformat}
> But when executing `df.show()`:
> {noformat}
> Exception in thread "main" java.lang.UnsupportedOperationException: Cannot generate code for expression: sum(cast(input[0, int, false] as bigint))
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:291)
>     at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:290)
>     at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.doGenCode(interfaces.scala:87)
>     at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:138)
>     at scala.Option.getOrElse(Option.scala:138)
> {noformat}
> I have tried it in PostgreSQL, and it directly throws an error:
> {noformat}
> ERROR: Aggregate functions are not allowed in WHERE.
> {noformat}
> We'd better throw an AnalysisException here.
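
For reference, here is a minimal sketch (not part of the original report) of how the filter from the example could be expressed with constructs that Spark SQL accepts. It assumes a local SparkSession and a hypothetical temp view `t` with int columns `ta` and `tb`, matching the schema shown in the plans. The report does not say what the query was meant to return, so both rewrites are guesses at the intent: HAVING filters groups by an aggregate, while an uncorrelated scalar subquery compares against a table-wide sum.

```scala
import org.apache.spark.sql.SparkSession

object AggregateInWhereSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a throwaway local session; the app name is illustrative only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("SPARK-27255-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with the same column names as the report.
    Seq((1, 10), (2, 10), (-5, 20)).toDF("ta", "tb").createOrReplaceTempView("t")

    // Option 1: filter groups on an aggregate with HAVING instead of WHERE.
    val perGroup = spark.sql(
      "select tb, sum(ta) as total from t group by tb having sum(ta) > 0")
    perGroup.show()

    // Option 2: keep WHERE, but move the aggregate into an uncorrelated
    // scalar subquery so the predicate compares a single table-wide value.
    val wholeTable = spark.sql(
      "select * from t where (select sum(ta) from t) > 0")
    wholeTable.show()

    spark.stop()
  }
}
```

Neither form addresses the issue itself, which is about rejecting the invalid query at analysis time; they only show how a valid query avoids the code generation failure.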