Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22696#discussion_r224457317

    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ---
    @@ -108,7 +108,7 @@ class PlanParserSuite extends AnalysisTest {
         assertEqual("select a, b from db.c where x < 1", table("db", "c").where('x < 1).select('a, 'b))
         assertEqual(
           "select a, b from db.c having x < 1",
    -      table("db", "c").select('a, 'b).where('x < 1))
    +      table("db", "c").groupBy()('a, 'b).where('x < 1))
    --- End diff --

Is this query legal? Can we run such a query in a test?

I read the articles [here](https://blog.jooq.org/2014/12/04/do-you-really-understand-sqls-group-by-and-having-clauses/) and [here](https://stackoverflow.com/questions/5496786/having-clause-in-postgresql/5496829#5496829). One point caught my attention. Below is the Postgres documentation on `HAVING` without `GROUP BY`:

> The presence of HAVING turns a query into a grouped query even if there is no GROUP BY clause. This is the same as what happens when the query contains aggregate functions but no GROUP BY clause. All the selected rows are considered to form a single group, and **the SELECT list and HAVING clause can only reference table columns from within aggregate functions**. Such a query will emit a single row if the HAVING condition is true, zero rows if it is not true.

Please see the bold text. It seems to me that in this query we can't have `x < 1` as the condition in `HAVING`, because `x` is not within an aggregate function. Ditto for `a` and `b` in the `SELECT` list.
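
To make the quoted rule concrete, here is a minimal sketch in plain Scala that models it with ordinary collections. It is not Spark or Catalyst code, and `Row`, `havingOnly`, and the sample data are hypothetical names made up for illustration. Because the predicate and the projection only ever receive the whole group, neither can refer to a bare `x`, `a`, or `b` of an individual row.

```scala
// Hypothetical model of the quoted Postgres rule; not Spark or Catalyst code.
object HavingWithoutGroupBy {
  // Illustrative row type for a table with columns a, b and x.
  final case class Row(a: Int, b: Int, x: Int)

  // HAVING without GROUP BY: all selected rows form one group. Both the
  // predicate and the projection take the whole group, so they can only use
  // values computed over it (aggregates), never a column of a single row.
  def havingOnly[A](rows: Seq[Row])(having: Seq[Row] => Boolean)(select: Seq[Row] => A): Seq[A] =
    if (having(rows)) Seq(select(rows)) else Seq.empty

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(1, 2, 0), Row(3, 4, 5))

    // Roughly: SELECT count(*) FROM t HAVING min(x) < 1  (condition holds: one row)
    println(havingOnly(rows)(g => g.map(_.x).min < 1)(_.size))

    // Roughly: SELECT count(*) FROM t HAVING min(x) > 1  (condition fails: zero rows)
    println(havingOnly(rows)(g => g.map(_.x).min > 1)(_.size))
  }
}
```

Under this model, `select a, b from db.c having x < 1` has no direct counterpart, since `x`, `a`, and `b` would each need to be wrapped in an aggregate first.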