GitHub user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22696#discussion_r224457317
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/PlanParserSuite.scala ---
    @@ -108,7 +108,7 @@ class PlanParserSuite extends AnalysisTest {
         assertEqual("select a, b from db.c where x < 1", table("db", "c").where('x < 1).select('a, 'b))
         assertEqual(
           "select a, b from db.c having x < 1",
    -      table("db", "c").select('a, 'b).where('x < 1))
    +      table("db", "c").groupBy()('a, 'b).where('x < 1))
    --- End diff --
    
    Is this query legal? Can we run such a query in a test?
    
    I read the articles [here](https://blog.jooq.org/2014/12/04/do-you-really-understand-sqls-group-by-and-having-clauses/) and [here](https://stackoverflow.com/questions/5496786/having-clause-in-postgresql/5496829#5496829). One point caught my attention. Below is the Postgres documentation on `HAVING` without `GROUP BY`:
    
    > The presence of HAVING turns a query into a grouped query even if there is no GROUP BY clause. This is the same as what happens when the query contains aggregate functions but no GROUP BY clause. All the selected rows are considered to form a single group, and **the SELECT list and HAVING clause can only reference table columns from within aggregate functions**. Such a query will emit a single row if the HAVING condition is true, zero rows if it is not true.
    
    Please see the bold text. It seems to me that in this query we can't use `x < 1` as the condition in `HAVING`, because `x` is not within an aggregate function. Ditto for `a` and `b` in the `SELECT` list.
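    
    To make the quoted rule concrete, here is a toy sketch in plain Scala. It is not Spark or Catalyst code, and `Row`, `globalHaving`, and the sample data are hypothetical names made up for illustration. It only models the semantics described in the Postgres documentation: without `GROUP BY`, all rows form one group, the condition is evaluated once over an aggregate of that group, and the result is either one row or zero rows. It assumes a non-empty input and does not model SQL `NULL` handling.

```scala
// Toy model of the quoted Postgres rule; NOT Spark or Catalyst code.
object HavingWithoutGroupBy {
  // Hypothetical row shape for a table with columns a, b, x.
  final case class Row(a: Int, b: Int, x: Int)

  // HAVING without GROUP BY: all rows form a single group, so the
  // condition can only see an aggregate over that group, never a bare column.
  def globalHaving[A](rows: Seq[Row])(agg: Seq[Row] => A)(cond: A => Boolean): Seq[A] = {
    val value = agg(rows)                 // one aggregate value for the whole group
    if (cond(value)) Seq(value) else Seq.empty
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(1, 2, 0), Row(3, 4, 5))

    // Roughly: SELECT max(x) FROM c HAVING max(x) < 1
    val noRows = globalHaving(rows)(_.map(_.x).max)(_ < 1)

    // Roughly: SELECT min(x) FROM c HAVING min(x) < 1
    val oneRow = globalHaving(rows)(_.map(_.x).min)(_ < 1)

    assert(noRows.isEmpty)      // max(x) = 5, condition is false: zero rows
    assert(oneRow == Seq(0))    // min(x) = 0, condition is true: a single row
  }
}
```

    In this model there is no way to write a per-row condition such as `x < 1`, or to return a bare `a` or `b`, because the condition and the output only ever see the aggregate over the whole group. That is the restriction I am pointing at above.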


