[ https://issues.apache.org/jira/browse/SPARK-30182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun updated SPARK-30182: ---------------------------------- Affects Version/s: (was: 3.0.0) 3.1.0 > Support nested aggregates > ------------------------- > > Key: SPARK-30182 > URL: https://issues.apache.org/jira/browse/SPARK-30182 > Project: Spark > Issue Type: Sub-task > Components: SQL > Affects Versions: 3.1.0 > Reporter: jiaan.geng > Priority: Major > > Spark SQL cannot supports a SQL with nested aggregate as below: > {code:java} > SELECT sum(salary), row_number() OVER (ORDER BY depname), sum( > sum(salary) FILTER (WHERE enroll_date > '2007-01-01') > ) FILTER (WHERE depname <> 'sales') OVER (ORDER BY depname DESC) AS > "filtered_sum", > depname > FROM empsalary GROUP BY depname;{code} > And Spark will throw exception as follows: > {code:java} > org.apache.spark.sql.AnalysisException > It is not allowed to use an aggregate function in the argument of another > aggregate function. Please use the inner aggregate function in a > sub-query.{code} > But PostgreSQL supports this syntax. > {code:java} > SELECT sum(salary), row_number() OVER (ORDER BY depname), sum( > sum(salary) FILTER (WHERE enroll_date > '2007-01-01') > ) FILTER (WHERE depname <> 'sales') OVER (ORDER BY depname DESC) AS > "filtered_sum", > depname > FROM empsalary GROUP BY depname; > sum | row_number | filtered_sum | depname > -------+------------+--------------+----------- > 25100 | 1 | 22600 | develop > 7400 | 2 | 3500 | personnel > 14600 | 3 | | sales > (3 rows){code} > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org