[ https://issues.apache.org/jira/browse/SPARK-30220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jiaan.geng updated SPARK-30220:
-------------------------------
    Description:

Spark SQL cannot support a FILTER expression that uses an IN/EXISTS predicate sub-query, as below:
{code:java}
select sum(unique1) FILTER (WHERE
  unique1 IN (SELECT unique1 FROM onek where unique1 < 100)) FROM tenk1;{code}
Spark throws the following exception:
{code:java}
org.apache.spark.sql.AnalysisException
IN/EXISTS predicate sub-queries can only be used in Filter/Join and a few commands: Aggregate [sum(cast(unique1#x as bigint)) AS sum(unique1)#xL]
:  +- Project [unique1#x]
:     +- Filter (unique1#x < 100)
:        +- SubqueryAlias `onek`
:           +- RelationV2[unique1#x, unique2#x, two#x, four#x, ten#x, twenty#x, hundred#x, thousand#x, twothousand#x, fivethous#x, tenthous#x, odd#x, even#x, stringu1#x, stringu2#x, string4#x] csv file:/home/xitong/code/gengjiaan/spark/sql/core/target/scala-2.12/test-classes/test-data/postgresql/onek.data
+- SubqueryAlias `tenk1`
   +- RelationV2[unique1#x, unique2#x, two#x, four#x, ten#x, twenty#x, hundred#x, thousand#x, twothousand#x, fivethous#x, tenthous#x, odd#x, even#x, stringu1#x, stringu2#x, string4#x] csv file:/home/xitong/code/gengjiaan/spark/sql/core/target/scala-2.12/test-classes/test-data/postgresql/tenk.data{code}
PostgreSQL, however, supports this syntax:
{code:java}
select sum(unique1) FILTER (WHERE
  unique1 IN (SELECT unique1 FROM onek where unique1 < 100)) FROM tenk1;
 sum
------
 4950
(1 row){code}

> Support Filter expression uses IN/EXISTS predicate sub-queries
> --------------------------------------------------------------
>
>                 Key: SPARK-30220
>                 URL: https://issues.apache.org/jira/browse/SPARK-30220
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.0.0
>            Reporter: jiaan.geng
>            Priority: Major
>
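Until this sub-task is implemented, one possible workaround is sketched below. It is an assumption on top of the issue, not part of it: when the filtered aggregate is the only aggregate in the SELECT list, the FILTER predicate can be moved into the WHERE clause, where Spark already accepts IN/EXISTS predicate sub-queries. The rewrite is not equivalent if other aggregates in the same query need the unfiltered rows.
{code:java}
-- Workaround sketch (assumption, not from the JIRA issue): because
-- sum(unique1) is the only aggregate here, pre-filtering the rows with the
-- IN sub-query in the WHERE clause (which Spark does support) should give
-- the same result as PostgreSQL's 4950 on the same data.
SELECT sum(unique1)
FROM tenk1
WHERE unique1 IN (SELECT unique1 FROM onek WHERE unique1 < 100);
{code}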