[ https://issues.apache.org/jira/browse/SPARK-29708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17017478#comment-17017478 ]
Dongjoon Hyun commented on SPARK-29708:
---------------------------------------

This is backported to branch-2.4 via https://github.com/apache/spark/pull/27229 .

> Different answers in aggregates of duplicate grouping sets
> ----------------------------------------------------------
>
>                 Key: SPARK-29708
>                 URL: https://issues.apache.org/jira/browse/SPARK-29708
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 3.0.0
>            Reporter: Takeshi Yamamuro
>            Assignee: Takeshi Yamamuro
>            Priority: Major
>              Labels: correctness
>             Fix For: 2.4.5, 3.0.0
>
>
> A query below with multiple grouping sets seems to have different answers
> between PgSQL and Spark;
> {code:java}
> postgres=# create table gstest4(id integer, v integer, unhashable_col bit(4), unsortable_col xid);
> postgres=# insert into gstest4
> postgres-# values (1,1,b'0000','1'), (2,2,b'0001','1'),
> postgres-#        (3,4,b'0010','2'), (4,8,b'0011','2'),
> postgres-#        (5,16,b'0000','2'), (6,32,b'0001','2'),
> postgres-#        (7,64,b'0010','1'), (8,128,b'0011','1');
> INSERT 0 8
> postgres=# select unsortable_col, count(*)
> postgres-#   from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
> postgres-#   order by text(unsortable_col);
>  unsortable_col | count
> ----------------+-------
>               1 |     8
>               1 |     8
>               2 |     8
>               2 |     8
> (4 rows)
> {code}
> {code:java}
> scala> sql("""create table gstest4(id integer, v integer, unhashable_col /* bit(4) */ byte, unsortable_col /* xid */ integer) using parquet""")
> scala> sql("""
>      | insert into gstest4
>      | values (1,1,tinyint('0'),1), (2,2,tinyint('1'),1),
>      |        (3,4,tinyint('2'),2), (4,8,tinyint('3'),2),
>      |        (5,16,tinyint('0'),2), (6,32,tinyint('1'),2),
>      |        (7,64,tinyint('2'),1), (8,128,tinyint('3'),1)
>      | """)
> res21: org.apache.spark.sql.DataFrame = []
> scala>
> scala> sql("""
>      | select unsortable_col, count(*)
>      | from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
>      | order by string(unsortable_col)
>      | """).show
> +--------------+--------+
> |unsortable_col|count(1)|
> +--------------+--------+
> |             1|       8|
> |             2|       8|
> +--------------+--------+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org