Takeshi Yamamuro created SPARK-29708:
----------------------------------------

             Summary: Different answers in aggregates of multiple grouping sets
                 Key: SPARK-29708
                 URL: https://issues.apache.org/jira/browse/SPARK-29708
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Takeshi Yamamuro


The query below, which lists the same grouping set twice, returns different results in PgSQL and Spark: PgSQL emits one set of result rows per grouping set (4 rows), while Spark emits only 2 rows.
{code:java}
postgres=# create table gstest4(id integer, v integer, unhashable_col bit(4), unsortable_col xid);
postgres=# insert into gstest4
postgres-# values (1,1,b'0000','1'), (2,2,b'0001','1'),
postgres-#        (3,4,b'0010','2'), (4,8,b'0011','2'),
postgres-#        (5,16,b'0000','2'), (6,32,b'0001','2'),
postgres-#        (7,64,b'0010','1'), (8,128,b'0011','1');
INSERT 0 8

postgres=# select unsortable_col, count(*)
postgres-#   from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
postgres-#   order by text(unsortable_col);
 unsortable_col | count
----------------+-------
              1 |     8
              1 |     8
              2 |     8
              2 |     8
(4 rows)
{code}
{code:java}
scala> sql("""create table gstest4(id integer, v integer, unhashable_col /* bit(4) */ byte, unsortable_col /* xid */ integer) using parquet""")

scala> sql("""
     | insert into gstest4
     | values (1,1,tinyint('0'),1), (2,2,tinyint('1'),1),
     |        (3,4,tinyint('2'),2), (4,8,tinyint('3'),2),
     |        (5,16,tinyint('0'),2), (6,32,tinyint('1'),2),
     |        (7,64,tinyint('2'),1), (8,128,tinyint('3'),1)
     | """)
res21: org.apache.spark.sql.DataFrame = []

scala>

scala> sql("""
     | select unsortable_col, count(*)
     |   from gstest4 group by grouping sets ((unsortable_col),(unsortable_col))
     |   order by string(unsortable_col)
     | """).show
+--------------+--------+
|unsortable_col|count(1)|
+--------------+--------+
|             1|       8|
|             2|       8|
+--------------+--------+
{code}
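
The difference can be illustrated with a small plain-Scala model, offered only as a sketch and not as either engine's actual implementation. It assumes the PgSQL result comes from evaluating each grouping set independently and concatenating the outputs, and that the Spark result matches what you would get if duplicate grouping sets were collapsed before evaluation; that second point is an assumption about the cause, not something the report confirms. The names `GroupingSetsSketch`, `Rec`, `countBy`, and `groupingSets` are hypothetical. The rows are loaded once here, so each key counts 4 rather than the 8 shown in the transcripts; what matters is the number of output rows (4 versus 2).

```scala
// Hypothetical model of GROUPING SETS evaluation; not Spark or PostgreSQL code.
object GroupingSetsSketch {
  // (id, v, unsortable_col) for the eight rows in the report's INSERT;
  // unhashable_col is omitted because the query never reads it.
  final case class Rec(id: Int, v: Int, unsortableCol: Int)

  val gstest4: Seq[Rec] = Seq(
    Rec(1, 1, 1), Rec(2, 2, 1), Rec(3, 4, 2), Rec(4, 8, 2),
    Rec(5, 16, 2), Rec(6, 32, 2), Rec(7, 64, 1), Rec(8, 128, 1)
  )

  // One GROUP BY pass: (key, count(*)) for each distinct key, sorted by key.
  def countBy(rows: Seq[Rec], key: Rec => Int): Seq[(Int, Int)] =
    rows.groupBy(key).map { case (k, rs) => (k, rs.size) }.toSeq.sortBy(_._1)

  // Evaluate every grouping set independently and concatenate the results,
  // so a set that is listed twice contributes its rows twice.
  def groupingSets(rows: Seq[Rec], sets: Seq[Rec => Int]): Seq[(Int, Int)] =
    sets.flatMap(countBy(rows, _)).sortBy(_._1)

  def main(args: Array[String]): Unit = {
    val byCol: Rec => Int = _.unsortableCol

    // Duplicate sets kept: four output rows, the shape of the PgSQL answer.
    println(groupingSets(gstest4, Seq(byCol, byCol)))

    // Duplicate sets collapsed first (assumed cause): two output rows,
    // the shape of the Spark answer.
    println(groupingSets(gstest4, Seq(byCol, byCol).distinct))
  }
}
```

Under this model, the two answers differ only in whether the list of grouping sets is deduplicated before aggregation.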