Takeshi Yamamuro created SPARK-30279:
----------------------------------------

             Summary: Support 32 or more grouping attributes for GROUPING_ID
                 Key: SPARK-30279
                 URL: https://issues.apache.org/jira/browse/SPARK-30279
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.0.0
            Reporter: Takeshi Yamamuro


This ticket aims to support 32 or more grouping attributes for GROUPING_ID. In the current master, an integer overflow can occur when computing grouping IDs:
https://github.com/apache/spark/blob/e75d9afb2f282ce79c9fd8bce031287739326a4f/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala#L613

For example, the query below generates wrong grouping IDs in the master:
{code}
scala> val numCols = 32 // or, 31
scala> val cols = (0 until numCols).map { i => s"c$i" }
scala> sql(s"create table test_$numCols (${cols.map(c => s"$c int").mkString(",")}, v int) using parquet")
scala> val insertVals = (0 until numCols).map { _ => 1 }.mkString(",")
scala> sql(s"insert into test_$numCols values ($insertVals,3)")
scala> sql(s"select grouping_id(), sum(v) from test_$numCols group by grouping sets ((${cols.mkString(",")}), (${cols.init.mkString(",")}))").show(10, false)
scala> sql(s"drop table test_$numCols")

// numCols = 32
+-------------+------+
|grouping_id()|sum(v)|
+-------------+------+
|0            |3     |
|0            |3     | // Wrong Grouping ID
+-------------+------+

// numCols = 31
+-------------+------+
|grouping_id()|sum(v)|
+-------------+------+
|0            |3     |
|1            |3     |
+-------------+------+
{code}
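
The overflow can be reproduced without Spark. The sketch below is not the Spark source. It assumes the grouping ID starts from an all-ones mask computed as (1 << numCols) - 1 in Int arithmetic, with one bit cleared per column present in the grouping set, which is consistent with the output reported above. The names GroupingIdSketch, groupingIdInt and groupingIdLong are hypothetical, and the Long variant only illustrates one possible direction for a fix.

```scala
object GroupingIdSketch {
  // Columns are indexed 0 until numCols; column 0 maps to the most significant bit.
  // A bit is 1 when the column is NOT part of the grouping set.

  // Int-based version (assumed to mirror the overflowing computation).
  def groupingIdInt(numCols: Int, present: Seq[Int]): Int = {
    // On the JVM an Int shift distance is taken modulo 32, so for
    // numCols = 32 this is (1 << 0) - 1 = 0 instead of 32 one-bits.
    val allSet = (1 << numCols) - 1
    present.foldLeft(allSet) { (acc, i) => acc & ~(1 << (numCols - 1 - i)) }
  }

  // Long-based version: correct for up to 63 columns
  // (a Long shift distance is taken modulo 64).
  def groupingIdLong(numCols: Int, present: Seq[Int]): Long = {
    val allSet = (1L << numCols) - 1L
    present.foldLeft(allSet) { (acc, i) => acc & ~(1L << (numCols - 1 - i)) }
  }

  def main(args: Array[String]): Unit = {
    // Grouping set with every column except the last one, as in cols.init above.
    println(groupingIdInt(31, 0 until 30))  // 1
    println(groupingIdInt(32, 0 until 31))  // 0 (wrong, expected 1)
    println(groupingIdLong(32, 0 until 31)) // 1
  }
}
```

With 31 columns the Int mask still works because (1 << 31) - 1 wraps around to Int.MaxValue, which has exactly 31 one-bits. With 32 columns the mask collapses to 0, so every grouping set yields a grouping ID of 0.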