[jira] [Updated] (SPARK-8972) Incorrect result for rollup
[ https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated SPARK-8972:
    Assignee: Cheng Hao

Incorrect result for rollup
---
Key: SPARK-8972
URL: https://issues.apache.org/jira/browse/SPARK-8972
Project: Spark
Issue Type: Bug
Components: SQL
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Critical
Fix For: 1.5.0

{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}

After checking the code, it seems we don't support complex expressions (anything other than simple column names) as GROUP BY keys for rollup, or for cube. No error is reported when a rollup key is a complex expression either, hence the very confusing result in the example above.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
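To make the expected behaviour concrete, here is a minimal sketch in plain Scala collections (no Spark involved) of what a single-key ROLLUP on key % 100 would ordinarily return for keys 1 to 5: one row per distinct key plus one grand-total row, i.e. six rows rather than the ten shown above. The object RollupSketch and the helper rollupCount are hypothetical names used only for illustration, and GROUPING__ID is left out because this report does not say how it should be numbered.

```scala
// Illustration only: models standard single-key ROLLUP semantics with
// ordinary collections. This is not Spark's implementation.
object RollupSketch {
  // Each result row is (cnt, groupKey); groupKey is None for the grand total.
  def rollupCount(keys: Seq[Int], groupExpr: Int => Int): Seq[(Int, Option[Int])] = {
    // Finest level: one row per distinct value of the grouping expression.
    val perGroup = keys
      .groupBy(groupExpr)
      .toSeq
      .sortBy(_._1)
      .map { case (g, ks) => (ks.size, Option(g)) }
    // Rolled-up level: a single row aggregating every input row.
    val grandTotal = (keys.size, Option.empty[Int])
    perGroup :+ grandTotal
  }

  def main(args: Array[String]): Unit = {
    // Same data and grouping expression as the report: keys 1..5, key % 100.
    rollupCount(1 to 5, _ % 100).foreach(println)
  }
}
```

Under these assumptions each of the five keys contributes a row with cnt = 1, and the rolled-up row carries cnt = 5 with no group key. The reported output instead repeats every per-key row with cnt = 1 and never produces the total.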
[jira] [Updated] (SPARK-8972) Incorrect result for rollup
[ https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated SPARK-8972:
    Description:
{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}

After checking the code, it seems we don't support complex expressions (anything other than simple column names) as GROUP BY keys for rollup, or for cube. No error is reported when a rollup key is a complex expression either, hence the very confusing result in the example above.

    was:
{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}

Incorrect result for rollup
---
Key: SPARK-8972
URL: https://issues.apache.org/jira/browse/SPARK-8972
Project: Spark
Issue Type: Bug
Components: SQL
Reporter: Cheng Hao
Priority: Critical
[jira] [Updated] (SPARK-8972) Incorrect result for rollup
[ https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Hao updated SPARK-8972:
    Summary: Incorrect result for rollup (was: Wrong result for rollup)

Incorrect result for rollup
---
Key: SPARK-8972
URL: https://issues.apache.org/jira/browse/SPARK-8972
Project: Spark
Issue Type: Bug
Components: SQL
Reporter: Cheng Hao
Priority: Critical

{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}