[jira] [Updated] (SPARK-8972) Incorrect result for rollup

2015-07-16 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated SPARK-8972:

Assignee: Cheng Hao

 Incorrect result for rollup
 ---

 Key: SPARK-8972
 URL: https://issues.apache.org/jira/browse/SPARK-8972
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Cheng Hao
Assignee: Cheng Hao
Priority: Critical
 Fix For: 1.5.0


 {code:java}
 import sqlContext.implicits._
 case class KeyValue(key: Int, value: String)
 val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
 df.registerTempTable("foo")
 // Roll up on a complex expression (key % 100) rather than a plain column name
 sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
 // output
 +---+---+------------+
 |cnt|_c1|GROUPING__ID|
 +---+---+------------+
 |  1|  4|           0|
 |  1|  4|           1|
 |  1|  5|           0|
 |  1|  5|           1|
 |  1|  1|           0|
 |  1|  1|           1|
 |  1|  2|           0|
 |  1|  2|           1|
 |  1|  3|           0|
 |  1|  3|           1|
 +---+---+------------+
 {code}
 After checking the code, it seems we don't support complex expressions 
 (anything other than simple column names) as GROUP BY keys for rollup or 
 cube. Worse, no error is reported when a complex expression appears in the 
 rollup keys, so we get very confusing results like the example above.
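
For comparison, here is a minimal sketch of what a rollup over a single grouping key is generally expected to return: one row per distinct key plus one grand-total row. It uses plain Scala collections rather than Spark, and the names RollupSketch, Row and rollupCount are hypothetical, introduced only for this illustration. GROUPING__ID is deliberately left out because its numbering is engine-specific.

```scala
// Plain Scala collections, no Spark required.
// Models "select count(*), k from t group by k with rollup" for one key.
case class KeyValue(key: Int, value: String)

object RollupSketch {
  // One output row: the count and the grouping key.
  // key = None marks the grand-total row (the key is NULL in SQL).
  case class Row(cnt: Long, key: Option[Int])

  def rollupCount(rows: Seq[KeyValue], groupKey: KeyValue => Int): Seq[Row] = {
    // Grouping set (k): one count per distinct value of the key expression.
    val perKey = rows
      .groupBy(groupKey)
      .toSeq
      .sortBy(_._1)
      .map { case (k, vs) => Row(vs.size.toLong, Some(k)) }
    // Grouping set (): a single grand total over all input rows.
    val total = Row(rows.size.toLong, None)
    perKey :+ total
  }

  def main(args: Array[String]): Unit = {
    val data = (1 to 5).map(i => KeyValue(i, i.toString))
    // The key expression is complex (key % 100), as in the report above.
    rollupCount(data, _.key % 100).foreach(println)
  }
}
```

With the five input rows from the report, this yields five per-key rows with a count of 1 and a single total row with a count of 5, whereas the output above shows each key twice and no total row.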



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-8972) Incorrect result for rollup

2015-07-12 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated SPARK-8972:
-
Description: 
{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
// Roll up on a complex expression (key % 100) rather than a plain column name
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}
After checking the code, it seems we don't support complex expressions 
(anything other than simple column names) as GROUP BY keys for rollup or 
cube. Worse, no error is reported when a complex expression appears in the 
rollup keys, so we get very confusing results like the example above.

  was:
{code:java}
import sqlContext.implicits._
case class KeyValue(key: Int, value: String)
val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
df.registerTempTable("foo")
sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
// output
+---+---+------------+
|cnt|_c1|GROUPING__ID|
+---+---+------------+
|  1|  4|           0|
|  1|  4|           1|
|  1|  5|           0|
|  1|  5|           1|
|  1|  1|           0|
|  1|  1|           1|
|  1|  2|           0|
|  1|  2|           1|
|  1|  3|           0|
|  1|  3|           1|
+---+---+------------+
{code}


 Incorrect result for rollup
 ---

 Key: SPARK-8972
 URL: https://issues.apache.org/jira/browse/SPARK-8972
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Cheng Hao
Priority: Critical







[jira] [Updated] (SPARK-8972) Incorrect result for rollup

2015-07-09 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-8972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated SPARK-8972:
-
Summary: Incorrect result for rollup  (was: Wrong result for rollup)

 Incorrect result for rollup
 ---

 Key: SPARK-8972
 URL: https://issues.apache.org/jira/browse/SPARK-8972
 Project: Spark
  Issue Type: Bug
  Components: SQL
Reporter: Cheng Hao
Priority: Critical

 {code:java}
 import sqlContext.implicits._
 case class KeyValue(key: Int, value: String)
 val df = sc.parallelize(1 to 5).map(i => KeyValue(i, i.toString)).toDF
 df.registerTempTable("foo")
 sqlContext.sql("select count(*) as cnt, key % 100, GROUPING__ID from foo group by key % 100 with rollup").show(100)
 // output
 +---+---+------------+
 |cnt|_c1|GROUPING__ID|
 +---+---+------------+
 |  1|  4|           0|
 |  1|  4|           1|
 |  1|  5|           0|
 |  1|  5|           1|
 |  1|  1|           0|
 |  1|  1|           1|
 |  1|  2|           0|
 |  1|  2|           1|
 |  1|  3|           0|
 |  1|  3|           1|
 +---+---+------------+
 {code}


