[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2020-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-29699:
-
Priority: Critical  (was: Blocker)

> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Takeshi Yamamuro
> Priority: Critical
>
> The nested aggregate below, which uses a window function, seems to return 
> different answers in the `rsum` column in PgSQL and in Spark:
> {code:java}
> postgres=# create table gstest2 (a integer, b integer, c integer, d integer, 
> e integer, f integer, g integer, h integer);
> postgres=# insert into gstest2 values
> postgres-#   (1, 1, 1, 1, 1, 1, 1, 1),
> postgres-#   (1, 1, 1, 1, 1, 1, 1, 2),
> postgres-#   (1, 1, 1, 1, 1, 1, 2, 2),
> postgres-#   (1, 1, 1, 1, 1, 2, 2, 2),
> postgres-#   (1, 1, 1, 1, 2, 2, 2, 2),
> postgres-#   (1, 1, 1, 2, 2, 2, 2, 2),
> postgres-#   (1, 1, 2, 2, 2, 2, 2, 2),
> postgres-#   (1, 2, 2, 2, 2, 2, 2, 2),
> postgres-#   (2, 2, 2, 2, 2, 2, 2, 2);
> INSERT 0 9
> postgres=# 
> postgres=# select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
> postgres-#   from gstest2 group by rollup (a,b) order by rsum, a, b;
>  a | b | sum | rsum 
> ---+---+-----+------
>  1 | 1 |   8 |    8
>  1 | 2 |   2 |   10
>  1 |   |  10 |   20
>  2 | 2 |   2 |   22
>  2 |   |   2 |   24
>    |   |  12 |   36
> (6 rows)
> {code}
> {code:java}
> scala> sql("""
>  | select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
>  |   from gstest2 group by rollup (a,b) order by rsum, a, b
>  | """).show()
> +----+----+------+----+
> |   a|   b|sum(c)|rsum|
> +----+----+------+----+
> |null|null|    12|  12|
> |   1|null|    10|  22|
> |   1|   1|     8|  30|
> |   1|   2|     2|  32|
> |   2|null|     2|  34|
> |   2|   2|     2|  36|
> +----+----+------+----+
> {code}
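
The notifications in this thread do not state a root cause. One reading that is consistent with both result sets is the default NULL placement in an ascending ORDER BY: PostgreSQL sorts NULLs last, while Spark SQL sorts them first, so the rollup subtotal rows enter the running sum at different positions. The sketch below is an illustration only, not code from the issue: it emulates the query in plain Python, and the names ROWS, rollup_sums and running_sum are made up for this example.

```python
from itertools import accumulate

# (a, b, c) values copied from the gstest2 rows in the report;
# columns d..h are not used by the query.
ROWS = [(1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1),
        (1, 1, 1), (1, 1, 2), (1, 2, 2), (2, 2, 2)]


def rollup_sums(rows):
    """Emulate GROUP BY ROLLUP (a, b) with sum(c); None stands for NULL."""
    sums = {}
    for a, b, c in rows:
        # Each row feeds its (a, b) group, its (a) subtotal and the grand total.
        for key in ((a, b), (a, None), (None, None)):
            sums[key] = sums.get(key, 0) + c
    return sums


def running_sum(sums, nulls_first):
    """Emulate sum(sum(c)) OVER (ORDER BY a, b) for one NULL placement.

    Assumes the grouping keys are unique, so the default window frame
    reduces to a plain running total.
    """
    def sort_key(key):
        # Each column sorts as (flag, value); the flag decides where NULL goes.
        return tuple(
            ((0 if nulls_first else 1), 0) if v is None
            else ((1 if nulls_first else 0), v)
            for v in key
        )

    ordered = sorted(sums, key=sort_key)
    totals = accumulate(sums[k] for k in ordered)
    return [(a, b, sums[(a, b)], rsum) for (a, b), rsum in zip(ordered, totals)]


if __name__ == "__main__":
    sums = rollup_sums(ROWS)
    for label, nulls_first in (("NULLS LAST", False), ("NULLS FIRST", True)):
        print(label)
        for row in running_sum(sums, nulls_first):
            print("  ", row)
```

With NULLs last the rsum column comes out as 8, 10, 20, 22, 24, 36, matching the PgSQL output above; with NULLs first it comes out as 12, 22, 30, 32, 34, 36, matching the Spark output. If this reading is right, spelling out the placement in the window clause, for example `order by a nulls last, b nulls last`, would be one way to compare the two systems on equal terms.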



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2020-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-29699:
-
Target Version/s:   (was: 3.0.0)




[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2020-02-19 Thread Hyukjin Kwon (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-29699:
-
Labels:   (was: correctness)

> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Takeshi Yamamuro
> Priority: Blocker
>



[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2020-01-22 Thread Dongjoon Hyun (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dongjoon Hyun updated SPARK-29699:
--
Priority: Blocker  (was: Major)

> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Takeshi Yamamuro
> Priority: Blocker
> Labels: correctness
>



[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2019-12-22 Thread Takeshi Yamamuro (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Takeshi Yamamuro updated SPARK-29699:
-
Description: 
A nested aggregate below with a window function seems to have different answers 
in the `rsum` column  between PgSQL and Spark;
{code:java}
postgres=# create table gstest2 (a integer, b integer, c integer, d integer, e 
integer, f integer, g integer, h integer);
postgres=# insert into gstest2 values
postgres-#   (1, 1, 1, 1, 1, 1, 1, 1),
postgres-#   (1, 1, 1, 1, 1, 1, 1, 2),
postgres-#   (1, 1, 1, 1, 1, 1, 2, 2),
postgres-#   (1, 1, 1, 1, 1, 2, 2, 2),
postgres-#   (1, 1, 1, 1, 2, 2, 2, 2),
postgres-#   (1, 1, 1, 2, 2, 2, 2, 2),
postgres-#   (1, 1, 2, 2, 2, 2, 2, 2),
postgres-#   (1, 2, 2, 2, 2, 2, 2, 2),
postgres-#   (2, 2, 2, 2, 2, 2, 2, 2);
INSERT 0 9
postgres=# 
postgres=# select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
postgres-#   from gstest2 group by rollup (a,b) order by rsum, a, b;
 a | b | sum | rsum 
---+---+-----+------
 1 | 1 |   8 |    8
 1 | 2 |   2 |   10
 1 |   |  10 |   20
 2 | 2 |   2 |   22
 2 |   |   2 |   24
   |   |  12 |   36
(6 rows)
{code}
{code:java}
scala> sql("""
 | select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
 |   from gstest2 group by rollup (a,b) order by rsum, a, b
 | """).show()
+----+----+------+----+
|   a|   b|sum(c)|rsum|
+----+----+------+----+
|null|null|    12|  12|
|   1|null|    10|  22|
|   1|   1|     8|  30|
|   1|   2|     2|  32|
|   2|null|     2|  34|
|   2|   2|     2|  36|
+----+----+------+----+
{code}

  was:
A nested aggregate below with a window function seems to have different answers 
in the `rsum` column  between PgSQL and Spark;
{code:java}
postgres=# create table gstest2 (a integer, b integer, c integer, d integer, e 
integer, f integer, g integer, h integer);
postgres=# insert into gstest2 values
postgres-#   (1, 1, 1, 1, 1, 1, 1, 1),
postgres-#   (1, 1, 1, 1, 1, 1, 1, 2),
postgres-#   (1, 1, 1, 1, 1, 1, 2, 2),
postgres-#   (1, 1, 1, 1, 1, 2, 2, 2),
postgres-#   (1, 1, 1, 1, 2, 2, 2, 2),
postgres-#   (1, 1, 1, 2, 2, 2, 2, 2),
postgres-#   (1, 1, 2, 2, 2, 2, 2, 2),
postgres-#   (1, 2, 2, 2, 2, 2, 2, 2),
postgres-#   (2, 2, 2, 2, 2, 2, 2, 2);
INSERT 0 9
postgres=# 
postgres=# select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
postgres-#   from gstest2 group by rollup (a,b) order by rsum, a, b;
 a | b | sum | rsum 
---+---+-----+------
 1 | 1 |  16 |   16
 1 | 2 |   4 |   20
 1 |   |  20 |   40
 2 | 2 |   4 |   44
 2 |   |   4 |   48
   |   |  24 |   72
(6 rows)
{code}
{code:java}
scala> sql("""
 | select a, b, sum(c), sum(sum(c)) over (order by a,b) as rsum
 |   from gstest2 group by rollup (a,b) order by rsum, a, b
 | """).show()
+----+----+------+----+
|   a|   b|sum(c)|rsum|
+----+----+------+----+
|null|null|    12|  12|
|   1|null|    10|  22|
|   1|   1|     8|  30|
|   1|   2|     2|  32|
|   2|null|     2|  34|
|   2|   2|     2|  36|
+----+----+------+----+
{code}


> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Takeshi Yamamuro
> Priority: Major
> Labels: correctness
>

[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2019-12-01 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29699:

Labels: correctness  (was: )

> Different answers in nested aggregates with window functions
> 
>
> Key: SPARK-29699
> URL: https://issues.apache.org/jira/browse/SPARK-29699
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Takeshi Yamamuro
> Priority: Major
> Labels: correctness
>



[jira] [Updated] (SPARK-29699) Different answers in nested aggregates with window functions

2019-12-01 Thread Xiao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-29699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-29699:

Target Version/s: 3.0.0
