[ https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-24424:
----------------------------
    Description: 
Currently, our GROUP BY clause follows the Hive syntax 
[https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup],
which is not ANSI SQL compliant. The proposal is to update our parser and 
analyzer so that the ANSI-compliant syntax is accepted as well.
For example, the Hive-style syntax we support today is:
{code:java}
GROUP BY col1, col2 WITH ROLLUP

GROUP BY col1, col2 WITH CUBE

GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be good to support the ANSI SQL syntax at the same time:
{code:java}
GROUP BY ROLLUP(col1, col2)

GROUP BY CUBE(col1, col2)

GROUP BY GROUPING SETS(...)
{code}
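
For reference, ROLLUP and CUBE are shorthand for specific lists of grouping sets: ROLLUP produces every prefix of its column list, and CUBE produces every subset. The following is a minimal Python sketch of that expansion. The function names are illustrative only and are not Spark APIs.

```python
from itertools import combinations


def rollup_grouping_sets(cols):
    """ROLLUP(c1, ..., cn): every prefix of the column list, longest first, ending with ()."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_grouping_sets(cols):
    """CUBE(c1, ..., cn): every subset of the column list, i.e. 2**n grouping sets."""
    return [subset
            for size in range(len(cols), -1, -1)
            for subset in combinations(cols, size)]


print(rollup_grouping_sets(["col1", "col2"]))
# [('col1', 'col2'), ('col1',), ()]
print(cube_grouping_sets(["col1", "col2"]))
# [('col1', 'col2'), ('col1',), ('col2',), ()]
```

Both the Hive form (WITH ROLLUP / WITH CUBE) and the ANSI form (ROLLUP(...) / CUBE(...)) describe the same grouping sets; only the surface syntax differs.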
Note that we only need to support one-level grouping sets at this stage; 
nested grouping sets are not supported.

The parser changes should look like this:

group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'

hive-sql-group-by-expressions

                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.
   V                |
>>---+-expression-+-+--><


grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><


ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+
   '-GROUPING SETS--(--grouping-set-expressions--)--'
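
To make the diagrams above concrete, here is a rough recursive-descent sketch of the proposed grammar in Python. It is not Spark's actual parser, which is generated from a grammar file. It assumes that an "expression" is a bare identifier, and the names `GroupByParser` and `parse_group_by` are hypothetical. It follows the diagrams literally, so a grouping set needs at least one expression and nested grouping sets are rejected.

```python
import re

# Assumption: an "expression" is a bare identifier; real SQL expressions are richer.
_TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|[(),])")


def tokenize(text):
    """Split the clause into identifiers/keywords and the punctuation ( ) ,"""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class GroupByParser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self, offset=0):
        index = self.pos + offset
        return self.tokens[index].upper() if index < len(self.tokens) else None

    def accept(self, *words):
        """Consume the given keyword/punctuation sequence if it comes next."""
        if all(self.peek(k) == word for k, word in enumerate(words)):
            self.pos += len(words)
            return True
        return False

    def expect(self, *words):
        if not self.accept(*words):
            raise SyntaxError(f"expected {' '.join(words)} at token {self.pos}")

    def expression(self):
        if self.peek() in (None, "(", ")", ","):
            raise SyntaxError(f"expected an expression at token {self.pos}")
        self.pos += 1
        return self.tokens[self.pos - 1]

    def expression_list(self):
        """grouping-expression-list: expression (',' expression)*"""
        items = [self.expression()]
        while self.accept(","):
            items.append(self.expression())
        return items

    def grouping_set(self):
        """One grouping set: a single expression or a parenthesized list."""
        if self.accept("("):
            items = self.expression_list()  # a nested '(' fails in expression()
            self.expect(")")
            return tuple(items)
        return (self.expression(),)

    def grouping_set_expressions(self):
        """'(' grouping-set (',' grouping-set)* ')'"""
        self.expect("(")
        sets = [self.grouping_set()]
        while self.accept(","):
            sets.append(self.grouping_set())
        self.expect(")")
        return sets

    def parse(self):
        self.expect("GROUP", "BY")
        columns, sets = [], []
        if self.accept("ROLLUP", "(") or self.accept("CUBE", "("):
            # ansi-sql-grouping-set-expressions: ROLLUP(...) or CUBE(...)
            style, op = "ansi", self.tokens[self.pos - 2].upper()
            columns = self.expression_list()
            self.expect(")")
        elif self.accept("GROUPING", "SETS"):
            style, op = "ansi", "GROUPING SETS"
            sets = self.grouping_set_expressions()
        else:
            # hive-sql-group-by-expressions: column list plus optional suffix
            style, op = "hive", None
            columns = self.expression_list()
            if self.accept("WITH", "ROLLUP") or self.accept("WITH", "CUBE"):
                op = self.tokens[self.pos - 1].upper()
            elif self.accept("GROUPING", "SETS"):
                op = "GROUPING SETS"
                sets = self.grouping_set_expressions()
        if self.pos != len(self.tokens):
            raise SyntaxError(f"unexpected trailing input at token {self.pos}")
        return {"style": style, "op": op, "columns": columns, "sets": sets}


def parse_group_by(text):
    return GroupByParser(text).parse()


print(parse_group_by("GROUP BY col1, col2 WITH ROLLUP"))
print(parse_group_by("GROUP BY ROLLUP(col1, col2)"))
```

The ANSI keywords are tried before the Hive branch, so `ROLLUP(` and `CUBE(` directly after GROUP BY are read as grouping operators rather than as expressions. A real parser would have to resolve the same ambiguity against function-call expressions.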

  was:
Currently, our Group By clause follows Hive 
[https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup]
 :
 However, this does not match ANSI SQL compliance. The proposal is to update 
our parser and analyzer for ANSI compliance. 
 For example,
{code:java}
GROUP BY col1, col2 WITH ROLLUP

GROUP BY col1, col2 WITH CUBE

GROUP BY col1, col2 GROUPING SET ...
{code}
It is nice to support ANSI SQL syntax at the same time.
{code:java}
GROUP BY ROLLUP(col1, col2)

GROUP BY CUBE(col1, col2)

GROUP BY GROUPING SET(...) 
{code}
Note, we only need to support one-level grouping set in this stage. That means, 
nested grouping set is not supported.

The parser changes should be like

 
group-by-expressions
 
>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               *{color:#ff0000}'-ansi-sql-grouping-set-expressions-'{color}*    
 
hive-sql-group-by-expressions
 
                        '--GROUPING SETS--(--grouping-set-expressions--)--'
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><
 
grouping-expressions-list
 
   .-,--------------.  
   V                |  
>>---+-expression-+-+--><
 
 
grouping-set-expressions
 
    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><
 
 
{color:#ff0000}ansi-sql-grouping-set-expressions{color}
{color:#ff0000} {color}
{color:#ff0000}>>-+-ROLLUP--(--grouping-expression-list--)---------+--><{color}
{color:#ff0000}   +-CUBE--(--grouping-expression-list--)-----------+   {color}
{color:#ff0000}   '-GROUPING SETS--(--grouping-set-expressions--)--' {color}


> Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET
> -------------------------------------------------------------------
>
>                 Key: SPARK-24424
>                 URL: https://issues.apache.org/jira/browse/SPARK-24424
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major


