[ 
https://issues.apache.org/jira/browse/SPARK-24424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Li updated SPARK-24424:
----------------------------
    Description: 
Currently, our GROUP BY clause follows the Hive syntax 
[https://cwiki.apache.org/confluence/display/Hive/Enhanced+Aggregation,+Cube,+Grouping+and+Rollup].
However, that syntax is not ANSI SQL compliant. The proposal is to update 
our parser and analyzer for ANSI compliance. 
For example, the Hive-style forms are:
{code:java}
GROUP BY col1, col2 WITH ROLLUP

GROUP BY col1, col2 WITH CUBE

GROUP BY col1, col2 GROUPING SETS ...
{code}
It would be good to support the ANSI SQL syntax at the same time:
{code:java}
GROUP BY ROLLUP(col1, col2)

GROUP BY CUBE(col1, col2)

GROUP BY GROUPING SETS(...)
{code}
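As a rough illustration of why the two spellings can share one analyzer path, ROLLUP and CUBE can both be viewed as shorthand for an explicit list of grouping sets. The sketch below is plain Python, not Spark code; the helper names `rollup_sets` and `cube_sets` are hypothetical and exist only for this example.

```python
from itertools import combinations


def rollup_sets(cols):
    """ROLLUP(c1, ..., cn): every prefix of the column list, longest first,
    ending with the empty set (the grand total)."""
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]


def cube_sets(cols):
    """CUBE(c1, ..., cn): every subset of the column list, largest first."""
    return [subset
            for size in range(len(cols), -1, -1)
            for subset in combinations(cols, size)]


print(rollup_sets(["col1", "col2"]))  # [('col1', 'col2'), ('col1',), ()]
print(cube_sets(["col1", "col2"]))    # [('col1', 'col2'), ('col1',), ('col2',), ()]
```

Under this view, `GROUP BY col1, col2 WITH ROLLUP` and `GROUP BY ROLLUP(col1, col2)` describe the same grouping sets, so only the parser needs to learn the new spelling.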
Note that we only need to support one-level grouping sets at this stage; that 
is, nested grouping sets are not supported.

The parser changes should look like this:
{code}
group-by-expressions

>>-GROUP BY----+-hive-sql-group-by-expressions-----+---><
               '-ansi-sql-grouping-set-expressions-'    

hive-sql-group-by-expressions

                        .--GROUPING SETS--(--grouping-set-expressions--)--.
   .-,--------------.   +--WITH CUBE--------------------------------------+
   V                |   +--WITH ROLLUP------------------------------------+
>>---+-expression-+-+---+-------------------------------------------------+-><

grouping-expression-list

   .-,--------------.  
   V                |  
>>---+-expression-+-+--><


grouping-set-expressions

    .-,----------------------------.
    |      .-,--------------.      |
    |      V                |      |
    V '-(------expression---+-)-'  |
>>----+-expression--------------+--+-><


ansi-sql-grouping-set-expressions

>>-+-ROLLUP--(--grouping-expression-list--)---------+--><
   +-CUBE--(--grouping-expression-list--)-----------+   
   '-GROUPING SETS--(--grouping-set-expressions--)--'  
{code}
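To make the one-level restriction concrete, here is a minimal sketch of a check for the grouping-set-expressions rule, read literally from the diagram above: a comma-separated list whose elements are either a bare expression or a parenthesized list of expressions. It is not the Spark parser. It assumes expressions are simple identifiers, and `is_one_level` is a hypothetical name used only here. Because the diagram requires at least one expression inside parentheses, this sketch also rejects an empty `()`.

```python
import re

# Assumption for this sketch: an expression is a plain identifier.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
EXPR_LIST = rf"{IDENT}(?:\s*,\s*{IDENT})*"
# One element: a bare expression or a parenthesized expression list.
ELEMENT = rf"(?:{IDENT}|\(\s*{EXPR_LIST}\s*\))"
GROUPING_SET_EXPRESSIONS = re.compile(rf"\s*{ELEMENT}(?:\s*,\s*{ELEMENT})*\s*")


def is_one_level(grouping_sets_body):
    """Return True if the text between GROUPING SETS( and ) has no nesting."""
    return GROUPING_SET_EXPRESSIONS.fullmatch(grouping_sets_body) is not None


print(is_one_level("(col1, col2), col1"))  # True
print(is_one_level("((col1), col2)"))      # False: nested grouping set
```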
 

> Support ANSI-SQL compliant syntax for ROLLUP, CUBE and GROUPING SET
> -------------------------------------------------------------------
>
>                 Key: SPARK-24424
>                 URL: https://issues.apache.org/jira/browse/SPARK-24424
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Priority: Major


