[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179992#comment-15179992
 ] 

 Brian Hess commented on CASSANDRA-10707:
-----------------------------------------

A quick question/clarification on the GROUP BY and ORDER BY discussion from 
above.  The following are valid SQL:

{code:SQL}
SELECT pkey, ccol1, Max(x) As maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
ccol1, pkey;
SELECT pkey, ccol1, Max(x) As maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
maxx;
{code}

I think you are suggesting that the only real ordering that is allowed is the 
native ordering in the CQL tables.  Specifically, these 2 queries would not be 
supported.  Is that correct?

I think the logic is that the following CQL

{code:SQL}
SELECT pkey, ccol1, Max(x) AS maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
pkey, ccol1
{code}

turns into

{code:SQL}
SELECT pkey, ccol1, Max(x) AS maxx FROM (
  SELECT pkey, ccol1, x FROM myTable ORDER BY pkey, ccol1
) AS sub1 GROUP BY pkey, ccol1;
{code}

That means that the ORDER BY clause must work in that inner query as valid CQL. 
 More generally:

{code:SQL}
SELECT [grouping columns], [aggregate function]([aggregate columns]) FROM 
[table] GROUP BY [grouping columns] ORDER BY [ordering columns]
{code}

must be rewritable as:

{code:SQL}
SELECT [grouping columns], [aggregate function]([aggregate columns]) FROM (
  SELECT [grouping columns], [aggregate columns] FROM [table] ORDER BY 
[ordering columns]
) AS sub1 GROUP BY [grouping columns]
{code}

And specifically that the inner sub-query is valid CQL, namely:

{code:SQL}
SELECT [grouping columns], [aggregate columns] FROM [table] ORDER BY [ordering 
columns]
{code}
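
To make the rewrite concrete, here is a small Python sketch of "order in the inner query, then group". It is only an illustration of the semantics described above, not Cassandra's implementation. The sample rows and the helper name `group_max_over_ordered` are made up, and the grouping step is assumed to aggregate runs of consecutive rows, which is what grouping over an already ordered stream amounts to.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows of myTable as (pkey, ccol1, x) tuples, in arbitrary order.
ROWS = [
    (2, 1, 40),
    (1, 2, 10),
    (1, 1, 30),
    (1, 1, 5),
    (2, 1, 7),
    (1, 2, 20),
]


def group_max_over_ordered(rows, order_key, group_key):
    """Sort rows by order_key (the inner ORDER BY), then take Max(x) over
    each run of consecutive rows sharing group_key (the outer GROUP BY)."""
    ordered = sorted(rows, key=order_key)
    return [
        key + (max(row[2] for row in run),)
        for key, run in groupby(ordered, key=group_key)
    ]


by_pkey_ccol1 = itemgetter(0, 1)

# Inner ORDER BY pkey, ccol1: every group is one contiguous run.
result = group_max_over_ordered(ROWS, by_pkey_ccol1, by_pkey_ccol1)

# Inner ORDER BY x (standing in for ORDER BY maxx): rows of the same
# group are no longer adjacent, so the groups come out fragmented.
fragmented = group_max_over_ordered(ROWS, itemgetter(2), by_pkey_ccol1)

print(result)           # one row per (pkey, ccol1)
print(len(fragmented))  # more rows than there are distinct groups
```

With an ordering that matches the grouping columns, each group comes out once and in order. With an ordering on the aggregated value, the same group shows up in several pieces, which is one way to see why ORDER BY maxx cannot be pushed into the inner query.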

That is certainly different from SQL, which does not have this restriction.  
I'm +0.5 on having the syntax be the same as SQL as I think it is slightly 
better than the alternative.  I'm just noting that the semantics really are a 
bit different and there are more restrictions with the ORDER BY clause in CQL 
with this ticket than in SQL.  That nuance needs to be called out in the 
documentation or folks will certainly run into the error.

I would also add that if someone uses an incorrect ORDER BY, the error should 
not only call out that it is an error, but also indicate what sorts of ORDER BY 
clauses are supported.
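
As a rough sketch of what such an error could look like, the Python below rejects an unsupported ORDER BY and lists the supported ones. The function name `check_order_by`, the rule that the ordering columns must be a leading prefix of the table's native column order, and the message wording are all assumptions for illustration; none of them is taken from the patch.

```python
def check_order_by(order_by, native_order):
    """Raise ValueError unless order_by is a non-empty prefix of native_order.

    The prefix rule and the message wording are assumptions for
    illustration, not the behaviour of any Cassandra release.
    """
    n = len(order_by)
    if 0 < n <= len(native_order) and list(order_by) == list(native_order[:n]):
        return
    # Enumerate every supported clause so the user sees what would work.
    supported = [
        ", ".join(native_order[:i]) for i in range(1, len(native_order) + 1)
    ]
    raise ValueError(
        "Unsupported ORDER BY (%s). Supported ORDER BY clauses: %s"
        % (", ".join(order_by), "; ".join(supported))
    )


# Accepted: matches the native ordering.
check_order_by(["pkey", "ccol1"], ["pkey", "ccol1"])

# Rejected: ordering on the aggregate alias.
try:
    check_order_by(["maxx"], ["pkey", "ccol1"])
except ValueError as err:
    message = str(err)
    print(message)
```

The point is only that the message names the alternatives, so the user does not have to look up the rule after hitting the error.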

> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>
> Now that Cassandra supports aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP 
> BY partitionKey, clustering0, clustering1; 
> {code}



