[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

Sylvain Lebresne (JIRA) Tue, 02 Aug 2016 07:12:18 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404044#comment-15404044
 ]


Sylvain Lebresne commented on CASSANDRA-10707:
----------------------------------------------

Last version mostly looks good. The main thing I still don't like is the 
{{filterOnReplica}} method: I feel it's easy to misuse and doesn't feel 
particularly natural. Thinking about this, I feel the underlying issue we're 
trying to solve is more general: the {{DataLimits}} holds state (for paging and 
grouping) which somewhat assumes things are queried sequentially (and in 
order). However, when we do range queries and send queries in parallel to 
nodes, that's not true anymore (except maybe for the first range sent), at 
least not for the queries sent to replicas (we still process them in order on 
the coordinator). So anyway, I think a better way to handle this is to 
acknowledge that fact in {{StorageProxy.getRangeSlice}} and drop any state from 
the sub-range commands sent in parallel. I've tried such a change in the branch 
attached below (which is also rebased).
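
To make the idea concrete, here is a minimal sketch of dropping state from 
sub-range commands before sending them in parallel. It is not the code from the 
branch: {{Limits}}, {{RangeCommand}} and {{RangeSplitter}} are hypothetical, 
simplified stand-ins rather than Cassandra classes, and keeping the state on 
the first sub-range is an assumption based on the "except maybe for the first 
range sent" remark above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for query limits that may carry paging/grouping state. */
final class Limits {
    final int rowLimit;
    /** Last group seen by a previous page or range; null means "no state". */
    final String lastGroupSeen;

    Limits(int rowLimit, String lastGroupSeen) {
        this.rowLimit = rowLimit;
        this.lastGroupSeen = lastGroupSeen;
    }

    boolean hasState() {
        return lastGroupSeen != null;
    }

    /** Same row limit, but with nothing carried over from an earlier range. */
    Limits withoutState() {
        return hasState() ? new Limits(rowLimit, null) : this;
    }
}

/** Hypothetical command for one sub-range of a range query. */
final class RangeCommand {
    final String range;
    final Limits limits;

    RangeCommand(String range, Limits limits) {
        this.range = range;
        this.limits = limits;
    }
}

final class RangeSplitter {
    /** Builds one command per sub-range, to be sent to replicas in parallel. */
    static List<RangeCommand> forParallelDispatch(Limits limits, List<String> subRanges) {
        List<RangeCommand> commands = new ArrayList<>();
        for (int i = 0; i < subRanges.size(); i++) {
            // Assumption: only the first sub-range directly continues the
            // previous query, so it is the only one that keeps the state.
            Limits perRange = (i == 0) ? limits : limits.withoutState();
            commands.add(new RangeCommand(subRanges.get(i), perRange));
        }
        return commands;
    }
}
```

In this sketch the coordinator would keep the original, stateful limits for its 
own in-order processing of the results; only the copies sent to replicas lose 
their state.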

The branch also includes a commit with a few nits, mostly around comments. Feel 
free to ignore some of it if you don't like it.

| [10707-trunk|https://github.com/pcmanus/cassandra/commits/10707-trunk] | 
[utests|http://cassci.datastax.com/job/pcmanus-10707-trunk-testall] | 
[dtests|http://cassci.datastax.com/job/pcmanus-10707-trunk-dtest] |

I'll note that the dtest run has failures, but this is an ongoing problem with 
CI today. Random tests fail with {{Host has been marked down or removed}} but 
you get that on today's trunk run as well: 
http://cassci.datastax.com/view/trunk/job/trunk_dtest/1322/

Anyway, if we can agree on those 2 small commits, then I'm +1 (though we might 
want to wait on CI to stabilize on dtests to make sure).

> Add support for Group By to Select statement
> --------------------------------------------
>
>                 Key: CASSANDRA-10707
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>             Fix For: 3.x
>
>
> Now that Cassandra supports aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP BY partitionKey, clustering0, clustering1;
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
