[jira] [Comment Edited] (CASSANDRA-9767) Issue in count with results in the CQL

Ajay (JIRA) Thu, 09 Jul 2015 05:20:22 -0700

    [ https://issues.apache.org/jira/browse/CASSANDRA-9767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14620397#comment-14620397 ]


Ajay edited comment on CASSANDRA-9767 at 7/9/15 12:19 PM:
----------------------------------------------------------

This bug is for the query below:

 select count(*), country from sample where track_id = 1 and user_id = 1;

Bad Request: line 1:15 mismatched input ',' expecting K_FROM.


    Also in SQL, it is possible to select columns along with aggregate 
functions.

It is true but the value returned for the columns is database dependent. MySQL, 
for example, returns any random value from within the group.

Yes, it is database dependent. MySQL returns the first value from the group.

 In your example, you already know the values so I am not sure of what would be
the benefit of having C* return them.

What I meant is that all rows for the given partition key have the same value
for the column, so it does not matter whether it returns a random value or the
first one. With such a query, we can get the count along with some columns as
required.
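
Until a query like this is accepted, the same result can be computed on the
client. The following is only a minimal sketch under stated assumptions: the
function name count_with_column and the sample rows are hypothetical, the rows
are plain dictionaries rather than results from a Cassandra driver, and the
check that every row shares one value is my own addition to make the
"same value for the column" assumption explicit.

```python
def count_with_column(rows, column):
    """Return (row_count, shared_value) for rows that all hold one value in `column`.

    Hypothetical helper: it mimics what "select count(*), country ..." would
    return if every matching row had the same country.
    """
    rows = list(rows)
    if not rows:
        return 0, None
    # Collect the distinct values so a violated assumption is detected.
    values = {row[column] for row in rows}
    if len(values) > 1:
        raise ValueError(f"column {column!r} has more than one value: {sorted(values)}")
    return len(rows), values.pop()


# Made-up rows standing in for the result of a plain select on track_id = 1.
rows = [
    {"track_id": 1, "user_id": 1, "country": "IN"},
    {"track_id": 1, "user_id": 2, "country": "IN"},
]

print(count_with_column(rows, "country"))  # prints (2, 'IN')
```

This still transfers every row to the client, which is the cost the server-side
query would avoid.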


was (Author: ajaygarga):
This bug is for the query below:

 select count(*), country from sample where track_id = 1 and user_id = 1;

Bad Request: line 1:15 mismatched input ',' expecting K_FROM.


    Also in SQL, it is possible to select columns along with aggregate 
functions.

It is true but the value returned for the columns is database dependent. MySQL, 
for example, returns any random value from within the group.

Yes, it is database dependent. MySQL returns the first value from the group.

 In your example, you already know the values so I am not sure of what would be
the benefit of having C* return them.

What I meant is that all rows have the same value for the column, so it does
not matter whether it returns a random value or the first one. With such a
query, we can get the count along with some columns as required.

> Issue in count with results in the CQL
> --------------------------------------
>
>                 Key: CASSANDRA-9767
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9767
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Cassandra 2.0.16
> Ubuntu 15.04
>            Reporter: Ajay
>             Fix For: 2.0.x
>
>
> Let's assume we have a column family as below:
> create table sample ( track_id int, user_id int, country varchar, primary key 
> ((track_id), user_id));
> where track_id is the partition key.
> Now, to count the number of rows for a single track_id, we can query using 
> CQL as below:
> select count(*) from sample where track_id = 1 and user_id = 1;
> But that will return only the count. If we need the other columns along with 
> the count, we cannot query as below, as it throws an error:
>  select count(*), user_id  from sample where track_id = 1 and user_id = 1;
> Bad Request: line 1:15 mismatched input ',' expecting K_FROM.
> In this case, all rows for a given track_id and user_id will have the same 
> value for country. So we should be able to query as above.  Also in SQL, it 
> is possible to select columns along with aggregate functions.
> Though I know that Cassandra is not an analytics engine (unlike Hadoop and 
> Spark), we need some basic aggregate functions like min, max, avg, etc. 
> Though it might not be efficient performance-wise, it is better done on the 
> Cassandra side (as it uses the native protocol) than fetching all rows to the 
> client and doing the basic aggregation there. It cannot be used just as a 
> data store (as garbage-in, garbage-out). In that context, CQL is currently 
> pretty limited. Just for getting data out of Cassandra, we will have to use 
> Spark even though we will not be doing much analytics on it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
