[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142012#comment-15142012
 ] 

DOAN DuyHai commented on CASSANDRA-11153:
-

OK, I see your point for those scenarios, but adding OFFSET will require 
updating the CQL syntax. I don't know how the core dev team feels about it.

-- digression ON --
Now, if we ever add the *OFFSET* keyword to CQL, I can see another interesting 
use case for future integration with Kafka: SELECT payload FROM TOPIC ... WHERE 
partition = ... AND OFFSET xyz;
-- digression OFF --

> Row offset within a partition
> -
>
> Key: CASSANDRA-11153
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11153
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Ryan Svihla
> Priority: Minor
>
> While doing this across partitions would be awful, inside of a partition this 
> seems like a reasonable request. Something like:
> SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' LIMIT 100 OFFSET 100
> with a schema such as:
> CREATE TABLE my_table (bucket timestamp, id timeuuid, value text, PRIMARY 
> KEY(bucket, id));
> This could ease pain in migration of legacy use cases and I'm not convinced 
> the read cost has to be horrible when it's inside of a single partition.
> EDIT: I'm aware there is already an issue 
> https://issues.apache.org/jira/browse/CASSANDRA-6511. I think the partition 
> key requirement is where we get enough performance to provide the flexibility 
> in dealing with legacy apps that are stuck on a 'go to page 8' concept for 
> their application flow without incurring a huge hit scanning a cluster and 
> tossing the first 5 nodes' results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15142003#comment-15142003
 ] 

Ryan Svihla commented on CASSANDRA-11153:
-

The use case is something like this:

1. Stateless servers, so all useful data has to be passed over the URL 
(requests can bounce randomly between servers, so otherwise you start talking 
about a shared cache for the state).
2. Permalinks based on page number to never-changing data.

The only valid alternative is for them to retain the "start id", which still 
requires figuring out what that is in the first place, so they would still 
have to do something like this client side.

Again, this is not how I would ever design an application, but it is hyper 
common in legacy use cases, and it would be nice to give them some approach 
that is "fast enough" for their needs.



[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread DOAN DuyHai (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141995#comment-15141995
 ] 

DOAN DuyHai commented on CASSANDRA-11153:
-

If we're talking about paging, why don't people pass the paging state, 
serialized as a string, to the front end and save themselves the hassle of 
managing it manually with an offset?
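
As a rough illustration of passing a serialized state to the front end, here 
is a minimal sketch. The token layout (a JSON object holding the bucket and 
the last id seen) and the names encode_token, decode_token, and the "state" 
URL parameter are hypothetical stand-ins, not the driver's actual paging state 
format.

```python
import base64
import json


def encode_token(state: dict) -> str:
    """Serialize a paging position into a URL-safe string (hypothetical format)."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token: str) -> dict:
    """Recover the paging position from a token produced by encode_token."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))


# The server hands this token to the front end instead of a page number.
position = {"bucket": "2015-10-10 12:00:00", "last_id": 42}
token = encode_token(position)
next_url = f"http://foo.com/?state={token}"

# On the next request, any stateless server can decode it and resume.
resumed = decode_token(token)
```

The trade-off is that such a token only supports "next page" navigation; it 
does not by itself answer a "go to page 8" request.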



[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141673#comment-15141673
 ] 

Ryan Svihla commented on CASSANDRA-11153:
-

OK, thinking about this more, this is basically a "what is worse" option. 
Folks who need to do horrible paging queries because of some legacy interface 
that they pass around (http://foo.com/?page=9) will just read the whole 
partition and throw away the extra rows, AKA lousy client-side mode.

If we required this to be an "ALLOW FILTERING" option, I think that would tag 
it as a bad idea but still enable people to do something less horrible than 
what they're already doing.
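
To make the cost of the client-side approach concrete, here is a minimal 
sketch. The generator fetch_partition is a hypothetical stand-in for a paged 
read of a single partition, and the page size of 100 and partition size of 
10,000 rows are assumptions chosen for illustration.

```python
def fetch_partition(total_rows, counter):
    """Hypothetical stand-in for streaming one partition's rows in order."""
    for i in range(total_rows):
        counter["rows_read"] += 1  # count every row pulled from the "server"
        yield (i, f"value-{i}")


def page_by_number(page, page_size, total_rows, counter):
    """Emulate OFFSET on the client: read rows in order and drop the leading ones."""
    offset = (page - 1) * page_size
    wanted = []
    for index, row in enumerate(fetch_partition(total_rows, counter)):
        if index < offset:
            continue  # row was transferred only to be thrown away
        wanted.append(row)
        if len(wanted) == page_size:
            break
    return wanted


counter = {"rows_read": 0}
page9 = page_by_number(9, 100, 10_000, counter)
# Serving 100 rows for ?page=9 required reading 900 rows in this sketch.
```

This version at least stops once the page is full; reading the whole 
partition, as described above, would transfer all 10,000 rows.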



[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread Ryan Svihla (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141646#comment-15141646
 ] 

Ryan Svihla commented on CASSANDRA-11153:
-

That's true of pagers in general, and people still implement them anyway. But 
if the data isn't updated, I'm not sure why it wouldn't be consistent.



[jira] [Commented] (CASSANDRA-11153) Row offset within a partition

2016-02-10 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141641#comment-15141641
 ] 

Brandon Williams commented on CASSANDRA-11153:
--

I suspect a numeric offset isn't going to provide a consistent result, which is 
why we allow something like:

SELECT * FROM my_table WHERE bucket='2015-10-10 12:00:00' AND id > 'foo' LIMIT 100;
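
A small sketch of why the two approaches can diverge, using an in-memory list 
as a hypothetical stand-in for one partition sorted by its clustering key. The 
integer ids, the page size of 3, and the helper names page_by_offset and 
page_after are assumptions made for illustration only.

```python
from bisect import bisect_right


def page_by_offset(rows, offset, limit):
    """Numeric offset: skip a fixed number of rows, whatever they are."""
    return rows[offset:offset + limit]


def page_after(rows, last_id, limit):
    """Key-based paging: return rows whose id is strictly greater than last_id."""
    ids = [row_id for row_id, _ in rows]
    start = 0 if last_id is None else bisect_right(ids, last_id)
    return rows[start:start + limit]


# Partition with ids 1..10, sorted by id.
partition = [(i, f"value-{i}") for i in range(1, 11)]

first = page_after(partition, None, 3)  # ids 1, 2, 3
last_seen = first[-1][0]

# A row from the first page is deleted between the two requests.
partition = [row for row in partition if row[0] != 2]

offset_page = page_by_offset(partition, 3, 3)      # ids 5, 6, 7: id 4 is skipped
keyset_page = page_after(partition, last_seen, 3)  # ids 4, 5, 6
```

If the partition never changes, both functions return the same second page, 
which matches the "never changing data" case raised elsewhere in the thread.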
