Re: Composite Column Query Modeling

2012-09-16 Thread aaron morton
>  I may be missing something, but it looks like you pass multiple keys but 
> only a singular SlicePredicate
My bad. 
I was probably thinking "multiple gets" but wrote multigets. 

If collections don't help, you may need to support both query types using 
separate CFs, or add a secondary index on the s value. 
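
As a rough illustration of the "separate CFs" idea, here is a toy in-memory 
sketch in Python. It is not Cassandra client code: the class and method names 
(TwoLayoutStore, range_slice, sparse_get) are made up for the example, and the 
two dicts only stand in for two column families holding the same data, one 
keyed by k for contiguous slices and one keyed by (k, s) for sparse lookups.

```python
class TwoLayoutStore:
    """Toy model of writing the same data to two layouts (hypothetical names)."""

    def __init__(self):
        self.by_k = {}     # layout 1: k -> {s: value}, read as an ordered slice
        self.by_k_s = {}   # layout 2: (k, s) -> value, read by exact key

    def put(self, k, s, value):
        # Every write goes to both layouts; that is the cost of the approach.
        self.by_k.setdefault(k, {})[s] = value
        self.by_k_s[(k, s)] = value

    def range_slice(self, k, start, end):
        # Contiguous slice over the s values of one row, inclusive bounds.
        row = self.by_k.get(k, {})
        return [(s, row[s]) for s in sorted(row) if start <= s <= end]

    def sparse_get(self, k, names):
        # Point lookups for an arbitrary (sparse) set of s values.
        return [(s, self.by_k_s[(k, s)]) for s in names if (k, s) in self.by_k_s]


store = TwoLayoutStore()
for s, v in [("s1", "v1"), ("s2", "v3"), ("s3", "v5")]:
    store.put("foo", s, v)

contiguous = store.range_slice("foo", "s1", "s2")
sparse = store.sparse_get("foo", ["s1", "s3"])
```

The trade-off is the usual one for denormalised layouts: each write is done 
twice, and the application has to keep the two copies consistent.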

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/09/2012, at 3:08 AM, Adam Holmberg  wrote:

> I think what you're describing might give me what I'm after, but I don't see 
> how I can pass different column slices in a multiget call. I may be missing 
> something, but it looks like you pass multiple keys but only a singular 
> SlicePredicate. Please let me know if that's not what you meant.
> 
> I'm aware of CQL3 collections, but I don't think they quite suit my needs in 
> this case.
> 
> Thanks for the suggestions!
> 
> Adam
> 
> On Fri, Sep 14, 2012 at 1:56 AM, aaron morton  wrote:
> You _could_ use one wide row and do a multiget against the same row for 
> different column slices. Would be less efficient than a single get against 
> the row. But you could still do big contiguous column slices. 
> 
> You may get some benefit from the collections in CQL 3 
> http://www.datastax.com/dev/blog/cql3_collections
> 
> Hope that helps. 
> 
> 
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 14/09/2012, at 8:31 AM, Adam Holmberg  wrote:
> 
>> I'm modeling a new application and considering the use of SuperColumn vs. 
>> Composite Column paradigms. I understand that SuperColumns are discouraged 
>> in new development, but I'm pondering a query where it seems like 
>> SuperColumns might be better suited.
>> 
>> Consider a CF with SuperColumn layout as follows
>> 
>> t = {
>>   k1: {
>>     s1: { c1:v1, c2:v2 },
>>     s2: { c1:v3, c2:v4 },
>>     s3: { c1:v5, c2:v6 }
>>     ...
>>   }
>>   ...
>> }
>> 
>> Which might be modeled in CQL3:
>> 
>> CREATE TABLE t (
>>   k text,
>>   s text,
>>   c1 text,
>>   c2 text,
>>   PRIMARY KEY (k, s)
>> );
>> 
>> I know that it is possible to do range slices with either approach. However, 
>> with SuperColumns I can do sparse slice queries with a set (list) of column 
>> names as the SlicePredicate. I understand that the composites API only 
>> returns contiguous slices, but I keep finding myself wanting to do a query 
>> as follows:
>> 
>> SELECT * FROM t WHERE k = 'foo' AND s IN (1,3);
>> 
>> The question: Is there a recommended technique for emulating sparse column 
>> slices in composites?
>> 
>> One suggestion I've read is to get the entire range and filter client side. 
>> This is pretty punishing if the range is large and the second keys being 
>> queried are sparse. Additionally, there are enough keys being queried that 
>> calling once per key is undesirable.
>> 
>> I also realize that I could manually composite k:s as the row key and use 
>> multiget, but this gives away the benefit of having these records proximate 
>> when range queries are used. 
>> 
>> Any input on modeling/query techniques would be appreciated.
>> 
>> Regards,
>> Adam Holmberg
>> 
>> 
>> P.S./Sidebar:
>> 
>> What this seems like to me is a desire for 'multiget' at the second key 
>> level analogous to multiget at the row key level. Is this something that 
>> could be implemented in the server using SlicePredicate.column_names? Is 
>> this just an implementation gap, or is there something technical I'm 
>> overlooking? 
> 
> 



Re: Composite Column Query Modeling

2012-09-14 Thread Hiller, Dean
There is another trick here.  On the playOrm open source project, we need to do 
a sparse query for a join and so we send out 100 async requests and cache up 
the java "Future" objects and return the first needed result back without 
waiting for the others.  With the S-SQL in playOrm, we have the IN clause coming 
soon as well in which we will use the same technique so as you iterate over the 
1, 3, 29, 56 rows, results may still be coming in as it only blocks if it gets 
to 3 and the result for 3 has not come back yet.

Anyways, just an option.  (ps. This option helps us query a 1,000,000 row 
partition in 60ms ;) and we still haven't added the lookahead cursors which 
should speed some systems up as well as it fetches stuff while you are working 
on the first batch of results)
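
A rough sketch of the pattern described above, written in Python rather than 
Java and not taken from playOrm. The fetch function and the ROW dict are 
hypothetical stand-ins for a real slice query and its data. All requests are 
submitted up front, the futures are kept in request order, and iteration only 
blocks when the next result it needs has not arrived yet.

```python
from concurrent.futures import ThreadPoolExecutor
import time

# Hypothetical stored data; a real client would query the cluster instead.
ROW = {1: "v1", 3: "v3", 29: "v29", 56: "v56"}

def fetch(s):
    """Stand-in for one remote lookup of column `s`."""
    time.sleep(0.01)  # simulate network latency
    return s, ROW.get(s)

def sparse_fetch(wanted, pool):
    # Fire every request immediately and remember the futures in order.
    futures = [pool.submit(fetch, s) for s in wanted]
    for fut in futures:
        # Blocks only if this particular result is not back yet;
        # the remaining requests keep running in the background.
        yield fut.result()

with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(sparse_fetch([1, 3, 29, 56], pool))
```

Because results are yielded in request order, the caller sees 1, 3, 29, 56 
even if the responses arrive in a different order.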

Later,
Dean

From: Adam Holmberg <adam.holmberg.l...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Friday, September 14, 2012 9:08 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Composite Column Query Modeling

I think what you're describing might give me what I'm after, but I don't see 
how I can pass different column slices in a multiget call. I may be missing 
something, but it looks like you pass multiple keys but only a singular 
SlicePredicate. Please let me know if that's not what you meant.

I'm aware of CQL3 collections, but I don't think they quite suit my needs in 
this case.

Thanks for the suggestions!

Adam

On Fri, Sep 14, 2012 at 1:56 AM, aaron morton <aa...@thelastpickle.com> wrote:
You _could_ use one wide row and do a multiget against the same row for 
different column slices. Would be less efficient than a single get against the 
row. But you could still do big contiguous column slices.

You may get some benefit from the collections in CQL 3 
http://www.datastax.com/dev/blog/cql3_collections

Hope that helps.


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 8:31 AM, Adam Holmberg <adam.holmberg.l...@gmail.com> wrote:

I'm modeling a new application and considering the use of SuperColumn vs. 
Composite Column paradigms. I understand that SuperColumns are discouraged in 
new development, but I'm pondering a query where it seems like SuperColumns 
might be better suited.

Consider a CF with SuperColumn layout as follows

t = {
  k1: {
    s1: { c1:v1, c2:v2 },
    s2: { c1:v3, c2:v4 },
    s3: { c1:v5, c2:v6 }
    ...
  }
  ...
}

Which might be modeled in CQL3:

CREATE TABLE t (
  k text,
  s text,
  c1 text,
  c2 text,
  PRIMARY KEY (k, s)
);

I know that it is possible to do range slices with either approach. However, 
with SuperColumns I can do sparse slice queries with a set (list) of column 
names as the SlicePredicate. I understand that the composites API only returns 
contiguous slices, but I keep finding myself wanting to do a query as follows:

SELECT * FROM t WHERE k = 'foo' AND s IN (1,3);

The question: Is there a recommended technique for emulating sparse column 
slices in composites?

One suggestion I've read is to get the entire range and filter client side. 
This is pretty punishing if the range is large and the second keys being 
queried are sparse. Additionally, there are enough keys being queried that 
calling once per key is undesirable.

I also realize that I could manually composite k:s as the row key and use 
multiget, but this gives away the benefit of having these records proximate 
when range queries are used.

Any input on modeling/query techniques would be appreciated.

Regards,
Adam Holmberg


P.S./Sidebar:

What this seems like to me is a desire for 'multiget' at the second key level 
analogous to multiget at the row key level. Is this something that could be 
implemented in the server using SlicePredicate.column_names? Is this just an 
implementation gap, or is there something technical I'm overlooking?




Re: Composite Column Query Modeling

2012-09-14 Thread Adam Holmberg
I think what you're describing might give me what I'm after, but I don't
see how I can pass different column slices in a multiget call. I may be
missing something, but it looks like you pass multiple keys but only a
singular SlicePredicate. Please let me know if that's not what you meant.

I'm aware of CQL3 collections, but I don't think they quite suit my needs
in this case.

Thanks for the suggestions!

Adam

On Fri, Sep 14, 2012 at 1:56 AM, aaron morton wrote:

> You _could_ use one wide row and do a multiget against the same row for
> different column slices. Would be less efficient than a single get against
> the row. But you could still do big contiguous column slices.
>
> You may get some benefit from the collections in CQL 3
> http://www.datastax.com/dev/blog/cql3_collections
>
> Hope that helps.
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 14/09/2012, at 8:31 AM, Adam Holmberg wrote:
>
> I'm modeling a new application and considering the use of SuperColumn vs.
> Composite Column paradigms. I understand that SuperColumns are discouraged
> in new development, but I'm pondering a query where it seems like
> SuperColumns might be better suited.
>
> Consider a CF with SuperColumn layout as follows
>
> t = {
>   k1: {
>     s1: { c1:v1, c2:v2 },
>     s2: { c1:v3, c2:v4 },
>     s3: { c1:v5, c2:v6 }
>     ...
>   }
>   ...
> }
>
> Which might be modeled in CQL3:
>
> CREATE TABLE t (
>   k text,
>   s text,
>   c1 text,
>   c2 text,
>   PRIMARY KEY (k, s)
> );
>
> I know that it is possible to do range slices with either approach.
> However, with SuperColumns I can do sparse slice queries with a set (list)
> of column names as the SlicePredicate. I understand that the composites API
> only returns contiguous slices, but I keep finding myself wanting to do a
> query as follows:
>
> SELECT * FROM t WHERE k = 'foo' AND s IN (1,3);
>
> The question: Is there a recommended technique for emulating sparse column
> slices in composites?
>
> One suggestion I've read is to get the entire range and filter client
> side. This is pretty punishing if the range is large and the second keys
> being queried are sparse. Additionally, there are enough keys being queried
> that calling once per key is undesirable.
>
> I also realize that I could manually composite k:s as the row key and use
> multiget, but this gives away the benefit of having these records proximate
> when range queries *are* used.
>
> Any input on modeling/query techniques would be appreciated.
>
> Regards,
> Adam Holmberg
>
>
> P.S./Sidebar:
> 
> What this seems like to me is a desire for 'multiget' at the second key
> level analogous to multiget at the row key level. Is this something that
> could be implemented in the server using SlicePredicate.column_names? Is
> this just an implementation gap, or is there something technical I'm
> overlooking?
>
>
>


Re: Composite Column Query Modeling

2012-09-14 Thread aaron morton
You _could_ use one wide row and do a multiget against the same row for 
different column slices. Would be less efficient than a single get against the 
row. But you could still do big contiguous column slices. 

You may get some benefit from the collections in CQL 3 
http://www.datastax.com/dev/blog/cql3_collections

Hope that helps. 


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 8:31 AM, Adam Holmberg  wrote:

> I'm modeling a new application and considering the use of SuperColumn vs. 
> Composite Column paradigms. I understand that SuperColumns are discouraged in 
> new development, but I'm pondering a query where it seems like SuperColumns 
> might be better suited.
> 
> Consider a CF with SuperColumn layout as follows
> 
> t = {
>   k1: {
>     s1: { c1:v1, c2:v2 },
>     s2: { c1:v3, c2:v4 },
>     s3: { c1:v5, c2:v6 }
>     ...
>   }
>   ...
> }
> 
> Which might be modeled in CQL3:
> 
> CREATE TABLE t (
>   k text,
>   s text,
>   c1 text,
>   c2 text,
>   PRIMARY KEY (k, s)
> );
> 
> I know that it is possible to do range slices with either approach. However, 
> with SuperColumns I can do sparse slice queries with a set (list) of column 
> names as the SlicePredicate. I understand that the composites API only 
> returns contiguous slices, but I keep finding myself wanting to do a query as 
> follows:
> 
> SELECT * FROM t WHERE k = 'foo' AND s IN (1,3);
> 
> The question: Is there a recommended technique for emulating sparse column 
> slices in composites?
> 
> One suggestion I've read is to get the entire range and filter client side. 
> This is pretty punishing if the range is large and the second keys being 
> queried are sparse. Additionally, there are enough keys being queried that 
> calling once per key is undesirable.
> 
> I also realize that I could manually composite k:s as the row key and use 
> multiget, but this gives away the benefit of having these records proximate 
> when range queries are used. 
> 
> Any input on modeling/query techniques would be appreciated.
> 
> Regards,
> Adam Holmberg
> 
> 
> P.S./Sidebar:
> 
> What this seems like to me is a desire for 'multiget' at the second key level 
> analogous to multiget at the row key level. Is this something that could be 
> implemented in the server using SlicePredicate.column_names? Is this just an 
> implementation gap, or is there something technical I'm overlooking? 



Composite Column Query Modeling

2012-09-13 Thread Adam Holmberg
I'm modeling a new application and considering the use of SuperColumn vs.
Composite Column paradigms. I understand that SuperColumns are discouraged
in new development, but I'm pondering a query where it seems like
SuperColumns might be better suited.

Consider a CF with SuperColumn layout as follows

t = {
  k1: {
    s1: { c1:v1, c2:v2 },
    s2: { c1:v3, c2:v4 },
    s3: { c1:v5, c2:v6 }
    ...
  }
  ...
}

Which might be modeled in CQL3:

CREATE TABLE t (
  k text,
  s text,
  c1 text,
  c2 text,
  PRIMARY KEY (k, s)
);

I know that it is possible to do range slices with either approach.
However, with SuperColumns I can do sparse slice queries with a set (list)
of column names as the SlicePredicate. I understand that the composites API
only returns contiguous slices, but I keep finding myself wanting to do a
query as follows:

SELECT * FROM t WHERE k = 'foo' AND s IN ('s1', 's3');

The question: Is there a recommended technique for emulating sparse column
slices in composites?

One suggestion I've read is to get the entire range and filter client side.
This is pretty punishing if the range is large and the second keys being
queried are sparse. Additionally, there are enough keys being queried that
calling once per key is undesirable.
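
To put a rough number on the "fetch the range and filter client side" cost, 
here is a small Python sketch. It is only an in-memory model: the row contents, 
the column names (s0000 to s0999) and the slice_and_filter helper are invented 
for illustration, and no Cassandra client is involved.

```python
def slice_and_filter(row, wanted):
    """Fetch the contiguous range spanning `wanted`, then filter locally."""
    lo, hi = min(wanted), max(wanted)
    # What a contiguous slice would return: everything between lo and hi.
    fetched = [(s, v) for s, v in sorted(row.items()) if lo <= s <= hi]
    # What the application actually wanted out of that slice.
    kept = [(s, v) for s, v in fetched if s in wanted]
    return fetched, kept

# Hypothetical wide row with 1,000 columns named s0000 .. s0999.
row = {f"s{i:04d}": f"v{i}" for i in range(1000)}
wanted = {"s0001", "s0998"}

fetched, kept = slice_and_filter(row, wanted)
transferred, useful = len(fetched), len(kept)
```

With these made-up numbers the slice transfers 998 columns to return 2, which 
is the punishing case when the wanted second keys are sparse and far apart.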

I also realize that I could manually composite k:s as the row key and use
multiget, but this gives away the benefit of having these records proximate
when range queries *are* used.

Any input on modeling/query techniques would be appreciated.

Regards,
Adam Holmberg


P.S./Sidebar:

What this seems like to me is a desire for 'multiget' at the second key
level analogous to multiget at the row key level. Is this something that
could be implemented in the server using SlicePredicate.column_names? Is
this just an implementation gap, or is there something technical I'm
overlooking?