Composite Column Query Modeling

Adam Holmberg Thu, 13 Sep 2012 13:32:02 -0700

I'm modeling a new application and considering the use of SuperColumn vs.
Composite Column paradigms. I understand that SuperColumns are discouraged
in new development, but I'm pondering a query where it seems like
SuperColumns might be better suited.


Consider a CF with a SuperColumn layout as follows:

t = {
  k1: {
    s1: { c1:v1, c2:v2 },
    s2: { c1:v3, c2:v4 },
    s3: { c1:v5, c2:v6 },
    ...
  }
  ...
}

Which might be modeled in CQL3:

CREATE TABLE t (
  k text,
  s text,
  c1 text,
  c2 text,
  PRIMARY KEY (k, s)
);

I know that it is possible to do range slices with either approach.
However, with SuperColumns I can do sparse slice queries with a set (list)
of column names as the SlicePredicate. I understand that the composites API
only returns contiguous slices, but I keep finding myself wanting to do a
query as follows:

SELECT * FROM t WHERE k = 'foo' AND s IN ('s1', 's3');

The question: Is there a recommended technique for emulating sparse column
slices in composites?

One suggestion I've read is to get the entire range and filter client side.
This is pretty punishing if the range is large and the second keys being
queried are sparse. Additionally, there are enough keys being queried that
calling once per key is undesirable.
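
To make the cost of that approach concrete, here is a minimal Python sketch.
It assumes an in-memory sorted list standing in for one wide row; the names
slice_range and sparse_get are hypothetical helpers for illustration, not
part of any Cassandra client API.

```python
from bisect import bisect_left, bisect_right

# Hypothetical stand-in for one wide row: (second_key, value) pairs kept
# sorted by second_key, the way composite columns are ordered within a row.
row = [("s%04d" % i, "v%d" % i) for i in range(1000)]
names = [name for name, _ in row]


def slice_range(start, finish):
    """Return the contiguous slice start..finish (inclusive), like a range slice."""
    lo = bisect_left(names, start)
    hi = bisect_right(names, finish)
    return row[lo:hi]


def sparse_get(wanted):
    """Emulate a sparse slice: fetch min..max of the wanted names, then filter."""
    wanted = set(wanted)
    fetched = slice_range(min(wanted), max(wanted))
    kept = [(name, value) for name, value in fetched if name in wanted]
    # Also report how many columns were transferred to keep only a few.
    return kept, len(fetched)


kept, fetched_count = sparse_get(["s0001", "s0500", "s0999"])
print(len(kept), "kept out of", fetched_count, "fetched")
```

The ratio of fetched to kept columns is the overhead in question: it grows
with the span between the smallest and largest second key requested, not
with the number of keys actually wanted.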

I also realize that I could manually composite k:s as the row key and use
multiget, but this gives away the benefit of having these records proximate
when range queries *are* used.
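
A rough Python sketch of that alternative, assuming a plain dict standing in
for the column family and hypothetical helpers row_key and multiget (again,
not a real client API):

```python
# Hypothetical in-memory stand-in for a column family keyed by "k:s".
store = {
    "foo:s1": {"c1": "v1", "c2": "v2"},
    "foo:s2": {"c1": "v3", "c2": "v4"},
    "foo:s3": {"c1": "v5", "c2": "v6"},
}


def row_key(k, s):
    """Build the manually composited row key."""
    return "%s:%s" % (k, s)


def multiget(keys):
    """Return only the rows that exist, like a multiget by row key."""
    return {key: store[key] for key in keys if key in store}


# Sparse lookup of s1 and s3 under k = 'foo', with no intermediate columns read.
result = multiget([row_key("foo", s) for s in ("s1", "s3")])
print(sorted(result))
```

The sparse lookup is cheap here, but each k:s pair is now its own row, so the
records under a given k are no longer stored together in one ordered row.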

Any input on modeling/query techniques would be appreciated.

Regards,
Adam Holmberg


P.S./Sidebar:
--------------------
This looks to me like a need for 'multiget' at the second key level,
analogous to multiget at the row key level. Could this be implemented in
the server using SlicePredicate.column_names? Is it just an implementation
gap, or is there a technical obstacle I'm overlooking?
