On Wed, Apr 28, 2010 at 5:24 AM, David Boxenhorn <da...@lookin2.com> wrote:
> If I understand correctly, the distinction between supercolumns and
> subcolumns is critical to good database design if you want to use random
> partitioning: you can do range queries on subcolumns but not on
> supercolumns.
>
> Is this correct?
>

You can do efficient range queries of normal (not super) columns in a
ColumnFamily. I think the subcolumns inside a SuperColumn are not indexed,
so slicing subcolumns out of a single SuperColumn gets less efficient when
it holds lots of subcolumns.

I agree that SuperColumns are technically unnecessary. I can't come up with
a use case that a SuperColumn satisfies and normal Columns can't. You can
simulate SuperColumn behavior by concatenating the key parts with a
separator, using the concatenated key as your column name, and then doing a
slice. So if you had a SuperColumn per username, with subcolumns that stored
document IDs, you could instead have a normal CF whose column names look
like <username>:<document-id>.

The only thing SuperColumns appear to buy you (as someone pointed out to me
at the Cassandra meetup; I think it was Eric Florenzano) is that you can use
different comparator types for the SuperColumns and the subcolumns, I
guess..? But you should be able to do the same thing by creating your own
Column comparator.

I guess my point is that SuperColumns are mostly a convenience mechanism, as
far as I can tell.

Mike
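
For illustration, here is a minimal sketch of the concatenated-column-name idea described above. It is plain Python, not Cassandra client code: the `Row` class is a hypothetical stand-in for one row of a ColumnFamily whose columns are kept sorted by name, and `composite` and `docs_for` are made-up helper names. It assumes usernames never contain the separator and that names compare as plain strings. The toy slice is half-open, which is a simplification chosen for the sketch and not a claim about how Cassandra's own slices behave.

```python
import bisect

SEP = ":"  # assumed separator; must never appear inside a username


class Row:
    """Toy stand-in for one row: columns are kept sorted by column name."""

    def __init__(self):
        self._names = []   # sorted column names
        self._values = {}  # column name -> value

    def insert(self, name, value):
        # Keep the name list sorted; overwrite the value if the column exists.
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice(self, start, finish):
        """Return (name, value) pairs with start <= name < finish."""
        lo = bisect.bisect_left(self._names, start)
        hi = bisect.bisect_left(self._names, finish)
        return [(n, self._values[n]) for n in self._names[lo:hi]]


def composite(username, document_id):
    """Build a column name of the form <username>:<document-id>."""
    return f"{username}{SEP}{document_id}"


def docs_for(row, username):
    """Slice out every document ID stored under one username."""
    start = username + SEP
    # The character right after SEP gives the first name that sorts
    # after every "<username>:..." column.
    finish = username + chr(ord(SEP) + 1)
    return [name[len(start):] for name, _ in row.slice(start, finish)]
```

With columns inserted for `alice`, `alicia`, and `bob`, calling `docs_for(row, "alice")` returns only alice's document IDs, in sorted order, because the slice bounds bracket exactly the names that start with `alice:`.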