On Thu, May 6, 2010 at 5:38 PM, Vijay <vijay2...@gmail.com> wrote:
> I would rather be interested in Tree type structure where supercolumns have
> supercolumns in it..... you dont need to compare all the columns to find a
> set of columns and will also reduce the bytes transfered for separator, at
> least string concatenation (Or something like that) for read and write
> column name generation. it is more logically stored and structured by this
> way.... and also we can make caching work better by selectively caching the
> tree (User defined if you will)....
>
> But nothing wrong in supporting both :)
I'm 99% sure we're talking about the same thing, and we don't need to
support both. How names/values are separated is pretty irrelevant - it has
to happen somewhere. I agree that it'd be nice if it happened on the
server, but doing it in the client makes it easier to explore ideas.

On Thu, May 6, 2010 at 5:27 PM, philip andrew <philip14...@gmail.com> wrote:
> Please create a new term word if the existing terms are misleading, if its
> not a file system then its not good to call it a file system.

While it's seriously bikesheddy, I guess you're right. Let's call them
"thingies" for now, then.

So you can have a top-level "thingy," and it can have an arbitrarily
nested tree of sub-"thingies." Each "thingy" has a "thingy type" [1]. You
can also tell Cassandra if you want a particular level of "thingy" to be
indexed. At one (or maybe more) levels you can tell Cassandra you want
your "thingies" to be split onto separate nodes in your cluster. At one
(or maybe more) levels you could also tell Cassandra that you want your
"thingies" split into separate files [2].

The upshot is, the Cassandra data model would go from being "it's a
nested dictionary, just kidding no it's not!" to being "it's a nested
dictionary, for serious."

Again, these are all just ideas... but I think this simplified data model
would allow you to express pretty much any query as a graph of simple
primitives like Predicates, Filters, Aggregations, Transformations, etc.
The indexes would allow you to cheat when evaluating certain types of
queries - if you get a SlicePredicate on an indexed "thingy," for example,
you don't have to enumerate the entire set of "sub-thingies."

So, you'd query your "thingies" by building out a predicate,
transformations, filters, etc., serializing the graph of primitives, and
sending it over the wire to Cassandra. Cassandra would rebuild the graph
and run it over your dataset.
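To make the "nested dictionary, for serious" idea concrete, here's a
minimal Python sketch of the model. Everything here (the Thingy class,
its thingy_type and indexed flags) is a hypothetical illustration of the
proposal, not any actual Cassandra API:

```python
class Thingy:
    """A node in an arbitrarily nested tree. Each node has a type
    (which would define serialization and natural ordering) and can
    be flagged as indexed, so queries at that level could skip a
    full scan of its sub-"thingies"."""

    def __init__(self, thingy_type=str, indexed=False):
        self.thingy_type = thingy_type
        self.indexed = indexed
        self.children = {}   # name -> Thingy (the sub-"thingies")
        self.value = None    # leaf payload, if any

    def __getitem__(self, name):
        return self.children[name]

    def __setitem__(self, name, child):
        self.children[name] = child


# Build the "AwesomeApp" example as one nested tree:
root = Thingy()
root["AwesomeApp"] = Thingy()
root["AwesomeApp"]["user"] = Thingy(indexed=True)  # index/split at this level

alice = Thingy()
for col, val in [("username", "alice"), ("dob", "1985-01-01")]:
    leaf = Thingy()
    leaf.value = val
    alice[col] = leaf
root["AwesomeApp"]["user"]["alice"] = alice

print(root["AwesomeApp"]["user"]["alice"]["username"].value)  # prints "alice"
```

The point is that keyspaces, column families, rows, and columns all
collapse into one recursive structure, and indexing/partitioning become
per-level knobs rather than separate concepts.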
So instead of:

    Cassandra.get_range_slices(
        keyspace="AwesomeApp",
        column_parent=ColumnParent(column_family="user"),
        slice_predicate=SlicePredicate(column_names=['username', 'dob']),
        range=KeyRange(start_key='a', end_key='m'),
        consistency_level=ONE
    )

You'd do something like:

    Cassandra.query(
        SubThingyTransformer(
            NamePredicate(names=["AwesomeApp"]),
            SubThingyTransformer(
                NamePredicate(names=["user"]),
                SubThingyTransformer(
                    SlicePredicate(start="a", end="m"),
                    NamePredicate(names=["username", "dob"])
                )
            )
        ),
        consistency_level=ONE
    )

Which seems complicated, but it's basically just

    [(user['username'], user['dob'])
     for user in Cassandra['AwesomeApp']['user'].slice('a', 'm')]

and could probably be expressed that way in a client library.

I think batch_mutate is awesome the way it is and should be the only way
to insert/update data. I'd rename it mutate. So our interface becomes:

    Cassandra.query(query, consistency_level)
    Cassandra.mutate(mutation, consistency_level)

Ta-da.

Anyways, I was trying to avoid writing all of this out in prose and try
mocking some of it up in code instead, but I guess this works too. Either
way, I do think something like this would simplify the codebase, simplify
the data model, simplify the interface, make the entire system more
flexible, and be generally awesome.

Mike

[1] These could be subclasses of Thingy in Java... or maybe they'd
implement IThingy. Either way, they'd handle serialization and probably
implement compareTo to define a natural ordering. So you'd have classes
like ASCIIThingy, UTF8Thingy, and LongThingy (ahem) - these would replace
comparators.

[2] I think there's another simplification here. Splitting into separate
files is really very similar to splitting onto separate nodes. There might
be a way around some of the row size limitations with this sort of
concept. And we might be able to get better utilization of multiple disks
by giving each disk (or data directory) a subset of the node's token
range. Caveat: this thought is not fully baked.
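P.S. To show that the query graph above really does reduce to the nested
list comprehension, here's a client-side sketch that evaluates the same
three primitives over plain nested dicts. The classes and their
evaluate() protocol are illustrative assumptions about how the graph
might compose, not a real API:

```python
class NamePredicate:
    """Select sub-thingies by exact name."""
    def __init__(self, names):
        self.names = set(names)
    def evaluate(self, thingy):
        return {k: v for k, v in thingy.items() if k in self.names}

class SlicePredicate:
    """Select sub-thingies whose names fall in [start, end] - this is
    where an index on that level would let the server cheat."""
    def __init__(self, start, end):
        self.start, self.end = start, end
    def evaluate(self, thingy):
        return {k: v for k, v in thingy.items()
                if self.start <= k <= self.end}

class SubThingyTransformer:
    """Select sub-thingies with a predicate, then apply an inner
    primitive to each selected subtree."""
    def __init__(self, predicate, inner):
        self.predicate, self.inner = predicate, inner
    def evaluate(self, thingy):
        return {k: self.inner.evaluate(v)
                for k, v in self.predicate.evaluate(thingy).items()}

data = {"AwesomeApp": {"user": {
    "alice": {"username": "alice", "dob": "1985", "email": "a@x"},
    "bob":   {"username": "bob",   "dob": "1990", "email": "b@x"},
    "zoe":   {"username": "zoe",   "dob": "1992", "email": "z@x"},
}}}

query = SubThingyTransformer(
    NamePredicate(names=["AwesomeApp"]),
    SubThingyTransformer(
        NamePredicate(names=["user"]),
        SubThingyTransformer(
            SlicePredicate(start="a", end="m"),
            NamePredicate(names=["username", "dob"]))))

result = query.evaluate(data)
# "alice" and "bob" fall in the slice ["a", "m"]; "zoe" does not,
# and "email" is pruned by the innermost NamePredicate.
```

Serializing this graph and rebuilding it server-side is the part the
sketch leaves out, but the evaluation itself is just recursion over a
nested dictionary.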