Incidentally, is there any specific reason the collation has to be pre-defined at the CF? What if any column could be an optional supercolumn with a collation set at runtime? Then all CFs would be the same.
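To make the question concrete, here is a rough Python sketch of what I have in mind. It borrows the shape of the recursive Column struct quoted below (name, optional subcolumns, optional timestamp and value) and adds a per-column collation chosen at runtime. Everything here is an assumption for illustration only: the `Column` class, the `collation` field, and the `bytes_order` / `numeric_order` helpers are hypothetical names, not anything in Cassandra or its Thrift interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


# Hypothetical collations: each maps a column name to a sort key.
def bytes_order(name: bytes) -> bytes:
    return name


def numeric_order(name: bytes) -> int:
    return int(name.decode("ascii"))


@dataclass
class Column:
    name: bytes
    value: Optional[bytes] = None
    timestamp: Optional[int] = None
    subcolumns: Optional[List["Column"]] = None
    # Assumption: the collation is picked per column at runtime,
    # instead of being declared once for the whole CF.
    collation: Callable[[bytes], object] = bytes_order

    def __post_init__(self) -> None:
        # "You can either have the subcolumns, or the timestamp and value."
        if self.subcolumns is not None and self._is_leaf():
            raise ValueError("a column has either subcolumns or a timestamp and value")

    def _is_leaf(self) -> bool:
        return self.value is not None or self.timestamp is not None

    def insert(self, sub: "Column") -> None:
        """Insert (or overwrite) a subcolumn, keeping this column's sort order."""
        if self._is_leaf():
            raise ValueError("cannot add subcolumns to a column that holds a value")
        kept = [c for c in (self.subcolumns or []) if c.name != sub.name]
        kept.append(sub)
        self.subcolumns = sorted(kept, key=lambda c: self.collation(c.name))


timeline = Column(name=b"user_timeline", collation=numeric_order)
timeline.insert(Column(name=b"10", value=b"tweet-b", timestamp=2))
timeline.insert(Column(name=b"2", value=b"tweet-a", timestamp=1))
print([c.name for c in timeline.subcolumns])  # [b'2', b'10']
```

With the default `bytes_order` the same two names would sort as `b"10"`, then `b"2"`, which is the kind of difference a per-column collation would have to pin down.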

Evan

On Wed, Aug 12, 2009 at 10:02 PM, Jonathan Ellis wrote:
> If thrift were sane it would look something like
>
> struct Column {
>   byte[] name,
>   optional list<Column> subcolumns,
>   optional int64 timestamp,
>   optional byte[] value
> }
>
> "you can either have the subcolumns, or the timestamp and value" seems
> reasonable to me.
>
> of course in the real world, thrift can't do recursive structures, so
> we'd have to go with Column/SubColumn like SuperColumn/Column today.
> So... maybe not really an improvement after all. :)
>
> (Why am I not surprised to find out that protocol buffers does support
> this? Sigh.)
>
> On Wed, Aug 12, 2009 at 8:51 PM, Evan Weaver wrote:
>> Hmm, my Ruby client internally refers to columns and subcolumns,
>> rather than supercolumns and columns...mainly because the subcolumn
>> position is optional, but the column_or_supercolumn position is not.
>> So there is something we agree on.
>>
>> Do you think the lack of a timestamp in the supercolumn is confusing?
>> It's still not exactly a kind of column.
>>
>> Evan
>>
>> On Wed, Aug 12, 2009 at 9:47 PM, Jonathan Ellis wrote:
>>> I agree with the proposition that the SuperColumn name is weak.
>>> (Although not, as I mentioned, Column or ColumnFamily.) And I could
>>> go with schema over keyspace.
>>>
>>> One option to deal with SC would be to excise the term SC (and SCF
>>> from the config) and instead just have Columns, which may or may not
>>> have SubColumns. You would define this as
>>>
>>> <ColumnFamily withSubColumns="true" .../>
>>>
>>> "Insert a subcolumn named A into the Column named B" fits pretty well
>>> with how I think of things working. And now you just have Rows and
>>> Columns! Just like a RDB! :P
>>>
>>> -Jonathan
>>>
>>> On Wed, Aug 12, 2009 at 8:34 PM, Evan Weaver wrote:
>>>> Points taken, and I agree, except in my experience the current names
>>>> are not Pretty Good but rather Pretty Weird; the primary issues being
>>>> column family and super column.
>>>>
>>>> If we go by the shorter-is-better principle, we might get:
>>>>
>>>> Cluster
>>>> Schema
>>>> Row set
>>>> Row w/key
>>>> Field set
>>>> Field
>>>>
>>>> "You take the user's key, and use that to insert into the Row Set
>>>> 'user_associations' at Field Set 'user_timeline,' a field named with a
>>>> time-based UUID representing now, and with a value of the new tweet's
>>>> key."
>>>>
>>>> But let me study for a while and come up with a more researched proposal.
>>>>
>>>> Evan
>>>>
>>>> On Wed, Aug 12, 2009 at 9:21 PM, Jonathan Ellis wrote:
>>>>> On Wed, Aug 12, 2009 at 7:52 PM, Michael Koziarski wrote:
>>>>>> However I think it's worth considering this from a strategic
>>>>>> perspective, looking at how we want the project to grow and change,
>>>>>> rather than just as it is right now. The key to successful adoption
>>>>>> is having a successful elevator pitch, you can start using a database
>>>>>> without understanding relational-algebra because 'table' and 'column'
>>>>>> are such simple ways to reason about the tool. As it stands
>>>>>> cassandra takes a whiteboard and 15 minutes, before people get what
>>>>>> you're talking about.
>>>>>
>>>>> If you want to explain it as "sort of like a relational db" then
>>>>>
>>>>> table -> CF
>>>>> column -> column
>>>>> key -> key
>>>>> row -> row
>>>>>
>>>>> That's the simple case, then all you have is "supercolumns can contain
>>>>> a list of simple columns."
>>>>>
>>>>> That really doesn't seem so hard to me. I have explained this to
>>>>> *managers*.
>>>>>
>>>>>> Assuming the project gets anything like the adoption it deserves, the
>>>>>> users we have today will be a *tiny minority* of the users we have in
>>>>>> the future. So imposing costs on the current userbase which will give
>>>>>> huge benefits to future users, should be something we're willing to
>>>>>> do. In fact it's something that has been done repeatedly over the
>>>>>> last few weeks.
>>>>>
>>>>> I agree. But as I said before I just don't see this as being an
>>>>> improvement.
>>>>>
>>>>>> Given those changes went in without debate, I'm not sure what the
>>>>>> reluctance is for making changes to the nomenclature for the project.
>>>>>
>>>>> As above.
>>>>>
>>>>>> Speaking as someone who's only been doing this a month, the naming is
>>>>>> *still* confusing, and when I talk with people who wonder what
>>>>>> cassandra is all about I get blank looks when telling them what things
>>>>>> are called. If you step back and want to tell someone how you'd
>>>>>> insert a tweet into someone's timeline using evan's weblog post:
>>>>>>
>>>>>> "You just take the user's key, and use that to insert into the
>>>>>> SuperColumnFamily 'UserAssociations' at SubColumn 'user_timeline', a
>>>>>> ColumnName of a time based uuid representing now, and a value of the
>>>>>> new tweet's key"
>>>>>>
>>>>>> Column is in the name of 3 of the 5 concepts expressed, and in each
>>>>>> case it's different.
>>>>>
>>>>> When you're inserting something nested 3 levels deep a certain amount
>>>>> of verbosity is unavoidable. With Evan's nomenclature,
>>>>>
>>>>> "You take the user's record ID, and use that to insert into the Record
>>>>> Collection 'user associations' at Attribute Collection
>>>>> 'user_timeline,' an Attribute named with a time based uuid
>>>>> representing now, and with a value of the new tweet's key."
>>>>>
>>>>> I think that is a negative improvement. Yay, now we are talking about
>>>>> Attribute Collections and Attributes instead of SuperColumns and
>>>>> Columns. The same objections ("one object's name contains the
>>>>> other's!") apply, plus the new one of sounding so generic that it could
>>>>> apply to practically any system.
>>>>>
>>>>> -Jonathan
>>>>>
>>>>
>>>> --
>>>> Evan Weaver
>>>>
>>
>> --
>> Evan Weaver
>>

--
Evan Weaver
