Shaun, I agree with you, but marking them as deprecated is not good enough for me. I can't easily stop using supercolumns. I need an upgrade path.
On Tue, Feb 8, 2011 at 3:53 AM, Shaun Cutts <sh...@cuttshome.net> wrote:
>
> I'm a newbie here, but, with apologies for my presumptuousness, I think you
> should deprecate SuperColumns. They are already distracting you, and as the
> years go by the cost of supporting them as you add more and more
> functionality is only likely to get worse. It would be better to concentrate
> on making the "core" column families better (and I'm sure we can all think
> of lots of things we'd like).
>
> Just dropping SuperColumns would be bad for your reputation -- and for
> users like David who are currently using them. But if you mark them clearly
> as deprecated and explain why and what to do instead (perhaps putting a bit
> of effort into migration tools... or even a "virtual" layer supporting
> arbitrary hierarchical data), then you can drop them in a few years (when
> you get to 1.0, say), without people feeling betrayed.
>
> -- Shaun
>
> On Feb 6, 2011, at 3:48 AM, David Boxenhorn wrote:
>
> "My main point was to say that I think it is better to create tickets
> for what you want, rather than for something else completely different that
> would, as a by-product, give you what you want."
>
> Then let me say what I want: I want supercolumn families to have any
> feature that regular column families have.
>
> My data model is full of supercolumns. I used them, even though I knew I
> didn't *have to*, "because they were there", which implied to me that I was
> supposed to use them for some good reason. Now I suspect that they will
> gradually become less and less functional, as features are added to regular
> column families and not supported for supercolumn families.
>
>
> On Fri, Feb 4, 2011 at 10:58 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:
>
>> On Fri, Feb 4, 2011 at 12:35 AM, Mike Malone <m...@simplegeo.com> wrote:
>>
>>> On Thu, Feb 3, 2011 at 6:44 AM, Sylvain Lebresne
>>> <sylv...@datastax.com> wrote:
>>>
>>>> On Thu, Feb 3, 2011 at 3:00 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>>
>>>>> The advantage would be to enable secondary indexes on supercolumn
>>>>> families.
>>>>>
>>>>
>>>> Then I suggest opening a ticket for adding secondary indexes to
>>>> supercolumn families and voting on it. This will be 1 or 2 orders of
>>>> magnitude less work than getting rid of super columns internally, and
>>>> probably a much better solution anyway.
>>>>
>>>
>>> I realize that this is largely subjective, and on such matters code
>>> speaks louder than words, but I don't think I agree with you on the issue of
>>> which alternative is less work, or even which is a better solution.
>>>
>>
>> You are right, I probably put too much emphasis on that sentence. My main
>> point was to say that I think it is better to create tickets for what you
>> want, rather than for something else completely different that would, as a
>> by-product, give you what you want.
>> Then I suspect that *if* the only goal is to get secondary indexes on
>> super columns, then there is a good chance this would be less work than
>> getting rid of super columns. But to be fair, secondary indexes on super
>> columns may not make too much sense without #598, which itself would require
>> quite some work, so clearly I spoke a bit quickly.
>>
>>
>>> If the goal is to have a hierarchical model, limiting the depth to two
>>> seems arbitrary. Why not go all the way and allow an arbitrarily deep
>>> hierarchy?
>>>
>>> If a more sophisticated hierarchical model is deemed unnecessary, or
>>> impractical, allowing a depth of two seems inconsistent and
>>> unnecessary. It's pretty trivial to overlay a hierarchical model on top of
>>> the map-of-sorted-maps model that Cassandra implements. Ed Anuff has
>>> implemented a custom comparator that does the job [1]. Google's Megastore
>>> has a similar architecture and goes even further [2].
>>>
>>> It seems to me that super columns are a historical artifact from
>>> Cassandra's early life as Facebook's inbox storage system. They needed
>>> posting lists of messages, sharded by user. So that's what they built. In my
>>> dealings with the Cassandra code, super columns end up making a mess all
>>> over the place when algorithms need to be special cased and branch based on
>>> the column/supercolumn distinction.
>>>
>>> I won't even mention what it does to the thrift interface.
>>>
>>
>> Actually, I agree with you, more than you know. If I were to start coding
>> Cassandra now, I wouldn't include super columns (and I would probably not go
>> for a depth unlimited hierarchical model either). But it's there and I'm not
>> sure getting rid of them fully (meaning, including in thrift) is an option
>> (it would be a big compatibility breakage). And (even though I certainly
>> thought about this more than once :)) I'm slightly less enthusiastic about
>> keeping them in thrift but encoding them in regular column families
>> internally: it would still be a lot of work but we would still probably end
>> up with nasty tricks to stick to the thrift api.
>>
>> --
>> Sylvain
>>
>>
>>> Mike
>>>
>>> [1] http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
>>> [2] http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
>>>
>>
>
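
For anyone following the thread, the point quoted above about overlaying a
hierarchy on a map of sorted maps can be sketched roughly as below. This is
only an illustration in plain Python, not Cassandra code and not Ed Anuff's
comparator: the Row class and its insert/slice methods are hypothetical names
invented here, and the column names are assumed to be tuples of strings that
sort component by component. A "supercolumn" then becomes a prefix slice over
ordinary columns.

```python
from bisect import bisect_left, insort


class Row:
    """Hypothetical stand-in for one row: columns kept sorted by name.

    Column names are tuples such as ("inbox", "msg1", "subject"), so the
    hierarchy lives in the name and can be arbitrarily deep.
    """

    def __init__(self):
        self._names = []    # composite names, kept in sorted order
        self._values = {}   # composite name -> value

    def insert(self, name, value):
        # Only add the name to the sorted list the first time it is seen.
        if name not in self._values:
            insort(self._names, name)
        self._values[name] = value

    def slice(self, prefix):
        """Return (name, value) pairs whose name starts with prefix."""
        # Tuples compare component-wise, and a shorter tuple sorts before
        # any longer tuple it is a prefix of, so all matches are contiguous
        # starting at this position.
        start = bisect_left(self._names, prefix)
        out = []
        for name in self._names[start:]:
            if name[:len(prefix)] != prefix:
                break
            out.append((name, self._values[name]))
        return out


row = Row()
row.insert(("inbox", "msg1", "subject"), "hi")
row.insert(("inbox", "msg1", "body"), "hello")
row.insert(("inbox", "msg2", "subject"), "yo")
row.insert(("sent", "msg9", "subject"), "re")

# Reading one "supercolumn" is a slice on a two-component prefix.
msg1 = row.slice(("inbox", "msg1"))
```

Slicing on ("inbox",) instead returns every column under that branch, which
is the sense in which the depth limit of two stops being special.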