Hi,

> does this seem like a generally useful feature?

I do think this could be a useful feature, if only because I don't think
there is any satisfactory or efficient way to do this client side.

> if so, would it be hard to implement (maybe it could be done at compaction
> time like the TTL feature)?

Off the top of my head (i.e., I haven't really thought this through, but
I'll still give my opinion), I see the following difficulties:

1) You can only do this limiting during major compaction, or in the same
   cases as CASSANDRA-1074 for minor compactions, since you need to make
   sure the x columns you are keeping are not deleted ones. Or you'll want
   to disable deletes altogether on the cf with this 'limit' option (I feel
   like this last option would really simplify things).

2) Even if the removal of the columns exceeding the limit is eventual (and
   it will be), you'll want queries to only ever return columns inside the
   limit (otherwise the feature would be too unpredictable). But I think
   this will be quite challenging. That is, slice queries from the start of
   the row are easy. Everything else is harder (at least if you want to
   make it efficient).

That was my 2 cents. Anyway, you can always open a JIRA ticket.

--
Sylvain

On Fri, Jan 14, 2011 at 7:38 AM, mike dooley <doo...@apple.com> wrote:

> hi,
>
> the time-to-live feature in 0.7 is very nice and it made me want to ask
> about a somewhat similar feature.
>
> i have a stream of data consisting of entities and associated samples. so
> i create a row for each entity and the columns in each row contain the
> samples for that entity. when i get around to processing an entity i only
> care about the most recent N samples. so i read the most recent N columns
> and delete all the rest.
>
> what i would like would be a column family property that allows me to
> specify a maximum number of columns per row. then i could just keep
> writing and not have to do the deletes.
>
> in my case it would be fine if the limit is only 'eventually' applied (so
> that sometimes there might be extra columns).
>
> does this seem like a generally useful feature? if so, would it be hard to
> implement (maybe it could be done at compaction time like the TTL feature)?
>
> thanks,
> -mike
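
To make difficulty 1) from the reply concrete, here is a rough sketch of why
the limit can only be applied once all fragments of a row (including
tombstones) have been merged. This is a toy model, not Cassandra code: the
Column class, the function names, and the use of an integer column name as
the "recency" sort key are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Column:
    name: int              # hypothetical sort key; larger means more recent
    value: str
    write_ts: int          # write timestamp used to reconcile versions
    deleted: bool = False  # True marks a tombstone


def merge_fragments(fragments: Iterable[List[Column]]) -> Dict[int, Column]:
    """Reconcile row fragments: for each name, the highest write_ts wins."""
    merged: Dict[int, Column] = {}
    for fragment in fragments:
        for col in fragment:
            current = merged.get(col.name)
            if current is None or col.write_ts > current.write_ts:
                merged[col.name] = col
    return merged


def keep_most_recent(fragments: Iterable[List[Column]], limit: int) -> List[Column]:
    """Merge every fragment first, then keep the `limit` most recent live columns."""
    merged = merge_fragments(fragments)
    live = [c for c in merged.values() if not c.deleted]
    live.sort(key=lambda c: c.name, reverse=True)
    return live[:limit]


def naive_per_fragment(fragments: Iterable[List[Column]], limit: int) -> List[Column]:
    """Apply the limit to each fragment in isolation, then merge (the unsafe order)."""
    truncated = []
    for fragment in fragments:
        live = sorted((c for c in fragment if not c.deleted),
                      key=lambda c: c.name, reverse=True)
        tombstones = [c for c in fragment if c.deleted]
        truncated.append(live[:limit] + tombstones)
    return keep_most_recent(truncated, limit)


# One fragment holds five live samples; a later fragment deletes the two newest.
older = [Column(name=n, value=f"sample-{n}", write_ts=10) for n in range(1, 6)]
newer = [Column(name=5, value="", write_ts=20, deleted=True),
         Column(name=4, value="", write_ts=20, deleted=True)]

safe = [c.name for c in keep_most_recent([older, newer], limit=3)]
unsafe = [c.name for c in naive_per_fragment([older, newer], limit=3)]
```

With a limit of 3, merging first keeps columns 3, 2 and 1, while truncating
the older fragment on its own keeps 5, 4 and 3 and then loses two of them to
the tombstones, leaving a single live column. Disabling deletes on the cf, as
suggested in the reply, would remove this case entirely.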