Even if you had compaction enforcing a limit on the number of columns in a row, there would still be issues with concurrent writes and with read-repair. i.e. node A says these are the first n columns but node B says something else; you only know who is correct at read time.
Have you considered using a TTL on the columns? Depending on the use case you could also consider having writes periodically or randomly trim the data size, or trimming on reads. It will also make sense to partition the time-series data into different rows. And Viva la Standard Column Families!

Hope that helps.

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 25/12/2011, at 7:48 PM, Praveen Baratam wrote:

> Hello Everybody,
>
> Happy Christmas.
>
> I know that this topic has come up quite a few times on the Dev and User lists
> but did not culminate in a solution.
>
> http://www.mail-archive.com/[email protected]/msg15367.html
>
> The above discussion on the User list talks about AbstractCompactionStrategy,
> but I could not find any relevant documentation as it is a fairly new feature
> in Cassandra.
>
> Let me state the necessity and use case again.
>
> I need a ColumnFamily (CF) wide or SuperColumn (SC) wide option to
> approximately limit the number of columns to "n". "n" can vary a lot, and the
> intention is to throw away stale data, not to maintain any hard limit on the
> CF or SC. It is very useful for storing time-series data where stale data is
> not needed. The goal is to achieve this with minimum overhead, and since
> compaction happens all the time it would be clever to implement it as part of
> compaction.
>
> Thanks in advance.
>
> Praveen
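P.S. The row-partitioning idea above can be sketched in a few lines. This is only an illustrative sketch, not code from the thread: the names (`row_key`, `sensor_id`) and the one-hour bucket size are made up for the example. The point is that each time bucket becomes its own row key, so stale data can be dropped simply by no longer reading (or by deleting) whole old rows, and each row stays a bounded size.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # one row per hour; tune to your write rate


def row_key(sensor_id, ts):
    """Compose a row key from the entity id and the hour bucket containing ts."""
    bucket = int(ts.timestamp()) // BUCKET_SECONDS * BUCKET_SECONDS
    return f"{sensor_id}:{bucket}"


# Writes at 10:15 and 10:45 UTC land in the same row; 11:05 starts a new row.
t1 = datetime(2011, 12, 25, 10, 15, tzinfo=timezone.utc)
t2 = datetime(2011, 12, 25, 10, 45, tzinfo=timezone.utc)
t3 = datetime(2011, 12, 25, 11, 5, tzinfo=timezone.utc)
```

Combined with a column TTL, this keeps reads against recent rows fast and lets expired data fall out during compaction without any custom compaction strategy.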
