You're right: if the TTL will be set dynamically, we'd need to make room for it in each Column. Otherwise, if it's set globally, we could save that space.
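
As a rough illustration of the trade-off, here is a minimal sketch of the two options. It is not the actual db.Column code: the class names, the constructor arguments, and the `nowMillis` parameter are hypothetical, and it assumes column timestamps are milliseconds since the epoch.

```java
// Hypothetical sketch only: these classes are NOT Cassandra's db.Column.
// Assumption: column timestamps are milliseconds since the epoch.

/** Variant 1: the TTL is stored in every column (4 extra bytes as an int). */
final class PerColumnTtlColumn {
    private final long timestampMillis;
    private final boolean deleteFlag;
    private final int ttlSeconds; // 0 means "never expires"

    PerColumnTtlColumn(long timestampMillis, boolean deleteFlag, int ttlSeconds) {
        this.timestampMillis = timestampMillis;
        this.deleteFlag = deleteFlag;
        this.ttlSeconds = ttlSeconds;
    }

    /** True if explicitly deleted, or if this column's own TTL has elapsed. */
    boolean isMarkedForDelete(long nowMillis) {
        if (deleteFlag) {
            return true;
        }
        return ttlSeconds > 0 && nowMillis >= timestampMillis + ttlSeconds * 1000L;
    }
}

/** Variant 2: one TTL for all columns, so nothing extra is stored per column. */
final class GlobalTtlColumn {
    private final long timestampMillis;
    private final boolean deleteFlag;

    GlobalTtlColumn(long timestampMillis, boolean deleteFlag) {
        this.timestampMillis = timestampMillis;
        this.deleteFlag = deleteFlag;
    }

    /** Same check, but the TTL comes from configuration instead of the column. */
    boolean isMarkedForDelete(long nowMillis, int globalTtlSeconds) {
        if (deleteFlag) {
            return true;
        }
        return globalTtlSeconds > 0 && nowMillis >= timestampMillis + globalTtlSeconds * 1000L;
    }
}
```

With a timestamp of 1000 ms and a 10-second TTL, either variant treats the column as live up to 10999 ms and as deleted from 11000 ms on. The only difference is where the TTL lives: in each column, or in one shared setting.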
-Kelvin

On Wed, Jan 13, 2010 at 1:16 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
> Are you thinking about storing the expiration time explicitly? Or,
> would it be reasonable to calculate it dynamically?
>
> -Kelvin
>
> On Wed, Jan 13, 2010 at 1:01 PM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> I think that is more or less what Sylvain is proposing. The main
>> downside is adding the extra 8 bytes for a long (or 4 for an int,
>> which should actually be plenty of resolution for this use case) to
>> each Column object.
>>
>> On Wed, Jan 13, 2010 at 4:57 PM, Kelvin Kakugawa <kakug...@gmail.com> wrote:
>>> An alternative implementation that may be worth exploring would be to
>>> modify IColumn's isMarkedForDelete() method to check TTL.
>>>
>>> It probably wouldn't be as performant as straight dropping SSTables.
>>> You'd probably also need to periodically compact old tables to remove
>>> expired rows. However, on the surface, it appears to be a more
>>> seamless and fine-grained approach to this problem.
>>>
>>> -Kelvin
>>>
>>> A little more background:
>>> db.IColumn is the shared interface that db.Column and db.SuperColumn
>>> implement. db.Column's isMarkedForDelete() method only checks if a
>>> flag has been set, right now. So, it would be relatively
>>> straightforward to slip some logic into that method to check if its
>>> timestamp has expired beyond some TTL.
>>>
>>> However, I suspect that there may be other methods that may need to be
>>> slightly modified, as well. And, the compaction code would have to be
>>> inspected to make sure that old tables are periodically compacted to
>>> remove expired rows.
>>>
>>> On Wed, Jan 13, 2010 at 12:30 PM, Mark Robson <mar...@gmail.com> wrote:
>>>> I also agree: Some mechanism to expire rolling data would be really good if
>>>> we can incorporate it. Using the existing client interface, deleting old
>>>> data is very cumbersome.
>>>>
>>>> We want to store lots of audit data in Cassandra, this will need to be
>>>> expired eventually.
>>>>
>>>> Nodes should be able to do expiry locally without needing to talk to other
>>>> nodes in the cluster. As we have a timestamp on everything anyway, can we
>>>> not use that somehow?
>>>>
>>>> If we only ever append data rather than update it (or update it very
>>>> rarely), can we somehow store timestamp ranges in each sstable file and
>>>> then have the server know when it's time to expire one?
>>>>
>>>> I'm guessing here from my limited understanding of how Cassandra works.
>>>>
>>>> Mark