> > - ... ExpiringColumn not create any tombstones? Imo this could be safely
> > done if the columns TTL is >= gcgrace.
>
> Yes, if the TTL >= gcgrace this would be safe and I'm pretty sure we
> use to have a ticket for that (can't find it back with a quick search
> but JIRA search suck and I didn't bother long). But basically we
> decided to not do it for now for 2 reasons:...
>

The only ticket I found that was anything similar is CASSANDRA-4565. I have my doubts that you meant that one :-)

I don't know what your approach was back then, but maybe it could be solved quite easily: when creating tombstones for ExpiringColumns, we could use ExpiringColumn.timestamp to set DeletedColumn.localDeletionTime. So instead of using the deletion time of the ExpiringColumn, we use its creation time. In the ExpiringColumn class this would look like this:

    public static Column create(ByteBuffer name, ByteBuffer value, long timestamp, int timeToLive,
                                int localExpirationTime, int expireBefore, IColumnSerializer.Flag flag)
    {
        if (localExpirationTime >= expireBefore || flag == IColumnSerializer.Flag.PRESERVE_SIZE)
            return new ExpiringColumn(name, value, timestamp, timeToLive, localExpirationTime);

        // the column is now expired, we can safely return a simple tombstone
        // new: use the creation timestamp of the ExpiringColumn as localDeletionTime
        // (localDeletionTime is in seconds; timestamp / 1000 assumes a millisecond timestamp)
        return new DeletedColumn(name, (int) (timestamp / 1000), timestamp);
        // old code:
        // return new DeletedColumn(name, localExpirationTime, timestamp);
    }

Imo this makes the tombstones created for expired columns live only as long as they need to. If you specify a TTL > 10 days (assuming the default gcgrace of 10 days), the created DeletedColumn would have a localDeletionTime that is more than 10 days in the past, which makes it eligible for gc right away. With ttl = 5 days the tombstone stays for another 5 days, enough for either the ExpiringColumn or the tombstone to be repaired.

> > - ... ExpiringColumn not add local timestamp to digest?
>
> As I said in a previous thread, I don't see what the problem is here.
> The timestamp is not local to the node, it is assigned once and for
> all by the coordinator at insert time. I can agree that it's not
> really useful per se to the digest, but I don't think it matters in
> any case.
>

Oh sorry, you're right, I mixed something up there. It's DeletedColumn that has the local timestamp (as its value). It takes a localDeletionTime (which is supplied by RowMutation.delete) and uses that as the value of the DeletedColumn. This value is used by Column to update the digest.

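Coming back to the tombstone lifetime for a moment: to make the arithmetic concrete, here is a minimal standalone sketch. It is not Cassandra code. The class and method names are made up for illustration, all times are in seconds, gcgrace is assumed to be the default 10 days, and "purgeable" is modelled simply as "localDeletionTime + gcgrace has passed".

```java
import java.util.concurrent.TimeUnit;

// Hypothetical model of tombstone retention; none of these names exist in Cassandra.
final class TombstoneRetentionSketch {
    static final long DAY = TimeUnit.DAYS.toSeconds(1);
    static final long GC_GRACE = 10 * DAY; // assumed default gcgrace

    // Seconds a tombstone must still be kept, measured from the moment the column expires.
    // Assumption: it becomes purgeable once localDeletionTime + gcGrace lies in the past.
    static long retentionAfterExpiry(long localDeletionTime, long expiryTime, long gcGrace) {
        return Math.max(0, localDeletionTime + gcGrace - expiryTime);
    }

    // Current behaviour: localDeletionTime is the expiration time of the column.
    static long currentBehaviour(long createdAt, long ttl) {
        long expiry = createdAt + ttl;
        return retentionAfterExpiry(expiry, expiry, GC_GRACE);
    }

    // Proposed behaviour: localDeletionTime is the creation time of the column.
    static long proposedBehaviour(long createdAt, long ttl) {
        long expiry = createdAt + ttl;
        return retentionAfterExpiry(createdAt, expiry, GC_GRACE);
    }

    public static void main(String[] args) {
        long createdAt = 1_000_000L; // arbitrary creation time in seconds
        for (long ttlDays : new long[] {5, 10, 15}) {
            long ttl = ttlDays * DAY;
            System.out.printf("ttl=%dd current=%dd proposed=%dd%n",
                    ttlDays,
                    currentBehaviour(createdAt, ttl) / DAY,
                    proposedBehaviour(createdAt, ttl) / DAY);
        }
    }
}
```

Under these assumptions the tombstone is always kept for 10 days after expiry today, while with the proposed change it is kept for 5 days at ttl = 5 days and not at all once the TTL reaches gcgrace.
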
Sorry for not letting this go, but I think there is some low-hanging fruit here.

cheers,
Christian