Haebin Na created CASSANDRA-7115:
------------------------------------

             Summary: Partitioned Column Family (Table) based on Column Keys (Sorta TTLed Table)
                 Key: CASSANDRA-7115
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7115
             Project: Cassandra
          Issue Type: New Feature
          Components: Core
            Reporter: Haebin Na
            Priority: Minor

We need a better way to expire columns than TTLed columns. If you set a 6-month TTL on a column in a frequently updated wide row (or a frequently deleted one; yes, this is an anti-pattern), the expired column is unlikely to actually be removed, since the row would be highly fragmented across SSTables.

To solve this problem, I suggest partitioning a column family (table) using the column key (column1) as the partition key. The result is like a set of column families (tables) that share the same structure, each covering a certain range of columns. This means that a row is deterministically fragmented by column key.

If you use a timestamp-like column key, you could easily truncate a specific partition (a sub-table or CF covering a specific range) once it is older than a certain age, without worrying about zombie tombstones.

Having many column families is not optimal, yet even with a small set, such as biyearly or quarterly partitions, this would be far more efficient than TTLed columns.

What do you think?
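
To make the idea concrete, here is a rough in-memory sketch in Python. It is only an illustration of the bucketing and truncation logic, not a proposed implementation and not Cassandra code: the names PartitionedTable, quarter_bucket, and truncate_older_than are hypothetical, and quarterly buckets are assumed.

```python
from collections import defaultdict
from datetime import datetime


def quarter_bucket(ts: datetime) -> tuple[int, int]:
    """Map a timestamp column key to a (year, quarter) sub-table id."""
    return (ts.year, (ts.month - 1) // 3 + 1)


class PartitionedTable:
    """Toy model: one sub-table per quarter, all sharing the same structure."""

    def __init__(self):
        # bucket id -> row key -> {column key (timestamp): value}
        self.buckets = defaultdict(lambda: defaultdict(dict))

    def insert(self, row_key, ts, value):
        # The column key alone decides which sub-table holds the column.
        self.buckets[quarter_bucket(ts)][row_key][ts] = value

    def read_row(self, row_key):
        """Merge the row's fragments from every sub-table, ordered by column key."""
        merged = {}
        for bucket in self.buckets.values():
            merged.update(bucket.get(row_key, {}))
        return dict(sorted(merged.items()))

    def truncate_older_than(self, cutoff):
        """Drop every sub-table whose quarter is earlier than the cutoff's quarter."""
        limit = quarter_bucket(cutoff)
        expired = [b for b in self.buckets if b < limit]
        for b in expired:
            # Whole-bucket removal: no per-column tombstones are written.
            del self.buckets[b]
        return sorted(expired)


table = PartitionedTable()
table.insert("user1", datetime(2013, 11, 5), "a")
table.insert("user1", datetime(2014, 2, 1), "b")
table.insert("user1", datetime(2014, 4, 20), "c")
dropped = table.truncate_older_than(datetime(2014, 4, 1))
```

In this sketch, expiring old data means dropping whole buckets, so no per-column deletes are issued. The trade-off is that reading a full row has to merge fragments from every live bucket.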