Re: Compaction Filter in Cassandra
FYI, this is the JIRA: https://issues.apache.org/jira/browse/CASSANDRA-11348. We can move the discussion to the JIRA if you want.

On Thu, Mar 17, 2016 at 11:46 AM, Dikang Gu wrote:

> Hi Eric,
>
> Thanks for sharing the information!
>
> We also mainly want to use it for trimming data, either by the time or the
> number of columns in a row. We haven't started the work yet, do you mind to
> share some patches? We'd love to try it and test it in our environment.
>
> Thanks.
>
> On Tue, Mar 15, 2016 at 9:36 PM, Eric Stevens wrote:
>
>> We have been working on filtering compaction for a month or so (though we
>> call it deleting compaction, its implementation is as a filtering
>> compaction strategy). The feature is nearing completion, and we have used
>> it successfully in a limited production capacity against DSE 4.8 series.
>>
>> Our use case is that our records are written anywhere between a month, up
>> to several years before they are scheduled for deletion. Tombstones are
>> too expensive, as we have tables with hundreds of billions of rows. In
>> addition, traditional TTLs don't work for us because our customers are
>> permitted to change their retention policy such that already-written
>> records should not be deleted if they increase their retention after the
>> record was written (or vice versa).
>>
>> We can clean up data more cheaply and more quickly with filtered
>> compaction than with tombstones and traditional compaction. Our
>> implementation is a wrapper compaction strategy for another underlying
>> strategy, so that you can have the characteristics of whichever strategy
>> makes sense in terms of managing your SSTables, while interceding and
>> removing records during compaction (including cleaning up secondary
>> indexes) that otherwise would have survived into the new SSTable.
>>
>> We are hoping to contribute it back to the community, so if you'd be
>> interested in helping test it out, I'd love to hear from you.
>>
>> On Sat, Mar 12, 2016 at 5:12 AM Marcus Eriksson wrote:
>>
>>> We don't have anything like that, do you have a specific use case in
>>> mind?
>>>
>>> Could you create a JIRA ticket and we can discuss there?
>>>
>>> /Marcus
>>>
>>> On Sat, Mar 12, 2016 at 7:05 AM, Dikang Gu wrote:
>>>
>>>> Hello there,
>>>>
>>>> RocksDB has the feature called "Compaction Filter" to allow application
>>>> to modify/delete a key-value during the background compaction.
>>>> https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226
>>>>
>>>> I'm wondering is there a plan/value to add this into C* as well? Or is
>>>> there already a similar thing in C*?
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> Dikang
>
> --
> Dikang

--
Dikang
Re: Compaction Filter in Cassandra
I would definitely be interested in this.

Clint
Re: Compaction Filter in Cassandra
Hi Eric,

Thanks for sharing the information!

We also mainly want to use it for trimming data, either by time or by the number of columns in a row. We haven't started the work yet; would you mind sharing some patches? We'd love to try it and test it in our environment.

Thanks.

--
Dikang
Re: Compaction Filter in Cassandra
We have been working on filtering compaction for a month or so (we call it deleting compaction, but it is implemented as a filtering compaction strategy). The feature is nearing completion, and we have used it successfully in a limited production capacity against the DSE 4.8 series.

Our use case is that our records are written anywhere from a month to several years before they are scheduled for deletion. Tombstones are too expensive, as we have tables with hundreds of billions of rows. In addition, traditional TTLs don't work for us, because our customers are permitted to change their retention policy: already-written records should not be deleted on the original schedule if the customer increases their retention after the record was written (or vice versa).

We can clean up data more cheaply and more quickly with filtered compaction than with tombstones and traditional compaction. Our implementation is a wrapper compaction strategy around another underlying strategy, so you keep the characteristics of whichever strategy makes sense for managing your SSTables, while interceding during compaction to remove records (including cleaning up secondary indexes) that otherwise would have survived into the new SSTable.

We are hoping to contribute it back to the community, so if you'd be interested in helping test it out, I'd love to hear from you.
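For readers following the thread, the wrapper idea described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation discussed in the message: `FilteringCompaction`, `Record`, `choose_tables`, `is_expired`, and the `retention_days` lookup are all hypothetical names, tables are plain in-memory lists, and secondary-index cleanup and duplicate keys across tables are ignored. The point it shows is that the retention policy is consulted at compaction time, so a policy change applies to records that were already written, which a TTL fixed at write time cannot do.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Record:
    customer: str
    key: str
    written_day: int  # day number on which the record was written


Table = List[Record]


class FilteringCompaction:
    """Hypothetical wrapper: the underlying strategy picks the tables to
    compact, and this wrapper drops expired records while merging them."""

    def __init__(self,
                 choose_tables: Callable[[List[Table]], List[Table]],
                 is_expired: Callable[[Record], bool]) -> None:
        self.choose_tables = choose_tables  # stands in for the underlying strategy
        self.is_expired = is_expired        # consulted at compaction time

    def compact(self, tables: List[Table]) -> Table:
        chosen = self.choose_tables(tables)
        survivors = [r for table in chosen for r in table
                     if not self.is_expired(r)]
        return sorted(survivors, key=lambda r: r.key)


# Assumed, mutable per-customer retention policy (in days).
retention_days: Dict[str, int] = {"acme": 30}
today = 100


def is_expired(record: Record) -> bool:
    # Looks up the *current* policy, not the one in force at write time.
    return today - record.written_day > retention_days[record.customer]


strategy = FilteringCompaction(choose_tables=lambda tables: tables,
                               is_expired=is_expired)
tables = [[Record("acme", "k1", 50)], [Record("acme", "k2", 90)]]

before = [r.key for r in strategy.compact(tables)]  # k1 is 50 days old, dropped
retention_days["acme"] = 365                        # customer extends retention
after = [r.key for r in strategy.compact(tables)]   # k1 now survives
print(before, after)
```

Keeping table selection behind `choose_tables` mirrors the stated design goal: the filtering step does not need to know which underlying strategy decided what to compact.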
Re: Compaction Filter in Cassandra
We don't have anything like that. Do you have a specific use case in mind?

Could you create a JIRA ticket so we can discuss it there?

/Marcus
Compaction Filter in Cassandra
Hello there,

RocksDB has a feature called "Compaction Filter" that allows an application to modify or delete a key-value pair during background compaction.
https://github.com/facebook/rocksdb/blob/v4.1/include/rocksdb/options.h#L201-L226

I'm wondering whether there is a plan, or any value, to add this to C* as well? Or is there already something similar in C*?

Thanks

--
Dikang
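To make the question concrete, here is a minimal sketch of what a compaction filter does in general terms. It is not RocksDB's actual interface or anything in Cassandra: the `compact` function, the `CompactionFilter` callback contract (return the value to keep, or `None` to drop the entry), and the example filter are assumptions made for illustration. Sorted runs are merged so that the newest value for each key wins, and the callback then gets a chance to rewrite or discard each surviving entry before it reaches the output.

```python
from typing import Callable, Dict, List, Optional, Tuple

Entry = Tuple[str, str]

# Hypothetical filter contract: return the value to keep (possibly rewritten),
# or None to drop the entry from the compacted output.
CompactionFilter = Callable[[str, str], Optional[str]]


def compact(runs: List[List[Entry]],
            compaction_filter: CompactionFilter) -> List[Entry]:
    """Merge runs (ordered newest first) and filter each surviving entry."""
    latest: Dict[str, str] = {}
    for run in reversed(runs):  # oldest first, so newer values overwrite
        for key, value in run:
            latest[key] = value

    compacted: List[Entry] = []
    for key in sorted(latest):
        kept = compaction_filter(key, latest[key])
        if kept is not None:
            compacted.append((key, kept))
    return compacted


def drop_temp_and_trim(key: str, value: str) -> Optional[str]:
    """Example filter: delete temporary keys, strip whitespace from the rest."""
    if key.startswith("tmp:"):
        return None
    return value.strip()


runs = [
    [("a", " 2 "), ("tmp:x", "9")],  # newest run
    [("a", "1"), ("b", "5")],        # older run
]
result = compact(runs, drop_temp_and_trim)
print(result)  # [('a', '2'), ('b', '5')]
```

Because the filter runs inside the merge that compaction performs anyway, dropping an entry costs no extra write, which is the appeal compared with issuing explicit deletes.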