[ https://issues.apache.org/jira/browse/HBASE-8753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13687299#comment-13687299 ]
Lars Hofhansl commented on HBASE-8753:
--------------------------------------

Do you want to delete only columns of a specific version, or columns older than a specific version? The latter is already possible with a family delete with a timestamp. In your use case, do you ever want to keep version X but target version X+1 for delete?

Currently we have:
# version delete: targets a specific version of a specific column for delete
# column delete: targets all versions (optionally older than a ts) of a column for delete
# family delete: targets all versions (optionally older than a ts) of all columns of a family for delete

Will this break backwards compatibility during rolling restarts (because of the new KV type)?

> Provide new delete flag which can delete all cells under a column-family which have a same designated timestamp
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8753
>                 URL: https://issues.apache.org/jira/browse/HBASE-8753
>             Project: HBase
>          Issue Type: New Feature
>          Components: Deletes
>            Reporter: Feng Honghua
>         Attachments: HBASE-8753-0.94-V0.patch
>
>
> In one of our production scenarios (Xiaomi message search), multiple cells are put in a batch using the same timestamp, with different column names, under a specific column-family.
> After some time these cells also need to be deleted in a batch, given a specific timestamp. But the column names are parsed tokens which can be arbitrary words, so such a batch delete is impossible without first retrieving all KVs from that CF, building the list of column names that have a KV with the given timestamp, and then issuing an individual deleteColumn for each column in that list.
> Though such a batch delete is possible, its performance is poor, and customers also find their code quite clumsy, since it must first retrieve and populate the column list and then issue a deleteColumn for each column in that list.
> This feature resolves the problem by introducing a new delete flag: DeleteFamilyVersion.
> 1). When you need to delete all KVs under a column-family with a given timestamp, just call Delete.deleteFamilyVersion(cfName, timestamp); only a DeleteFamilyVersion type KV is put to HBase (like DeleteFamily / DeleteColumn / Delete), without any read operation;
> 2). Like other delete types, DeleteFamilyVersion takes effect in get/scan/flush/compact operations: the ScanDeleteTracker now parses out DeleteFamilyVersion and uses it to prevent all KVs under the specific CF which have the same timestamp as the DeleteFamilyVersion KV from popping up as part of a get/scan result (also in flush/compact).
> Our customers find this feature efficient, clean and easy to use, since it does its work without knowing the exact list of column names that need to be deleted.
> This feature has been running smoothly for a couple of months in our production clusters.
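
For comparison, here is a minimal sketch of how the three existing delete types and the proposed DeleteFamilyVersion would mask cells within one row. It is a simplified in-memory model, not HBase code: the names DeleteMarkerModel, Cell, Marker, masks and visible are made up for illustration. It assumes that column and family deletes cover timestamps less than or equal to the marker's, and it ignores the ordering of puts and deletes, flushes and compactions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of delete-marker masking within one row; NOT HBase code. */
class DeleteMarkerModel {

    /** The three existing marker types plus the one proposed in HBASE-8753. */
    enum Type { DELETE_VERSION, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

    /** A stored cell: family, qualifier (column name) and timestamp. */
    static final class Cell {
        final String family;
        final String qualifier;
        final long ts;

        Cell(String family, String qualifier, long ts) {
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
        }

        @Override
        public String toString() {
            return family + ":" + qualifier + "@" + ts;
        }
    }

    /** A delete marker; qualifier is null for the two family-wide types. */
    static final class Marker {
        final Type type;
        final String family;
        final String qualifier;
        final long ts;

        Marker(Type type, String family, String qualifier, long ts) {
            this.type = type;
            this.family = family;
            this.qualifier = qualifier;
            this.ts = ts;
        }
    }

    /** Returns true if marker m hides cell c from reads. */
    static boolean masks(Marker m, Cell c) {
        if (!m.family.equals(c.family)) {
            return false;
        }
        switch (m.type) {
            case DELETE_VERSION:        // one version of one column
                return c.qualifier.equals(m.qualifier) && c.ts == m.ts;
            case DELETE_COLUMN:         // all versions of one column up to ts
                return c.qualifier.equals(m.qualifier) && c.ts <= m.ts;
            case DELETE_FAMILY:         // all columns of the family up to ts
                return c.ts <= m.ts;
            case DELETE_FAMILY_VERSION: // all columns of the family at exactly ts
                return c.ts == m.ts;
            default:
                throw new IllegalArgumentException("unknown type: " + m.type);
        }
    }

    /** Returns the cells that no marker masks, in their original order. */
    static List<Cell> visible(List<Cell> cells, List<Marker> markers) {
        List<Cell> result = new ArrayList<Cell>();
        for (Cell c : cells) {
            boolean hidden = false;
            for (Marker m : markers) {
                if (masks(m, c)) {
                    hidden = true;
                    break;
                }
            }
            if (!hidden) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two tokens written in one batch at ts=100, plus two unrelated cells.
        List<Cell> cells = Arrays.asList(
                new Cell("cf", "apple", 100L), new Cell("cf", "banana", 100L),
                new Cell("cf", "apple", 200L), new Cell("cf", "cherry", 50L));
        Marker m = new Marker(Type.DELETE_FAMILY_VERSION, "cf", null, 100L);
        // One marker removes the whole ts=100 batch without listing column names.
        System.out.println(visible(cells, Arrays.asList(m)));
        // prints [cf:apple@200, cf:cherry@50]
    }
}
```

In this model a family delete at ts=100 would also hide cherry@50, and a column delete needs the column name, which is why neither covers the batch-by-timestamp case described in the issue.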