[jira] [Commented] (HBASE-4071) Data GC: Remove all versions > TTL EXCEPT the last written version

stack (JIRA) Tue, 16 Aug 2011 11:29:50 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085889#comment-13085889
 ]


stack commented on HBASE-4071:
------------------------------

bq. In both cases I think only at compaction time do we have enough information 
to remove expired cells (if minversions is >0).

Fair enough.

bq. So you think test coverage of the existing functionality is sufficient? 
That is very good to know.

IIRC, there are some decent tests for this stuff.  Also, the operation is so 
fundamental that if it broke, it'd bubble up as a broken test somewhere (the 
connection back down into this code might be an indirection on top of an 
indirection, but a test will break, I'd say).

bq. What's the general feeling? Should I aim for minimal intrusion or attempt 
to do a bit of refactoring to abstract these policies into an interface? Leaning 
towards the latter, but on the other hand the change would be more risky.

I'm for the latter.  We're trying to push out the next major revision of HBase.  
It'll get some scrutiny and testing, so if you've broken something it should 
show (or if not, our coverage needs improving).

Good stuff LarsH.

> Data GC: Remove all versions > TTL EXCEPT the last written version
> ------------------------------------------------------------------
>
>                 Key: HBASE-4071
>                 URL: https://issues.apache.org/jira/browse/HBASE-4071
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>         Attachments: MinVersions.diff
>
>
> We were chatting today about our backup cluster.  What we want is to be able 
> to restore the dataset from any point in time but only within a limited 
> timeframe -- say one week.  Thereafter, if the versions are older than one 
> week, rather than doing as we do with TTL, where we let go of all versions 
> older than TTL, we instead let go of all versions EXCEPT the last one 
> written.  So, it's like versions==1 once a cell's age > one week.  We want to 
> allow that if an error is caught within a week of its happening -- user 
> mistakenly removes a critical table -- then we'll be able to restore up to 
> the moment just before catastrophe hit; otherwise, we keep one version only.
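
The retention rule described above can be sketched roughly as follows. This is 
only an illustration, not the code in MinVersions.diff: the class 
MinVersionsSketch, the method retain and its parameters are all hypothetical 
names, and it assumes that unexpired versions count toward the minimum and that 
a version is expired once its age is strictly greater than the TTL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT HBase code: it only illustrates the retention rule
// "drop versions older than TTL, but always keep at least minVersions".
final class MinVersionsSketch {

    /**
     * @param timestampsNewestFirst write timestamps of one cell's versions, newest first
     * @param now                   current time, in the same unit as the timestamps
     * @param ttl                   time-to-live, in the same unit as the timestamps
     * @param minVersions           number of newest versions to keep even if expired
     * @return the timestamps of the versions that survive, newest first
     */
    static List<Long> retain(List<Long> timestampsNewestFirst,
                             long now, long ttl, int minVersions) {
        List<Long> kept = new ArrayList<>();
        for (long ts : timestampsNewestFirst) {
            // Assumption: a version is expired when its age is strictly above the TTL.
            boolean expired = now - ts > ttl;
            // An expired version survives only while fewer than minVersions are kept.
            if (!expired || kept.size() < minVersions) {
                kept.add(ts);
            }
        }
        return kept;
    }
}
```

With minVersions set to 1 and a TTL of one week, every version written in the 
last week is kept, and a cell whose versions are all older than a week retains 
only the last one written.  With minVersions set to 0 the sketch falls back to 
plain TTL behaviour.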

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
