[ 
https://issues.apache.org/jira/browse/HBASE-23602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17001165#comment-17001165
 ] 

Geoffrey Jacoby commented on HBASE-23602:
-----------------------------------------

Some potential use cases where this is required for correctness:
1. A change stream producer that shows both the new and old versions of a row 
after a change (DynamoDB has something similar to this).
2. Guaranteeing that all intermediate values are still present on disk when a 
backup tool generates a daily HBase snapshot.
3. Raw consistency verification between two replicated tables that are both 
taking writes.

> TTL Before Which No Data is Purged
> ----------------------------------
>
>                 Key: HBASE-23602
>                 URL: https://issues.apache.org/jira/browse/HBASE-23602
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Geoffrey Jacoby
>            Priority: Major
>
> HBase currently offers operators a choice. They can set 
> KEEP_DELETED_CELLS=true and VERSIONS to the max value, with no TTL, and they 
> will always have a complete history of all changes (at the cost of high 
> storage use and slower reads). Or they can set KEEP_DELETED_CELLS=false and 
> VERSIONS/TTL to some reasonable values, but that means that major 
> compactions can destroy the ability to do a consistent snapshot read of any 
> prior time. (This limits the usefulness and correctness of, for example, 
> Phoenix's SCN lookback feature.) 
> I propose a new TTL property that sets a minimum age an expired or deleted 
> Cell must reach before it can be purged. (I see that HBASE-10118 already 
> does something similar for the delete markers themselves.) 
> This would allow operators to have a consistent history for some finite 
> amount of recent time while still purging out the "long tail" of obsolete / 
> deleted versions. 
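
To make the proposed rule concrete, here is a minimal sketch of the purge decision in plain Java. Everything in it is hypothetical: the class name, the method, and the minAgeBeforePurgeMs parameter are illustrative and are not existing HBase APIs or property names. It also assumes that "age" is measured from the Cell's own timestamp, which the proposal does not spell out.

```java
// Hypothetical sketch of the proposed rule; none of these names exist in HBase.
public class MinAgePurgeSketch {

    /**
     * Decides whether a compaction may drop a cell.
     *
     * @param cellTimestampMs         timestamp of the cell, in milliseconds
     * @param nowMs                   current time, in milliseconds
     * @param eligibleByExistingRules true if today's rules (TTL, VERSIONS,
     *                                delete markers) would already drop the cell
     * @param minAgeBeforePurgeMs     proposed minimum age before any purge
     *                                (assumed name, assumed to apply to cell age)
     */
    static boolean canPurge(long cellTimestampMs,
                            long nowMs,
                            boolean eligibleByExistingRules,
                            long minAgeBeforePurgeMs) {
        // The new property never causes a purge on its own; it only delays one.
        if (!eligibleByExistingRules) {
            return false;
        }
        long ageMs = nowMs - cellTimestampMs;
        return ageMs >= minAgeBeforePurgeMs;
    }

    public static void main(String[] args) {
        long now = 10_000_000L;
        long oneHour = 3_600_000L;

        // Deleted 30 minutes ago: still inside the protected window.
        System.out.println(canPurge(now - 1_800_000L, now, true, oneHour));  // false
        // Deleted 2 hours ago: old enough to be purged.
        System.out.println(canPurge(now - 7_200_000L, now, true, oneHour));  // true
        // Old but still live under the existing rules: kept regardless.
        System.out.println(canPurge(now - 7_200_000L, now, false, oneHour)); // false
    }
}
```

Under these assumptions the new setting acts purely as a floor: it can postpone a purge that the existing rules would allow, but it never removes data that those rules would keep.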



