[ https://issues.apache.org/jira/browse/HBASE-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13501743#comment-13501743 ]
Kannan Muthukkaruppan commented on HBASE-2376:
----------------------------------------------

Lars wrote: <<<Flashback queries only makes sense with TTL>>>.

This is not true. A simple CF with VERSIONS=1 and no TTL (i.e., a TTL of infinity) can also benefit from the ability to run flashback queries. Flashback is simply the ability to query the DB as of a previous point in time. Why should we overload that functionality with versions, TTL, etc.?

I think it is useful to think of flashback as completely independent of other settings like TTL, MAXVERSIONS, MINVERSIONS, etc. Those should be picked at schema design time based on the application's requirements. For example, you may have many tables in your system with different TTL and VERSIONS requirements. You may even have different CFs within a table with differing TTL and VERSIONS requirements. But on top of all those, suppose you want to be able to query the entire DB, across all tables, as of a previous point in time. From a user's point of view, the only setting they need to worry about is the "time period" (back in time) up to which flashback queries are supported.

For example, you might have one CF with VERSIONS=1 where you keep hourly rollup data that you want to retain for 1 month (TTL), and another CF, also with VERSIONS=1, where you keep daily rollup data that you want to retain for 3 years. Separately, you want to be able to do flashback queries up to, say, 7 days back. This "7 days" should be a completely different setting, and there seems to be no reason to confuse it with TTL and VERSIONS.

Now, API-wise, we need the ability to say that we are doing a flashback query, i.e., "Scan @ T" instead of a regular "Scan".
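To make the intended semantics concrete, here is a minimal toy model in Python. It is only a sketch and not the HBase API: `ToyStore`, `scan_at`, `scan_time_range`, and `flashback_window` are hypothetical names invented for illustration. It assumes a single CF with VERSIONS=1, treats the flashback window as a knob separate from TTL, and models a regular time-range scan the way this comment describes it (later deletes and overwrites still take effect).

```python
from dataclasses import dataclass, field

DAY = 24 * 3600  # seconds


@dataclass
class ToyStore:
    """Toy single-CF store with VERSIONS=1 semantics (hypothetical, not HBase)."""
    ttl: float               # schema setting: how long cells are retained
    flashback_window: float  # separate knob: how far back "Scan @ T" may reach
    log: list = field(default_factory=list)  # (ts, key, value); value None = delete

    def put(self, ts, key, value):
        self.log.append((ts, key, value))

    def delete(self, ts, key):
        self.log.append((ts, key, None))

    def _replay(self, upto):
        """State as of `upto`: apply mutations with ts <= upto, then apply TTL."""
        state = {}
        for ts, key, value in sorted(self.log, key=lambda m: m[0]):
            if ts > upto:
                break
            if value is None:
                state.pop(key, None)      # a delete removes the key
            else:
                state[key] = (ts, value)  # VERSIONS=1: newest put wins
        return {k: c for k, c in state.items() if upto - c[0] <= self.ttl}

    def scan_at(self, t, now):
        """Flashback scan ("Scan @ T"): the data exactly as it stood at time t."""
        if now - t > self.flashback_window:
            raise ValueError("t is older than the flashback window")
        return {k: v for k, (ts, v) in self._replay(t).items()}

    def scan_time_range(self, t, now):
        """Regular scan of the current data, keeping only cells with ts <= t."""
        return {k: v for k, (ts, v) in self._replay(now).items() if ts <= t}


store = ToyStore(ttl=30 * DAY, flashback_window=7 * DAY)
store.put(ts=100, key="row1", value="a")
store.delete(ts=200, key="row1")

as_of_150 = store.scan_at(150, now=300)           # {'row1': 'a'}
ranged_150 = store.scan_time_range(150, now=300)  # {}
```

In this sketch, changing `ttl` does not affect how far back `scan_at` may reach, and changing `flashback_window` does not affect what a regular scan retains, which is the separation argued for here.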
In Oracle DB too, for instance, flashback queries have this special syntax:

    SELECT * FROM employee AS OF TIMESTAMP <TS> WHERE name = 'JOHN';

Regarding <<< So the snapshot scanner is special in that only through this specific scanner you can look further back than the TTL.>>>: I think that is by design.

Note: Scan @ T (a flashback query) is different from doing a Scan with setTimeRange(0, T). A delete of a key done at T+1 is immaterial for a Scan @ T query, whereas for a Scan with setTimeRange(0, T) you will still see the effect of the delete done at T+1.

----

In summary, we should not confuse our users by forcing them to change their schema design (i.e., their choice of VERSIONS, TTL, etc.) to support flashback queries. Flashback support should be configured using a simple extra knob that can be set at the system, table, or CF level. We should NOT overload that knob with TTL and VERSIONS.

----

> Add special SnapshotScanner which presents view of all data at some time in the past
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-2376
>                 URL: https://issues.apache.org/jira/browse/HBASE-2376
>             Project: HBase
>          Issue Type: New Feature
>          Components: Client, regionserver
>    Affects Versions: 0.20.3
>            Reporter: Jonathan Gray
>            Assignee: Pritam Damania
>
> In order to support a particular kind of database "snapshot" feature which
> doesn't require copying data, we came up with the idea for a special
> SnapshotScanner that would present a view of your data at some point in the
> past. The primary use case for this would be to be able to recover
> particular data/rows (but not all data, like a global rollback) should they
> have somehow been messed up (application fault, application bug, user error,
> etc.).