[jira] [Commented] (HBASE-6055) Snapshots in HBase 0.96

Jonathan Hsieh (JIRA) Sun, 03 Jun 2012 12:05:25 -0700

    [ https://issues.apache.org/jira/browse/HBASE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13288238#comment-13288238 ]


Jonathan Hsieh commented on HBASE-6055:
---------------------------------------

Jesse,

Thanks for answering the questions.  A strong +1 for doing the simplest hbase 
timestamp-based approach first, and then looking into the more complicated 
version as an option afterwards.  Maybe start a sub issue with the 
point-in-time approach to move discussion there? (I still have questions there, 
might be better to ask there)

The main use case I care about is ability to quickly "snapshot" without 
downtime and quickly recover it (ideally with no downtime, but possibly with a 
short downtime window).  Although it is a "sloppy snapshot", conceptually it is 
pretty simple to define and I think the caveats are fairly well understood.  I 
don't expect something with stronger consistency guarantees than what hbase 
currently offers but do expect something better (cheaper/faster) than the 
current closest thing which is a CopyTable.  

I have a bunch of new questions - some just asking for precision and some for 
clarification.  It might be helpful to define terms in the beginning of the doc 
so it stays consistent? 

- Hm.. how do you restore a snapshot from reference files if it hasn't been 
scan/copied yet?  Require scan/copy "materialization" of the snapshot first?  
(which means slower restore, but would probably be simplest for a first 
cut)
- Snapshot restore needs to be "transactional" like snapshotting, right?
- what is "export"? is this taking a snapshot or the materialization or the 
snapshot restore or something else?
- If we restore snapshots to the same hbase instance, in the dir structure, you 
probably need .regioninfo files as well (they contain the region startkey/endkey 
info necessary to reconstitute META later).
- Is restoring to a separate instance in scope?  If so bulk loads can be 
expensive -- if regions don't line up there will be a bunch of splitting that 
happens.  Again, keeping the regioninfos and the snapshot's splits may be 
worthwhile.
- Where do the materialized versions of the snapshot reference files end up?  
in the snapshot dirs? elsewhere?  
-- This potentially gets a little trickier with markers as opposed to log rolls.
-- The HLog will have edits from regions not relevant to the table's regions.  
Not a huge problem but maybe an optimization would be that the materialization 
step does an "offline hlogsplit/flush" to keep just the data relevant to 
this table/region?
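
To make the last point concrete, here is a minimal sketch of the filtering 
idea.  It is not HBase code: LogEntry, filter_log_for_table and 
group_by_region are hypothetical names, and the entry layout is a simplified 
stand-in for what an HLog actually stores.  It only shows dropping edits that 
belong to other tables and then grouping the remainder per region, roughly as 
a log split would.

```python
from collections import namedtuple

# Hypothetical, simplified stand-in for one HLog entry (assumption: a real
# entry carries more metadata than table, region, sequence id and edit).
LogEntry = namedtuple("LogEntry", ["table", "region", "seq_id", "edit"])


def filter_log_for_table(entries, table, regions=None):
    """Keep only entries for `table`, optionally restricted to `regions`.

    Log order is preserved so edits can be replayed in sequence.
    """
    return [
        e for e in entries
        if e.table == table and (regions is None or e.region in regions)
    ]


def group_by_region(entries):
    """Group already-filtered entries per region, keeping their order."""
    grouped = {}
    for e in entries:
        grouped.setdefault(e.region, []).append(e)
    return grouped
```

Materialization would then only flush the grouped edits for the snapshotted 
table, rather than carrying the whole shared log along with the snapshot.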


                
> Snapshots in HBase 0.96
> -----------------------
>
>                 Key: HBASE-6055
>                 URL: https://issues.apache.org/jira/browse/HBASE-6055
>             Project: HBase
>          Issue Type: New Feature
>          Components: client, master, regionserver, zookeeper
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>             Fix For: 0.96.0
>
>         Attachments: Snapshots in HBase.docx
>
>
> Continuation of HBASE-50 for the current trunk. Since the implementation has 
> drastically changed, opening as a new ticket.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
