[ https://issues.apache.org/jira/browse/HBASE-7912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13585053#comment-13585053 ]
Jing Chen He commented on HBASE-7912:
-------------------------------------

Hi, Ted

Since this depends on snapshot (offline and online), 0.96 is the hoped-for target release.

> HBase Backup/Restore Based on HBase Snapshot and FileLink
> ---------------------------------------------------------
>
>                 Key: HBASE-7912
>                 URL: https://issues.apache.org/jira/browse/HBASE-7912
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Richard Ding
>            Assignee: Richard Ding
>
> There have been attempts in the past to come up with a viable HBase
> backup/restore solution (e.g., HBASE-4618). Recently, there have been many
> advancements and new features in HBase, for example, FileLink, Snapshot, and
> Distributed Barrier Procedure. This is a proposal for a backup/restore
> solution that utilizes these new features to achieve better performance and
> consistency.
>
> A common practice for backup and restore in databases is to first take a full
> baseline backup, and then periodically take incremental backups that capture
> the changes since the full baseline backup. An HBase cluster can store a
> massive amount of data. Combining full backups with incremental backups has
> tremendous benefit for HBase as well. The following is a typical scenario
> for full and incremental backup.
> # The user takes a full backup of a table or a set of tables in HBase.
> # The user schedules periodic incremental backups to capture the changes
> from the full backup, or from the last incremental backup.
> # The user needs to restore table data to a past point in time.
> # The full backup is restored to the table(s) or to different table name(s).
> Then the incremental backups that are up to the desired point in time are
> applied on top of the full backup.
> We would support the following key features and capabilities.
> * Full backup uses HBase snapshot to capture HFiles.
> * Use HBase WALs to capture incremental changes, but use bulk load of
> HFiles for fast incremental restore.
> * Support single table or a set of tables, and column family level backup and
> restore.
> * Restore to different table names.
> * Support adding tables or column families (CFs) to the backup set without
> interrupting the incremental backup schedule.
> * Support rollup/combining of incremental backups into bigger incremental
> backups that cover longer periods.
> * Unified command line interface for all the above.
> The solution will support HBase backup to FileSystem, either on the same
> cluster or across clusters. It has the flexibility to support backup to
> other devices and servers in the future.
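
The restore scenario in the description (restore a full backup, then apply the incremental backups up to the desired point in time) can be sketched as a small selection routine. This is only an illustration, not part of the proposal and not an HBase API: the `Backup` record, its fields, and `restore_plan` are hypothetical names, and timestamps are plain integers. It assumes each backup records the time up to which it captured changes, so a restore lands on a backup boundary at or before the requested time.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    """Hypothetical catalog entry; not an HBase class."""
    backup_id: str
    kind: str      # "full" or "incremental"
    end_ts: int    # changes up to this time are captured by the backup


def restore_plan(backups: List[Backup], target_ts: int) -> List[Backup]:
    """Return the backups to apply, in order, to reach target_ts.

    The plan starts with the most recent full backup taken at or before
    target_ts, followed by every later incremental backup that also ends
    at or before target_ts, oldest first.
    """
    fulls = [b for b in backups if b.kind == "full" and b.end_ts <= target_ts]
    if not fulls:
        raise ValueError("no full backup at or before the requested time")
    base = max(fulls, key=lambda b: b.end_ts)

    incrementals = sorted(
        (
            b for b in backups
            if b.kind == "incremental" and base.end_ts < b.end_ts <= target_ts
        ),
        key=lambda b: b.end_ts,
    )
    return [base] + incrementals


if __name__ == "__main__":
    catalog = [
        Backup("full-1", "full", 100),
        Backup("inc-1", "incremental", 200),
        Backup("inc-2", "incremental", 300),
        Backup("full-2", "full", 400),
        Backup("inc-3", "incremental", 500),
    ]
    for b in restore_plan(catalog, 350):
        print(b.backup_id, b.kind, b.end_ts)
```

With the sample catalog, a restore to time 350 selects `full-1` and then `inc-1` and `inc-2`; a restore to time 450 selects only `full-2`, since a newer full backup makes the earlier incrementals unnecessary.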