[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274659#comment-16274659
 ] 

Josh Elser commented on HBASE-17852:
------------------------------------

bq. Hmm... I don't think we can publish backup/restore without HBASE-16391 in a 
2.0 release. I'd like to have confidence that the feature is rock solid before 
telling users that it's ok to use, parallel operations seems like a major 
shortcoming to me.

Let's dig in on this some more, [~mdrob]. B&R is much more of an 
"administrative function" as opposed to a "client feature". My general 
expectation would be that, at the most aggressive, HBase admins (a couple of 
people) would run incremental backups on the order of "hours", e.g. an 
incremental backup every 8 hours. I could see the extremely paranoid wanting to 
do incremental backups every hour over some collection of tables, which _could_ 
cause issues if we can only execute one backup operation at a time (I'm 
thinking along the lines of 3 backup sets, incremental backups every hour, 
merging of those backups every few hours, full backup every day, etc).

As such, my opinion differs in that I don't see the lack of concurrent backup 
operations being a major impediment for "most" users. I completely agree with 
you that there will be some users for whom this limitation would be problematic 
for how they want to use it, but, even for these edge cases, B&R without this 
would still have value to them. I think getting this feature into the hands of 
users (with extremely clear caveats on the current implementation) would 
actually better serve the feature than letting it fester more on JIRA. Thoughts?

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-17852
>                 URL: https://issues.apache.org/jira/browse/HBASE-17852
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, 
> HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, 
> HBASE-17852-v6.patch, HBASE-17852-v7.patch, HBASE-17852-v8.patch, 
> HBASE-17852-v9.patch
>
>
> Design approach (rollback-via-snapshot) implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta-table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by cleaning 
> up partial data in the backup destination and then restoring the backup 
> meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message saying that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> the BackupObservers), we introduce a small table used ONLY to keep the listing 
> of bulk loaded files. All backup observers will work only with this new table. 
> The reason: in case of a failure during backup create/delete/merge/restore, 
> when the system performs an automatic rollback, some data written by backup 
> observers during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and hence are not visible to the system.
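
The first two steps of the design approach above (snapshot before the operation, then clean up and restore on failure) can be sketched roughly as follows. This is only an in-memory illustration, not HBase code: `MetaTable`, `runWithRollback`, and the row keys are hypothetical names invented for the sketch, and a plain map copy stands in for a real snapshot of the backup system table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class RollbackViaSnapshotSketch {

    /** Hypothetical in-memory stand-in for the backup system table. */
    public static final class MetaTable {
        private Map<String, String> rows = new HashMap<>();

        public void put(String key, String value) { rows.put(key, value); }

        public String get(String key) { return rows.get(key); }

        /** Copy of the current rows; stands in for taking a table snapshot. */
        public Map<String, String> snapshot() { return new HashMap<>(rows); }

        /** Replace the current rows with the snapshot contents. */
        public void restore(Map<String, String> snap) { rows = new HashMap<>(snap); }
    }

    /**
     * Runs a backup operation (create/delete/merge) under snapshot protection.
     * On failure, partial data in the destination is cleaned up first and the
     * meta-table is then restored from the snapshot taken before the operation.
     *
     * @return true if the operation succeeded, false if it was rolled back
     */
    public static boolean runWithRollback(MetaTable meta,
                                          Consumer<MetaTable> operation,
                                          Runnable cleanupDestination) {
        // Step 1: snapshot the meta-table before the operation starts.
        Map<String, String> snap = meta.snapshot();
        try {
            operation.accept(meta);
            return true;
        } catch (RuntimeException e) {
            // Step 2: clean up partial data, then restore from the snapshot.
            cleanupDestination.run();
            meta.restore(snap);
            return false;
        }
    }

    public static void main(String[] args) {
        MetaTable meta = new MetaTable();
        meta.put("backup_1", "COMPLETE");

        // Simulate an operation that writes partial state and then fails.
        boolean ok = runWithRollback(
            meta,
            m -> {
                m.put("backup_2", "RUNNING");
                throw new IllegalStateException("simulated server-side failure");
            },
            () -> System.out.println("cleaning up partial data in destination"));

        System.out.println("succeeded: " + ok);
        System.out.println("backup_2 row after rollback: " + meta.get("backup_2"));
    }
}
```

The sketch also hints at why the bulk load listing is kept in a separate table: anything another writer added to the protected table during the failed operation would be discarded by `restore`.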



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
