[ https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344404#comment-16344404 ]
Josh Elser commented on HBASE-17852: ------------------------------------ bq. I did say it was okay to go in master, but that's like 4 days after the commit - 2018-01-16T12:46:19-0800 I'm not messing with you, [~appy]. Check the push logs/comments on the other JIRA issue.. I swear to you that I did not push this until after I heard back from you. My guess is that this is due to me using git-am or cherry picking a commit from a local branch. > Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental > backup) > ------------------------------------------------------------------------------------ > > Key: HBASE-17852 > URL: https://issues.apache.org/jira/browse/HBASE-17852 > Project: HBase > Issue Type: Sub-task > Reporter: Vladimir Rodionov > Assignee: Vladimir Rodionov > Priority: Major > Fix For: 3.0.0 > > Attachments: HBASE-17852-v10.patch, screenshot-1.png > > > Design approach rollback-via-snapshot implemented in this ticket: > # Before backup create/delete/merge starts we take a snapshot of the backup > meta-table (backup system table). This procedure is lightweight because meta > table is small, usually should fit a single region. > # When operation fails on a server side, we handle this failure by cleaning > up partial data in backup destination, followed by restoring backup > meta-table from a snapshot. > # When operation fails on a client side (abnormal termination, for example), > next time user will try create/merge/delete he(she) will see error message, > that system is in inconsistent state and repair is required, he(she) will > need to run backup repair tool. > # To avoid multiple writers to the backup system table (backup client and > BackupObserver's) we introduce small table ONLY to keep listing of bulk > loaded files. All backup observers will work only with this new tables. The > reason: in case of a failure during backup create/delete/merge/restore, when > system performs automatic rollback, some data written by backup observers > during failed operation may be lost. This is what we try to avoid. > # Second table keeps only bulk load related references. We do not care about > consistency of this table, because bulk load is idempotent operation and can > be repeated after failure. Partially written data in second table does not > affect on BackupHFileCleaner plugin, because this data (list of bulk loaded > files) correspond to a files which have not been loaded yet successfully and, > hence - are not visible to the system -- This message was sent by Atlassian JIRA (v7.6.3#76005)