[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275203#comment-16275203
 ] 

Vladimir Rodionov commented on HBASE-17852:
-------------------------------------------

[~appy],

# Only the Admin user can run backups; therefore, there is no need to run 
multiple backups in parallel. Admin can run them in a single backup command.
# Restore can be run in parallel with other commands. That is an artificial 
limitation and can be removed easily. It means Admin can run a backup session 
and multiple restore sessions in parallel. Personally, I do not see or 
anticipate strong demand for multiple backup sessions in parallel. I advise 
you to go through the doc; you will find it easy to work around parallel 
sessions by combining them into a single one, [~appy].
# There are no issues with cross-RPC in the backup case, because the RPC call 
is a single hop and, hence, deadlock-free.
# Failure of BackupObserver to record a bulk loaded file will result in bulk 
load failure - yes. *But I do not see an alternative here*, do you? We need to 
record *all bulk loaded file names and store them persistently before the bulk 
load operation completes*. Do you have an idea how this can be achieved w/o 
failing the bulk load itself and w/o touching hbase core code? 

The only thing I agree with here is support for parallel deletes and merges; 
if we introduce this support, we can easily add multiple backup session 
support for free.

Personally, I was very impressed that you guys spent so much time looking for 
design and implementation flaws this week, literally as time was running out. 
Good job. Why didn't you do this a couple of months ago? 




> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-17852
>                 URL: https://issues.apache.org/jira/browse/HBASE-17852
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, 
> HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, 
> HBASE-17852-v6.patch, HBASE-17852-v7.patch, HBASE-17852-v8.patch, 
> HBASE-17852-v9.patch
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before backup create/delete/merge starts, we take a snapshot of the backup 
> meta-table (backup system table). This procedure is lightweight because the 
> meta table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring 
> the backup meta-table from the snapshot. 
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries create/merge/delete they will see an 
> error message saying that the system is in an inconsistent state and repair 
> is required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client 
> and the BackupObservers), we introduce a small table ONLY to keep the listing 
> of bulk loaded files. All backup observers will work only with this new 
> table. The reason: in case of a failure during backup 
> create/delete/merge/restore, when the system performs an automatic rollback, 
> some data written by backup observers during the failed operation may be 
> lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system. 
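
The first two steps of the quoted design (snapshot the meta-table before the 
operation, restore it when the operation fails) can be sketched roughly as 
follows. This is only an illustration under stated assumptions: 
BackupMetaTable, RollbackViaSnapshot, BackupOperation and runWithRollback are 
hypothetical names, the table is an in-memory map rather than an HBase table, 
and none of this is the code from the attached patches.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for the backup system table (not HBase API).
class BackupMetaTable {
    private Map<String, String> rows = new HashMap<>();

    void put(String key, String value) { rows.put(key, value); }

    String get(String key) { return rows.get(key); }

    // Copying every row is acceptable here because the table is assumed small.
    Map<String, String> snapshot() { return new HashMap<>(rows); }

    // Discard the current rows and go back to the snapshotted state.
    void restore(Map<String, String> snapshot) { rows = new HashMap<>(snapshot); }
}

class RollbackViaSnapshot {
    // A create/delete/merge step that may fail partway through its updates.
    interface BackupOperation {
        void run(BackupMetaTable meta) throws Exception;
    }

    // Returns true on success; on failure restores the snapshot and returns false.
    static boolean runWithRollback(BackupMetaTable meta, BackupOperation op) {
        Map<String, String> snapshot = meta.snapshot();
        try {
            op.run(meta);
            return true;
        } catch (Exception e) {
            // A real implementation would also clean up partial data in the
            // backup destination before restoring the meta-table.
            meta.restore(snapshot);
            return false;
        }
    }
}
```

In this sketch, a write made by a different writer between snapshot() and 
restore() would be discarded by the rollback, which is the situation the 
separate bulk load table described above is meant to avoid.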



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
