[ 
https://issues.apache.org/jira/browse/HBASE-17852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275107#comment-16275107
 ] 

Appy commented on HBASE-17852:
------------------------------

bq. To try to move the conversation forward, I tend to agree with Vlad that I 
don't see an inherent problem with the rollback-via-snapshot implementation

The inherent problem with the rollback-via-snapshot approach is that one 
operation takes an "exclusive lock" on the backup meta table, and in a very 
weird way.
It's weird because:
1) It behaves like an exclusive lock only in certain cases. (We restore only on 
failure, i.e. the exclusion kicks in only on failures. That leads to the 
waterfall of issues mentioned below.)
2) Some operations on that table follow "exclusion" semantics (via locking a 
row), while others do not.

As a result, we see many problems:
1) A separate table for incremental backup data: the problem is not that 
there's a different table, that's fine, but the reason which led to it.
2) You can't run any other command in parallel! No restores (data loss, 
services are down, everything is on fire, but there's a cron job taking a 
backup, so I can't do a thing!?), no merges, no deletes (prod cluster, running 
out of space, and I have to wait for the backup before I can free up space?). 
That's just absurd.
3) Other successful commands are silently rolled back. If an operator 
adds/removes/deletes backup sets, those changes are gone when a totally 
unrelated operation fails!
4) During a restore, the backup table goes offline, so a cron job that attempts 
a backup at that moment fails.
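To make (3) concrete, here is a toy model in plain Python (not HBase code; all names are made up for illustration) of how a snapshot-based rollback silently erases an unrelated, successful write:

```python
# Toy model of the backup meta table and its snapshot/restore rollback.
# Plain Python; the class and method names are illustrative, not HBase APIs.

class BackupMetaTable:
    def __init__(self):
        self.rows = {}

    def put(self, row, value):
        self.rows[row] = value

    def snapshot(self):
        return dict(self.rows)      # stands in for an HBase table snapshot

    def restore(self, snap):
        self.rows = dict(snap)      # wholesale restore: clobbers ALL later writes

meta = BackupMetaTable()

# A backup-create starts and takes its rollback snapshot.
snap = meta.snapshot()

# Meanwhile, an operator successfully adds a backup set (a separate command).
meta.put("backupset:reports", ["table1", "table2"])

# The backup-create fails on the server side and rolls back via the snapshot.
meta.restore(snap)

# The operator's successful change is gone, and nobody is told.
print("backupset:reports" in meta.rows)   # -> False
```

A row-level lock would have made the two commands visibly conflict; the snapshot approach lets both "succeed" and then quietly discards one.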

Others:
- And then there are the issues around the cross-RS RPC made from the observer 
during bulk load. Was the alternative suggested yesterday considered in the 
design? Was any alternative considered at all?
- (Ref: bulk loads) Backups are very important, but more important is users 
being able to load their data and use it. Preventing users from working with 
their data by putting backup in the load path and failing everything when 
backup doesn't work is plain wrong. Find a different way to back up bulk-loaded 
data without affecting the core read/write paths.
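The coupling objected to here can be sketched in a few lines of toy Python (hypothetical function names, not the actual bulk-load code path):

```python
# Toy sketch of the objection above: if backup bookkeeping sits directly on
# the bulk-load path, a backup failure fails the user's load too.
# All names here are illustrative, not real HBase/backup APIs.

def record_bulk_load_for_backup(files):
    # Simulate the backup system table being unavailable.
    raise IOError("backup system table unavailable")

def bulk_load_coupled(files):
    record_bulk_load_for_backup(files)   # backup failure aborts the load
    return "loaded"

def bulk_load_decoupled(files, pending):
    # Defer the bookkeeping on failure instead of failing the user's load.
    try:
        record_bulk_load_for_backup(files)
    except IOError:
        pending.append(list(files))      # retry later, out of the load path
    return "loaded"

pending = []
print(bulk_load_decoupled(["hfile-1"], pending))   # -> loaded
try:
    bulk_load_coupled(["hfile-1"])
except IOError as e:
    print("load failed:", e)             # the user's load fails outright
```

In the decoupled sketch the load succeeds and the missed bookkeeping is queued for later; in the coupled one the user's data is held hostage by the backup system's health.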

So, I'd say, there are many things inherently broken in the current design.

Strong -1 on shipping it unless these are fixed.

> Add Fault tolerance to HBASE-14417 (Support bulk loaded files in incremental 
> backup)
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-17852
>                 URL: https://issues.apache.org/jira/browse/HBASE-17852
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>             Fix For: 2.0.0
>
>         Attachments: HBASE-17852-v1.patch, HBASE-17852-v2.patch, 
> HBASE-17852-v3.patch, HBASE-17852-v4.patch, HBASE-17852-v5.patch, 
> HBASE-17852-v6.patch, HBASE-17852-v7.patch, HBASE-17852-v8.patch, 
> HBASE-17852-v9.patch
>
>
> The rollback-via-snapshot design approach implemented in this ticket:
> # Before a backup create/delete/merge starts, we take a snapshot of the 
> backup meta-table (the backup system table). This procedure is lightweight 
> because the meta table is small and usually fits in a single region.
> # When an operation fails on the server side, we handle the failure by 
> cleaning up partial data in the backup destination, followed by restoring the 
> backup meta-table from the snapshot.
> # When an operation fails on the client side (abnormal termination, for 
> example), the next time the user tries a create/merge/delete they will see an 
> error message that the system is in an inconsistent state and repair is 
> required; they will need to run the backup repair tool.
> # To avoid multiple writers to the backup system table (the backup client and 
> BackupObservers), we introduce a small table ONLY to keep the listing of bulk 
> loaded files. All backup observers will work only with this new table. The 
> reason: in case of a failure during backup create/delete/merge/restore, when 
> the system performs automatic rollback, some data written by backup observers 
> during the failed operation may be lost. This is what we try to avoid.
> # The second table keeps only bulk load related references. We do not care 
> about the consistency of this table, because bulk load is an idempotent 
> operation and can be repeated after a failure. Partially written data in the 
> second table does not affect the BackupHFileCleaner plugin, because this data 
> (the list of bulk loaded files) corresponds to files which have not yet been 
> loaded successfully and, hence, are not visible to the system.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
