[ 
https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15605689#comment-15605689
 ] 

Ted Yu commented on HBASE-14417:
--------------------------------

When running TestFullRestore, I found an interesting issue: bulk load can 
happen to the table which is fully backed up - when overwrite option is 
specified.

I can think of two ways to omit these bulk loaded files:
1. under backup zookeeper subtree, create znode with tables which are being 
restored to with overwrite option. This allows the postAppend() hook to skip 
these tables for the duration of the restore
2. at the end of bulk load, issue deletes against the znodes which are added 
during the bulk load

I am leaning toward first option.

> Incremental backup and bulk loading
> -----------------------------------
>
>                 Key: HBASE-14417
>                 URL: https://issues.apache.org/jira/browse/HBASE-14417
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 2.0.0
>            Reporter: Vladimir Rodionov
>            Assignee: Ted Yu
>            Priority: Critical
>              Labels: backup
>             Fix For: 2.0.0
>
>         Attachments: 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, 
> 14417.v2.txt, 14417.v6.txt
>
>
> Currently, incremental backup is based on WAL files. Bulk data loading 
> bypasses WALs for obvious reasons, breaking incremental backups. The only way 
> to continue backups after bulk loading is to create new full backup of a 
> table. This may not be feasible for customers who do bulk loading regularly 
> (say, every day).
> Google doc for design:
> https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to