[ 
https://issues.apache.org/jira/browse/HBASE-14142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200128#comment-15200128
 ] 

Vladimir Rodionov commented on HBASE-14142:
-------------------------------------------

Moved to Phase 3.

> HBase Backup/Restore Phase 3: Cells deduplication during backup
> ---------------------------------------------------------------
>
>                 Key: HBASE-14142
>                 URL: https://issues.apache.org/jira/browse/HBASE-14142
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>
> Because we do not record the last backed-up sequence ids (MVCC) and do not 
> restore up to that sequence id (which is tricky to do), there will be some 
> duplicate KVs in store files after the first incremental restore that 
> follows a full backup. These duplicates result from how we do the full backup 
> and the first incremental backup after it. During a full backup we perform a 
> distributed log roll and record, for every RS, the last WAL timestamp; then we 
> take a snapshot. The next WAL after the recorded one will make it into the 
> next incremental backup set, but it will contain some edits (puts, deletes) 
> that were already captured by the preceding snapshot. During restore we first 
> restore the snapshot and then replay the WALs, and this replay can create 
> duplicate KVs in different store files.
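
The overlap described above can be illustrated with a minimal sketch. It is
not the HBase implementation: the Cell class and the dedupe method are
hypothetical stand-ins, and the sketch assumes that two KVs count as
duplicates when row, family, qualifier, timestamp and type all match.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CellDedupSketch {

    /** Hypothetical stand-in for a KV; this is not the real HBase class. */
    static final class Cell {
        final String row, family, qualifier, type, value;
        final long timestamp;

        Cell(String row, String family, String qualifier,
             long timestamp, String type, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.type = type;
            this.value = value;
        }

        /**
         * Coordinates assumed to identify a cell. The "/" separator is only
         * safe here because the sample fields never contain "/".
         */
        String key() {
            return String.join("/", row, family, qualifier,
                               Long.toString(timestamp), type);
        }
    }

    /**
     * Merges cells restored from the snapshot with cells replayed from the
     * WAL, keeping the first copy of each key and dropping later duplicates.
     */
    static List<Cell> dedupe(List<Cell> snapshotCells, List<Cell> walCells) {
        Map<String, Cell> unique = new LinkedHashMap<>();
        for (Cell c : snapshotCells) {
            unique.putIfAbsent(c.key(), c);
        }
        for (Cell c : walCells) {
            unique.putIfAbsent(c.key(), c);
        }
        return new ArrayList<>(unique.values());
    }

    public static void main(String[] args) {
        // Edits already captured by the snapshot taken during the full backup.
        List<Cell> snapshot = List.of(
            new Cell("r1", "cf", "q", 100L, "Put", "a"),
            new Cell("r2", "cf", "q", 101L, "Put", "b"));

        // The first WAL of the incremental set overlaps with the snapshot.
        List<Cell> wal = List.of(
            new Cell("r2", "cf", "q", 101L, "Put", "b"),   // duplicate
            new Cell("r3", "cf", "q", 102L, "Put", "c"));  // new edit

        List<Cell> restored = dedupe(snapshot, wal);
        System.out.println(restored.size()); // prints 3
    }
}
```

Without the dedupe step, a plain snapshot restore followed by WAL replay
would leave four cells in this example, with the r2 edit stored twice.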



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
