[ https://issues.apache.org/jira/browse/HBASE-14141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16249932#comment-16249932 ]
Vladimir Rodionov commented on HBASE-14141:
-------------------------------------------

{quote}
What happens if conversion to hfiles fails midway? I don't see cleanup (perhaps it is there – in failBackup, but we don't seem to pass the tmp dir name.... I see that incrementalCopyHFiles does cleanup... but don't see it in conversion of WAL to hfile).
{quote}
If the M/R job fails (returns non-zero), we throw an exception and fail the operation (failBackup).

{quote}
* Get list of WAL files eligible for incremental backup
What makes a WAL eligible for backup?
{quote}
All WAL files that have not yet been processed by the backup system are considered eligible for incremental backup.

{quote}
getLogFilesFromBackupSystem gets log files from backup table. Will this be a large set? Will it grow w/o bound?
{quote}
Yes, it can be large. There is no bound other than the TTL of the backup system table, which is 1 year by default but configurable. This should be mentioned explicitly in the docs, probably in a separate mini-section.

> HBase Backup/Restore Phase 3: Filter WALs on backup to include only edits from backed up tables
> -----------------------------------------------------------------------------------------------
>
> Key: HBASE-14141
> URL: https://issues.apache.org/jira/browse/HBASE-14141
> Project: HBase
> Issue Type: New Feature
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Priority: Blocker
> Labels: backup
> Fix For: 2.0.0
>
> Attachments: HBASE-14141.HBASE-14123.v1.patch, HBASE-14141.v1.patch, HBASE-14141.v2.patch, HBASE-14141.v4.patch, HBASE-14141.v5.patch, HBASE-14141.v6.patch
>
>
> h2. High level design overview
> * When an incremental backup request comes for tables {t}, we select all the tables already registered in the backup system - {T} and union them with {t}, which results in a new table set - U(t, T)
> * For every table K from U(t,T) we perform the following:
> ** Convert new WAL files into HFiles, applying table filter K (only edits for table K will pass the filter)
> ** Move these HFile(s) to the backup destination
> During restore (incremental):
> * We run full restore first
> * Then collect all HFiles from intermediate incremental images and run them through HFileSplitterJob, which splits the files along the current table's region boundaries
> * Load these files using the LoadIncrementalHFiles tool
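
The backup-side steps of the design overview above (build U(t, T), then filter WAL edits per table K) can be illustrated with a small sketch. This is not the HBase implementation, which runs as an M/R job over real WAL files; it is a toy in-memory model in Python, and every name in it (WalEdit, tables_to_back_up, split_edits_by_table) is hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class WalEdit(NamedTuple):
    """Hypothetical stand-in for one WAL entry: owning table plus a payload."""
    table: str
    payload: str


def tables_to_back_up(requested, registered):
    """U(t, T): the requested tables {t} united with the registered tables {T}."""
    return set(requested) | set(registered)


def split_edits_by_table(edits, tables):
    """Apply table filter K for every K in `tables`.

    Edits for tables outside the set are dropped; the rest are grouped
    per table, mirroring the 'one set of HFiles per table' idea.
    """
    per_table = defaultdict(list)
    for edit in edits:
        if edit.table in tables:
            per_table[edit.table].append(edit)
    return dict(per_table)


# Assumed example data: t1 is requested now, t1 and t2 were backed up before,
# and t3 has never been part of any backup.
requested = {"t1"}
registered = {"t1", "t2"}
wal = [
    WalEdit("t1", "put a"),
    WalEdit("t3", "put b"),
    WalEdit("t2", "delete c"),
]

backup_set = tables_to_back_up(requested, registered)
per_table = split_edits_by_table(wal, backup_set)
```

In this model the edit for t3 never reaches the backup destination, which is the filtering this issue is about: only edits from backed up tables are kept.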