[ https://issues.apache.org/jira/browse/HBASE-19441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16280819#comment-16280819 ]
Appy commented on HBASE-19441:
------------------------------

bq. Remember that backups are client driven (per some design review from a long time ago), so queuing is tough to reason about (we have no "centralized" execution system to use).

We do have a centralized execution system: procedureV2. Queueing is entirely possible.
We have a good amount of time before 2.1; wouldn't it be reasonable to move things to the procedure framework?
The benefits would be:
The backup operation can avoid using BackupMetaTable to maintain intermediate state (since procs have support for state, WAL, recovery on crash, etc.). That in turn may help get rid of the snapshot-restore of the backup table, which means multiple backups can progress in parallel.

> Implement retry logic around starting exclusive backup operation
> ----------------------------------------------------------------
>
>                 Key: HBASE-19441
>                 URL: https://issues.apache.org/jira/browse/HBASE-19441
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>            Reporter: Josh Elser
>             Fix For: 3.0.0
>
>
> {quote}
> Specifically, the client does a checkAndPut to specific coordinates in the
> backup table and throws an exception when that fails. Remember that backups
> are client driven (per some design review from a long time ago), so queuing
> is tough to reason about (we have no "centralized" execution system to use).
> At a glance, it seems pretty straightforward to add some retry/backoff
> semantics to BackupSystemTable#startBackupExclusiveOperation().
> {quote}
> While we are in a state in which backup operations cannot be executed in
> parallel, it would be nice to provide some retry logic + configuration. This
> would save users from having to build this themselves.
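
For illustration, the retry/backoff semantics discussed in the issue description could look roughly like the sketch below. It is not the HBase implementation. The class `ExclusiveStartRetry`, the `Attempt` interface, and the parameters `maxAttempts`, `initialSleepMs` and `maxSleepMs` are hypothetical names chosen for this example. `Attempt` stands in for the checkAndPut against the backup table and is assumed to return false when another backup already holds the exclusive marker.

```java
import java.io.IOException;

/** Hypothetical sketch: retry an exclusive-start attempt with capped exponential backoff. */
public class ExclusiveStartRetry {

  /** Stand-in for the checkAndPut attempt; returns true when the exclusive marker was acquired. */
  @FunctionalInterface
  public interface Attempt {
    boolean tryAcquire() throws IOException;
  }

  /**
   * Calls the attempt up to maxAttempts times, sleeping between failed attempts.
   * The sleep starts at initialSleepMs, doubles after each failure, and is capped at maxSleepMs.
   * Returns true as soon as one attempt succeeds, false if every attempt fails.
   */
  public static boolean startWithRetry(Attempt attempt, int maxAttempts,
      long initialSleepMs, long maxSleepMs) throws IOException, InterruptedException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long sleepMs = Math.min(initialSleepMs, maxSleepMs);
    for (int i = 1; i <= maxAttempts; i++) {
      if (attempt.tryAcquire()) {
        return true;
      }
      // Do not sleep after the final failed attempt.
      if (i < maxAttempts) {
        Thread.sleep(sleepMs);
        sleepMs = Math.min(maxSleepMs, sleepMs * 2);
      }
    }
    return false;
  }
}
```

A caller would presumably read the attempt count and sleep bounds from configuration and throw the existing exception when `startWithRetry` returns false, so the default behaviour stays the same for users who do not opt in.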