Vinayak Hegde created HBASE-29310:
-------------------------------------
Summary: Handle Bulk Load Operations in Continuous Backup and PITR Workflow
Key: HBASE-29310
URL: https://issues.apache.org/jira/browse/HBASE-29310
Project: HBase
Issue Type: Task
Components: backup&restore
Reporter: Vinayak Hegde
When bulk load operations are performed, the resulting files are backed up to
the backup location via the continuous backup process.
However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded
files efficiently can be challenging.
To address this, we propose the following guidelines for handling bulk loads in
the context of continuous backup and PITR:
* When a user performs a bulk load on any table under continuous backup and
PITR, they *must take a full or incremental backup* afterward. An incremental
backup is generally sufficient and faster.
*Required Changes*
# {*}Documentation Update{*}:
Add a note in the HBase Backup and Restore documentation explaining this rule
and its importance.
# {*}Logging Suggestion{*}:
After a bulk load operation completes, log a message suggesting the user
perform a full or incremental backup.
# {*}PITR Enhancements{*}:
** During PITR, check whether any bulk load operation occurred after the last
successful backup.
** If so (i.e., the bulk load is not covered by a later full or incremental
backup), inform the user and fail the process.
** If the user chooses to proceed (e.g., using a {{--force}} flag), continue
with a warning that the bulk-loaded files will not be part of the restored
table.
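
The PITR check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class {{PitrBulkLoadCheck}}, the {{Decision}} enum, and the {{decide}} method are hypothetical names, not existing HBase APIs. The caller is assumed to supply the timestamp of the latest bulk load at or before the restore target and the timestamp of the latest successful full or incremental backup.

```java
import java.util.OptionalLong;

/** Hypothetical sketch of the proposed PITR pre-check; not part of HBase. */
final class PitrBulkLoadCheck {

  /** Possible outcomes of the pre-check. */
  enum Decision { PROCEED, PROCEED_WITH_WARNING, FAIL }

  private PitrBulkLoadCheck() {
  }

  /**
   * @param lastBulkLoadTs timestamp (ms) of the most recent bulk load at or
   *                       before the restore target, or empty if there was none
   * @param lastBackupTs   timestamp (ms) of the most recent successful full or
   *                       incremental backup, or empty if there was none
   * @param force          whether the user passed the proposed --force flag
   */
  static Decision decide(OptionalLong lastBulkLoadTs, OptionalLong lastBackupTs,
      boolean force) {
    // No bulk load to worry about: PITR can proceed normally.
    if (!lastBulkLoadTs.isPresent()) {
      return Decision.PROCEED;
    }
    // A bulk load is uncovered if it happened after the last successful
    // backup, or if no backup exists at all.
    boolean uncovered = !lastBackupTs.isPresent()
        || lastBulkLoadTs.getAsLong() > lastBackupTs.getAsLong();
    if (!uncovered) {
      return Decision.PROCEED;
    }
    // Uncovered bulk load: fail unless the user explicitly forces the restore,
    // in which case the bulk-loaded files will be missing from the result.
    return force ? Decision.PROCEED_WITH_WARNING : Decision.FAIL;
  }
}
```

For example, {{decide(OptionalLong.of(200L), OptionalLong.of(100L), false)}} yields {{FAIL}} in this sketch, while the same call with {{force}} set to {{true}} yields {{PROCEED_WITH_WARNING}}.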