Vinayak Hegde created HBASE-29310:
-------------------------------------

             Summary: Handle Bulk Load Operations in Continuous Backup and PITR 
Workflow
                 Key: HBASE-29310
                 URL: https://issues.apache.org/jira/browse/HBASE-29310
             Project: HBase
          Issue Type: Task
          Components: backup&restore
            Reporter: Vinayak Hegde


When bulk load operations are performed, the resulting files are backed up to 
the backup location via the continuous backup process.

However, during Point-In-Time Recovery (PITR), restoring these bulk-loaded 
files efficiently can be challenging.

To address this, we propose the following guidelines for handling bulk loads in 
the context of continuous backup and PITR:
 * When a user performs a bulk load on any table under continuous backup and 
PITR, they *must take a full or incremental backup* afterward. An incremental 
backup is generally sufficient and faster.

*Required Changes*
 # {*}Documentation Update{*}:
Add a note in the HBase Backup and Restore documentation explaining this rule 
and its importance.

 # {*}Logging Suggestion{*}:
After a bulk load operation completes, log a message suggesting the user 
perform a full or incremental backup.

 # {*}PITR Enhancements{*}:

 ** During PITR, check whether any bulk load operation occurred after the last 
successful backup.

 ** If such a bulk load exists and no full or incremental backup was taken 
after it, inform the user and fail the process.

 ** If the user chooses to proceed (e.g., using a {{--force}} flag), continue 
with a warning that the bulk-loaded files will not be part of the restored 
table.
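
The PITR check proposed above could be sketched roughly as follows. This is 
only an illustration of the decision logic, not existing HBase code: the class 
name {{PitrBulkLoadCheck}}, the {{Decision}} enum, and all parameters are 
hypothetical. Limiting the check to bulk loads at or before the restore target 
time is also an assumption that the proposal does not state.

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical sketch of the pre-restore bulk load check; not an HBase API. */
final class PitrBulkLoadCheck {

    /** Possible outcomes of the check. */
    enum Decision { PROCEED, PROCEED_WITH_WARNING, FAIL }

    /**
     * @param bulkLoadTimestamps completion times (ms) of bulk loads on the table
     * @param lastBackupTs       completion time (ms) of the last successful full or
     *                           incremental backup, or empty if there is none
     * @param restoreToTs        PITR target time (ms); assumed to bound the check
     * @param force              true if the user passed a --force style flag
     */
    static Decision check(List<Long> bulkLoadTimestamps,
                          OptionalLong lastBackupTs,
                          long restoreToTs,
                          boolean force) {
        // With no backup at all, every bulk load counts as not covered.
        long covered = lastBackupTs.orElse(Long.MIN_VALUE);

        // A bulk load is uncovered if it finished after the last backup
        // and falls within the window being restored.
        boolean uncovered = bulkLoadTimestamps.stream()
                .anyMatch(ts -> ts > covered && ts <= restoreToTs);

        if (!uncovered) {
            return Decision.PROCEED;
        }
        // With force, continue but warn that bulk-loaded files will be missing
        // from the restored table; otherwise fail the restore.
        return force ? Decision.PROCEED_WITH_WARNING : Decision.FAIL;
    }
}
```

In this sketch a caller would map {{FAIL}} to an error message asking the user 
to take a full or incremental backup first, and {{PROCEED_WITH_WARNING}} to a 
logged warning before the restore continues.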



--
This message was sent by Atlassian Jira
(v8.20.10#820010)