Vinayak Hegde created HBASE-29220:
-------------------------------------

             Summary: Track the Age/Timestamp of the Last Successfully 
Backed-Up WAL Entry in Continuous Backup Replication Endpoint
                 Key: HBASE-29220
                 URL: https://issues.apache.org/jira/browse/HBASE-29220
             Project: HBase
          Issue Type: Task
          Components: backup&restore
            Reporter: Vinayak Hegde


We use HBase’s replication framework for Continuous Backup through 
{{ContinuousBackupReplicationEndpoint}}. The endpoint replicates WAL entries to the 
backup location, where they are then used for Point-In-Time Recovery (PITR) and 
Incremental Backup (an optimization technique that collects WALs and generates 
HFiles for faster recovery).

However, the {{ReplicationEndpoint}} can lag behind the live cluster, so the 
backup location may not yet contain the most recent WAL entries.

For example, if replication is one hour behind, 
{{ContinuousBackupReplicationEndpoint}} will currently be writing WAL entries 
that are one hour old. This means that if a user requests a PITR for the 
current time or attempts an incremental backup, they will miss that one hour of 
data.
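
To make the one-hour example concrete, here is a minimal sketch of how the age of the last backed-up WAL entry could be computed. The class name {{BackupLagExample}}, the method {{backupAge}}, and the sample instants are hypothetical and are not part of HBase.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not an HBase class.
final class BackupLagExample {

  /**
   * Age of the newest backed-up WAL entry relative to "now".
   * Clamped to zero so that clock skew never yields a negative age.
   */
  static Duration backupAge(Instant lastBackedUpEntryTime, Instant now) {
    Duration age = Duration.between(lastBackedUpEntryTime, now);
    return age.isNegative() ? Duration.ZERO : age;
  }

  public static void main(String[] args) {
    // Arbitrary sample values: replication is one hour behind.
    Instant now = Instant.parse("2025-03-25T12:00:00Z");
    Instant lastBackedUp = Instant.parse("2025-03-25T11:00:00Z");
    System.out.println(backupAge(lastBackedUp, now).toMinutes()); // prints 60
  }
}
```

With these values, a PITR request for 12:00 would need WAL entries from the 11:00 to 12:00 window that have not reached the backup location yet.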

To prevent this, we need to ensure that users can only request data that has 
been fully backed up. Therefore, we must track the timestamp of the last 
successfully backed-up WAL entry:
 * For PITR: Users should only be allowed to restore to a point before this 
timestamp.
 * For Incremental Backup: The incremental backup process should store this 
timestamp as the backup time to maintain data consistency.
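
The two rules above could be sketched as follows. This is only an illustration under stated assumptions: {{LastBackedUpTracker}} and all of its methods are hypothetical names, not the actual {{ContinuousBackupReplicationEndpoint}} API, and the sketch assumes the endpoint reports each WAL entry's write time only after the entry is durably stored in the backup location.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical tracker for illustration only; not an HBase class.
final class LastBackedUpTracker {

  // Write time (epoch millis) of the newest backed-up WAL entry; -1 means none yet.
  private final AtomicLong lastBackedUpTs = new AtomicLong(-1L);

  /** Called after a WAL entry has been durably written to the backup location. */
  void onEntryBackedUp(long walEntryWriteTimeMs) {
    // Only ever move forward, even if entries are reported out of order.
    lastBackedUpTs.accumulateAndGet(walEntryWriteTimeMs, Math::max);
  }

  long lastBackedUpTimestamp() {
    return lastBackedUpTs.get();
  }

  /** PITR rule: the target must lie strictly before the last backed-up timestamp. */
  boolean isRestorable(long targetTimeMs) {
    long ts = lastBackedUpTs.get();
    return ts >= 0 && targetTimeMs < ts;
  }

  /** Incremental backup rule: record this timestamp as the backup time. */
  long incrementalBackupTime() {
    long ts = lastBackedUpTs.get();
    if (ts < 0) {
      throw new IllegalStateException("No WAL entry has been backed up yet");
    }
    return ts;
  }
}
```

The strict comparison in {{isRestorable}} follows the wording "before this timestamp". Whether a restore to exactly that timestamp should also be allowed is a design decision this sketch does not settle.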

This ensures data integrity and prevents users from requesting restores or 
backups that depend on WAL entries that have not yet been backed up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
