[ 
https://issues.apache.org/jira/browse/HBASE-29744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18048520#comment-18048520
 ] 

Hudson commented on HBASE-29744:
--------------------------------

Results for branch branch-2.6
        [build #404 on 
builds.a.o|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/]:
 (x) *{color:red}-1 overall{color}*
----
details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/General_20Nightly_20Build_20Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop3 checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop 3.3.5 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop 3.3.6 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk17 hadoop 3.4.0 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 jdk17 hadoop 3.4.1 backward compatibility checks{color}
-- For more information [see jdk17 
report|https://ci-hbase.apache.org/job/HBase%20Nightly/job/branch-2.6/404/JDK17_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test for HBase 2 {color}
(/) {color:green}+1 client integration test for 3.3.5 {color}
(/) {color:green}+1 client integration test for 3.3.6 {color}
(/) {color:green}+1 client integration test for 3.4.0 {color}
(/) {color:green}+1 client integration test for 3.4.1 {color}
(/) {color:green}+1 client integration test for 3.4.2 {color}


> Data loss scenario for WAL files belonging to RS added between backups
> ----------------------------------------------------------------------
>
>                 Key: HBASE-29744
>                 URL: https://issues.apache.org/jira/browse/HBASE-29744
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>            Reporter: Hernan Gelaf-Romer
>            Assignee: Hernan Gelaf-Romer
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.7.0, 3.0.0-beta-2, 2.6.5
>
>
> Incremental backups can fail with a FileNotFoundException when trying to 
> process Write-Ahead Log (WAL) files from RegionServers that were added to the 
> cluster after the last successful backup.
> The issue occurs in BackupLogCleaner.canDeleteFile(), which checks timestamp 
> boundaries (stored in the backup system table) to determine if WAL files are 
> safe to delete. When no boundary exists for a RegionServer address, the 
> cleaner incorrectly assumes that the WALs can safely be deleted and returns 
> true. This situation arises when a new RegionServer is added between backups. 
> The new server generates WAL files for tables, but since a backup has not yet 
> completed, no timestamp boundary for this server is recorded. As a result, 
> the cleaner may delete these WAL files before the next backup can process 
> them, leading to a FileNotFoundException.
>  
> Additionally, I believe this can lead to data loss.
>  
> When an incremental backup runs, 
> {{IncrementalBackupManager.getLogFilesForNewBackup()}} scans the filesystem 
> and builds a list of WAL files to back up, including files from newly added 
> RegionServers. Before the backup processes these files, {{BackupLogCleaner}} 
> runs concurrently and checks timestamp boundaries to determine which files 
> can be safely deleted. When it finds no timestamp boundary for a new server, 
> it incorrectly assumes the WALs are safe to delete and removes them.
> When the backup later tries to process the deleted files, 
> {{IncrementalTableBackupClient.filterMissingFiles()}} is called to validate 
> the file list. For each missing file, this method only logs a warning message 
> and silently excludes it from the backup. The backup then continues and 
> completes with a successful status, even though data from the deleted WAL 
> files was never backed up.
> This results in permanent data loss with no failure indication: the backup 
> appears successful, the source WAL files are permanently deleted, and the 
> only evidence is a warning message in the logs that may go unnoticed. The 
> data from those WAL files cannot be recovered because both the backup and the 
> source are missing that data.
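
The cleaner decision described above can be sketched as follows. This is a minimal, self-contained illustration and not the actual HBase code: the class WalCleanerSketch, its methods, and the in-memory map standing in for the backup system table are all hypothetical, and the "less than or equal" comparison against the boundary is an assumption. The second method shows one conservative alternative (keep WALs from servers with no recorded boundary); it is not necessarily what the committed fix does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; names and comparison rule are assumptions, not HBase APIs.
class WalCleanerSketch {
    // Stand-in for the per-RegionServer timestamp boundaries kept in the
    // backup system table: server address -> newest WAL timestamp backed up.
    private final Map<String, Long> boundaries = new HashMap<>();

    void recordBoundary(String serverAddress, long timestamp) {
        boundaries.put(serverAddress, timestamp);
    }

    // Behaviour described in the report: a server with no boundary
    // is treated as if its WALs were safe to delete.
    boolean canDeleteAsReported(String serverAddress, long walTimestamp) {
        Long boundary = boundaries.get(serverAddress);
        if (boundary == null) {
            return true; // new RegionServer: WAL deleted before any backup saw it
        }
        return walTimestamp <= boundary;
    }

    // Conservative alternative (assumption): with no boundary, no backup has
    // covered this server yet, so keep the WAL.
    boolean canDeleteConservative(String serverAddress, long walTimestamp) {
        Long boundary = boundaries.get(serverAddress);
        if (boundary == null) {
            return false;
        }
        return walTimestamp <= boundary;
    }

    public static void main(String[] args) {
        WalCleanerSketch cleaner = new WalCleanerSketch();
        cleaner.recordBoundary("rs1:16020", 100L); // rs1 existed at the last backup

        // rs2 joined after the last backup, so it has no boundary.
        System.out.println(cleaner.canDeleteAsReported("rs2:16020", 150L));
        System.out.println(cleaner.canDeleteConservative("rs2:16020", 150L));
    }
}
```

In this sketch the two methods differ only in the missing-boundary branch, which is the case the report identifies as the source of the FileNotFoundException and the silent data loss.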



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
