[ https://issues.apache.org/jira/browse/HBASE-28568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dieter De Paepe updated HBASE-28568:
------------------------------------
    Description: 
The logic in BackupAdminImpl#finalizeDelete does not properly clean up tables from the incrementalBackupTableSet (the set of tables to include in every incremental backup). This can lead to backups failing.

Minimal example to reproduce from source:
 * Add the following to {{conf/hbase-site.xml}} to enable backups:
{code:xml}
<property>
  <name>hbase.backup.enable</name>
  <value>true</value>
</property>
<property>
  <name>hbase.master.logcleaner.plugins</name>
  <value>org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveProcedureWALCleaner,org.apache.hadoop.hbase.master.cleaner.TimeToLiveMasterLocalStoreWALCleaner,org.apache.hadoop.hbase.backup.master.BackupLogCleaner</value>
</property>
<property>
  <name>hbase.procedure.master.classes</name>
  <value>org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager</value>
</property>
<property>
  <name>hbase.procedure.regionserver.classes</name>
  <value>org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.backup.BackupObserver</value>
</property>
<property>
  <name>hbase.fs.tmp.dir</name>
  <value>file:/tmp/hbase-tmp</value>
</property>
{code}
 * Start HBase: {{bin/start-hbase.sh}}
 * Run the following commands:
{code:bash}
echo "create 'table1', 'cf'" | bin/hbase shell -n
echo "create 'table2', 'cf'" | bin/hbase shell -n
bin/hbase backup create full file:/tmp/hbasebackups -t table1
bin/hbase backup create full file:/tmp/hbasebackups -t table2
bin/hbase backup create incremental file:/tmp/hbasebackups
# Deletes the 2 most recent backups
bin/hbase backup delete -l $(bin/hbase backup history | head -n1 | tail -n -1 | grep -o -P "backup_\d+"),$(bin/hbase backup history | head -n2 | tail -n -1 | grep -o -P "backup_\d+")
bin/hbase backup create incremental file:/tmp/hbasebackups -t table1
bin/hbase backup history
{code}
 * The output shows that the incremental backup still includes table2, while it should only include table1:
{code}
{ID=backup_1715000053763,Type=INCREMENTAL,Tables={table2,table1},State=COMPLETE,Start time=Mon May 06 14:54:14 CEST 2024,End time=Mon May 06 14:54:16 CEST 2024,Progress=100%}
{ID=backup_1715000031407,Type=FULL,Tables={table1},State=COMPLETE,Start time=Mon May 06 14:53:52 CEST 2024,End time=Mon May 06 14:53:54 CEST 2024,Progress=100%}
{code}
PR will follow soon.

(Edited: my original ticket included a stacktrace of an IllegalStateException from a PR for HBASE-28562)


> Incremental backup set does not correctly shrink
> ------------------------------------------------
>
>                 Key: HBASE-28568
>                 URL: https://issues.apache.org/jira/browse/HBASE-28568
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>    Affects Versions: 2.6.0, 3.0.0
>            Reporter: Dieter De Paepe
>            Priority: Major
>
> The logic in BackupAdminImpl#finalizeDelete does not properly clean up tables
> from the incrementalBackupTableSet (the set of tables to include in every
> incremental backup).
> This can lead to backups failing.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
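For context, the expected shrinking behavior can be modeled outside HBase with plain collections. This is a hypothetical sketch: {{IncrementalSetModel}} and {{BackupInfo}} below are illustrative stand-ins, not the real HBase types that BackupAdminImpl#finalizeDelete operates on. The idea it demonstrates: after a delete, the incremental backup table set should be recomputed as the union of tables still covered by the remaining backups, so a table whose backups were all deleted drops out of the set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Stand-alone model of the table-set bookkeeping; not the real HBase classes. */
public class IncrementalSetModel {

  /** A backup covers a set of tables (stand-in for backup metadata). */
  record BackupInfo(String id, Set<String> tables) {}

  /**
   * Recompute the incremental backup table set from the backups that
   * survive a delete: only tables still covered by at least one
   * remaining backup stay in the set.
   */
  static Set<String> shrink(List<BackupInfo> remaining) {
    Set<String> tables = new TreeSet<>();
    for (BackupInfo b : remaining) {
      tables.addAll(b.tables());
    }
    return tables;
  }

  public static void main(String[] args) {
    List<BackupInfo> all = new ArrayList<>(List.of(
        new BackupInfo("backup_full_t1", Set.of("table1")),
        new BackupInfo("backup_full_t2", Set.of("table2")),
        new BackupInfo("backup_incr", Set.of("table1", "table2"))));

    // Delete the two most recent backups, as in the reproduction steps above.
    all.removeIf(b -> b.id().equals("backup_full_t2") || b.id().equals("backup_incr"));

    // Only table1 is still covered by a backup, so the incremental set
    // must shrink to {table1}; the reported bug leaves table2 behind.
    System.out.println(shrink(all)); // prints [table1]
  }
}
```

Under this model, the reproduction's failing state corresponds to skipping the recomputation after the delete, so table2 lingers in the set even though no backup covering it remains.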