Dieter De Paepe created HBASE-29434:
---------------------------------------
Summary: Incremental backups fail during mergeSplitBulkloads
Key: HBASE-29434
URL: https://issues.apache.org/jira/browse/HBASE-29434
Project: HBase
Issue Type: Bug
Affects Versions: 2.6.2, 3.0.0, 4.0.0-alpha-1
Reporter: Dieter De Paepe
{code:java}
2025-06-30 08:24:48,857 ERROR org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient: Failed to run MapReduceHFileSplitterJob
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://hdfsns/tmp/backup/hbase/.tmp/backup_1751271394481/lily_ngdata/CUSTOMER/data already exists
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:164)
    at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:278)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677)
    at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674)
    at java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:525)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1674)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1695)
    at org.apache.hadoop.hbase.backup.mapreduce.MapReduceHFileSplitterJob.run(MapReduceHFileSplitterJob.java:171)
    at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.mergeSplitBulkloads(IncrementalTableBackupClient.java:219)
    at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.mergeSplitBulkloads(IncrementalTableBackupClient.java:203)
    at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.handleBulkLoad(IncrementalTableBackupClient.java:174)
    at org.apache.hadoop.hbase.backup.impl.IncrementalTableBackupClient.execute(IncrementalTableBackupClient.java:309)
    at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:594)
{code}
Looking at the code, I think this is due to
"IncrementalTableBackupClient#mergeSplitBulkloads" calling
"mergeSplitBulkloads(List<String> files, TableName tn)" once for the archived
files and once for the non-archived files. If both lists are non-empty, both
calls use the same output folder, so the second MapReduceHFileSplitterJob
fails its output check because that folder already exists.
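
To make the suspected collision concrete, here is a minimal sketch that does not use HBase or Hadoop at all. It assumes the behaviour described above: each non-empty file list triggers one job run, and the job refuses to start if its output directory exists. The class `OutputDirCollisionSketch`, the methods `runSplitterJob` and `processBoth`, and the `active`/`archived` subdirectory names are all hypothetical and exist only for illustration. The `separateDirs` flag shows one possible way to avoid the clash; it is not a proposed patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class OutputDirCollisionSketch {

    // Hypothetical stand-in for a job run: like the output check in the
    // stack trace, it refuses to start if the output directory exists.
    static void runSplitterJob(List<String> files, Path outputDir) throws IOException {
        if (Files.exists(outputDir)) {
            throw new FileAlreadyExistsException(
                "Output directory " + outputDir + " already exists");
        }
        Files.createDirectories(outputDir);
        for (String name : files) {
            Files.createFile(outputDir.resolve(name));
        }
    }

    // Runs one job per non-empty list (an assumption about the real code).
    // Returns true if every job ran, false if a job hit an existing output dir.
    static boolean processBoth(List<String> active, List<String> archived,
                               boolean separateDirs) {
        try {
            Path data = Files.createTempDirectory("bulkload-sketch").resolve("data");
            try {
                if (!active.isEmpty()) {
                    runSplitterJob(active, separateDirs ? data.resolve("active") : data);
                }
                if (!archived.isEmpty()) {
                    runSplitterJob(archived, separateDirs ? data.resolve("archived") : data);
                }
                return true;
            } catch (FileAlreadyExistsException e) {
                return false;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<String> active = List.of("a.hfile");
        List<String> archived = List.of("b.hfile");
        System.out.println(processBoth(active, archived, false));    // shared dir
        System.out.println(processBoth(List.of(), archived, false)); // one list empty
        System.out.println(processBoth(active, archived, true));     // separate dirs
    }
}
```

In this sketch only the first call in `main` fails, which matches the report that the problem needs both lists to be non-empty. Whether separate output directories would suit the later steps of an incremental backup is not something this sketch establishes.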