[ https://issues.apache.org/jira/browse/HBASE-27542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jarryd Lee updated HBASE-27542:
-------------------------------
    Description: 
During the completion step of incremental backups, the [TableBackupClient|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java#L343-L355] ensures distcp logs are cleaned up. However, [DistCp|https://github.com/apache/hadoop/blob/b87c0ea7ebde3edc312dcc8938809610a914df7f/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java#L465-L476] already ensures that the metafolder, where the distcp logs are stored, is cleaned up via the [CopyCommitter|https://github.com/apache/hadoop/blob/b743d56eb4bf350448bd315540fde4f029175082/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java#L189].

Additionally, when running in a YARN cluster external to HBase, the TableBackupClient cleanup causes issues because it assumes a common filesystem.

The TableBackupClient cleanup method should be safe to remove.

e: After digging into this further, the `_distcp_log` path used in the `TableBackupClient` is not actually the correct path used by the

  was:
During the completion step of incremental backups, the [TableBackupClient|https://github.com/apache/hbase/blob/2c3abae18aa35e2693b64b143316817d4569d0c3/hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/TableBackupClient.java#L343-L355] ensures distcp logs are cleaned up. However, [DistCp|https://github.com/apache/hadoop/blob/b87c0ea7ebde3edc312dcc8938809610a914df7f/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/DistCp.java#L465-L476] already ensures that the metafolder, where the distcp logs are stored, is cleaned up via the [CopyCommitter|https://github.com/apache/hadoop/blob/b743d56eb4bf350448bd315540fde4f029175082/hadoop-tools/hadoop-distcp/src/main/java/org/apache/hadoop/tools/mapred/CopyCommitter.java#L189].

Additionally, when running in a YARN cluster external to HBase, the TableBackupClient cleanup causes issues because it assumes a common filesystem.

The TableBackupClient cleanup method should be safe to remove.


> Remove unneeded distcp log cleanup after incremental backups
> ------------------------------------------------------------
>
>                 Key: HBASE-27542
>                 URL: https://issues.apache.org/jira/browse/HBASE-27542
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>    Affects Versions: 3.0.0-alpha-3
>            Reporter: Jarryd Lee
>            Priority: Minor