[ https://issues.apache.org/jira/browse/HBASE-28103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dieter De Paepe updated HBASE-28103:
------------------------------------
    Component/s: backup&restore

> HBase backup repair stuck after failed delete due to missing S3 credentials
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-28103
>                 URL: https://issues.apache.org/jira/browse/HBASE-28103
>             Project: HBase
>          Issue Type: Bug
>          Components: backup&restore
>            Reporter: Dieter De Paepe
>            Priority: Major
>
> I was experimenting with what happens if a user executes `hbase backup 
> delete` without providing S3 credentials.
> I started with a backup present in an S3 bucket.
>  
> {noformat}
> hbase backup history
> {ID=backup_1695226626227,Type=FULL,Tables={foo:bar},State=COMPLETE,Start time=Wed Sep 20 16:17:09 UTC 2023,End time=Wed Sep 20 16:17:42 UTC 2023,Progress=100%}
> {noformat}
> I tried to delete this backup without providing S3 credentials; it failed 
> (as expected).
>  
>  
> {noformat}
> hbase backup delete -l backup_1695226626227
> 23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.impl.BackupAdminImpl: Delete operation failed, please run backup repair utility to restore backup system integrity
> java.nio.file.AccessDeniedException: s3a://backuprestore-experiments/hbase/backup_1695226626227: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
>     at org.apache.hadoop.fs.s3a.Invoker.onceInTheFuture(Invoker.java:190)
>     at org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:651)
>     at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:430)
>     at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:372)
>     at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:143)
>     at org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:264)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3369)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$22(S3AFileSystem.java:3346)
>     at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$23(S3AFileSystem.java:3345)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2480)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2499)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3344)
>     at org.apache.hadoop.hbase.backup.util.BackupUtils.listStatus(BackupUtils.java:522)
>     at org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupHLogDir(BackupUtils.java:430)
>     at org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupBackupData(BackupUtils.java:411)
>     at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackup(BackupAdminImpl.java:229)
>     at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackups(BackupAdminImpl.java:142)
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.executeDeleteListOfBackups(BackupCommands.java:627)
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.execute(BackupCommands.java:578)
>     at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
>     at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
>     at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
>     at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:216)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1269)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:845)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:794)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
>     at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6432)
>     at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6404)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5441)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5397)
>     at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$12(S3AFileSystem.java:2715)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
>     at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
>     at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2706)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2342)
>     at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
>     at com.amazonaws.auth.EnvironmentVariableCredentialsProvider.getCredentials(EnvironmentVariableCredentialsProvider.java:49)
>     at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
>     ... 28 more
> Delete command FAILED. Please run backup repair tool to restore backup system integrity
> 23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.nio.file.AccessDeniedException: s3a://backuprestore-experiments/hbase/backup_1695226626227: (same NoAuthWithAWSException, causes, and stack trace as above)
> {noformat}
> At this point, I cannot start a new backup because a failed delete command is 
> present:
>  
>  
> {noformat}
> hbase backup \
>   -libjars /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-3.3.6-1-lily.jar,/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-bundle-1.12.367.jar \
>   -Dfs.s3a.access.key=... \
>   -Dfs.s3a.secret.key=... \
>   -Dfs.s3a.session.token=... \
>   create incremental s3a://backuprestore-experiments/hbase -t foo:bar
> Found failed backup DELETE coommand. 
> Backup system recovery is required.
> 23/09/20 16:31:16 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.io.IOException: Failed backup DELETE found, aborted command execution
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$Command.execute(BackupCommands.java:167)
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:309)
>     at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
>     at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
>     at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> {noformat}
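> 
> For reference, the abort above comes from a pre-flight check in 
> BackupCommands$Command.execute (BackupCommands.java:167 in the trace). A rough 
> paraphrase of that guard, not verbatim HBase source; the accessor name is my 
> assumption:
>  
> {noformat}
> // Rough paraphrase of the guard (not verbatim source): any new command is
> // aborted while the backup system table still records a failed DELETE.
> try (BackupSystemTable table = new BackupSystemTable(conn)) {
>   // Assumed accessor name for the backup ids recorded by the failed DELETE.
>   String[] failedDeleteIds = table.getListOfBackupIdsFromDeleteOperation();
>   if (failedDeleteIds != null && failedDeleteIds.length > 0) {
>     throw new IOException("Failed backup DELETE found, aborted command execution");
>   }
> }
> {noformat}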
> However, the suggested backup repair is itself unable to complete:
>  
>  
> {noformat}
> hbase backup repair
> REPAIR status: no failed sessions found. Checking failed delete backup operation ...
> Found failed DELETE operation for: backup_1695226626227
> Running DELETE again ...
> 23/09/20 16:34:13 WARN org.apache.hadoop.hbase.backup.impl.BackupSystemTable: Could not restore backup system table. Snapshot snapshot_backup_system does not exists.
> 23/09/20 16:34:13 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.io.IOException: There is no active backup exclusive operation
>     at org.apache.hadoop.hbase.backup.impl.BackupSystemTable.finishBackupExclusiveOperation(BackupSystemTable.java:645)
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.repairFailedBackupDeletionIfAny(BackupCommands.java:721)
>     at org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.execute(BackupCommands.java:681)
>     at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
>     at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
>     at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
>     at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> {noformat}
> The core issue seems to be that the repair command assumes an active "backup 
> exclusive operation" exists for every failed delete command; when none is 
> registered, as here, finishBackupExclusiveOperation throws and the repair 
> itself aborts.
> A good feature would also be to allow the repair command to simply discard 
> the pending delete, though I guess that in some cases this may not leave a 
> reliable state if data was already partially deleted.
> The workaround in this case would presumably be to remove the failed-delete 
> record from the backup system table by hand, along these lines:
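>  
> {noformat}
> # Hypothetical, untested sketch of that manual cleanup. 'backup:system' is
> # the default backup system table name; the marker row key below is a guess,
> # so locate the real row with the scan before deleting anything.
> hbase shell
> scan 'backup:system'
> deleteall 'backup:system', 'delete_op_row'   # assumed row key of the marker
> {noformat}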
>  


