[
https://issues.apache.org/jira/browse/HBASE-28103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dieter De Paepe updated HBASE-28103:
------------------------------------
Component/s: backup&restore
> HBase backup repair stuck after failed delete due to missing S3 credentials
> ---------------------------------------------------------------------------
>
> Key: HBASE-28103
> URL: https://issues.apache.org/jira/browse/HBASE-28103
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Reporter: Dieter De Paepe
> Priority: Major
>
> I was experimenting with what happens when a user executes `hbase backup
> delete` without providing S3 credentials.
> I started with a backup present in an S3 bucket.
>
> {noformat}
> hbase backup history
> {ID=backup_1695226626227,Type=FULL,Tables={foo:bar},State=COMPLETE,Start time=Wed Sep 20 16:17:09 UTC 2023,End time=Wed Sep 20 16:17:42 UTC 2023,Progress=100%}
> {noformat}
> I then tried to delete this backup without providing S3 credentials; it
> failed, as expected.
>
>
> {noformat}
> hbase backup delete -l backup_1695226626227
> 23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.impl.BackupAdminImpl: Delete operation failed, please run backup repair utility to restore backup system integrity
> java.nio.file.AccessDeniedException: s3a://backuprestore-experiments/hbase/backup_1695226626227: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
> at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:215)
> at org.apache.hadoop.fs.s3a.Invoker.onceInTheFuture(Invoker.java:190)
> at org.apache.hadoop.fs.s3a.Listing$ObjectListingIterator.next(Listing.java:651)
> at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.requestNextBatch(Listing.java:430)
> at org.apache.hadoop.fs.s3a.Listing$FileStatusListingIterator.<init>(Listing.java:372)
> at org.apache.hadoop.fs.s3a.Listing.createFileStatusListingIterator(Listing.java:143)
> at org.apache.hadoop.fs.s3a.Listing.getFileStatusesAssumingNonEmptyDir(Listing.java:264)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.innerListStatus(S3AFileSystem.java:3369)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$22(S3AFileSystem.java:3346)
> at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listStatus$23(S3AFileSystem.java:3345)
> at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
> at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
> at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2480)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2499)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:3344)
> at org.apache.hadoop.hbase.backup.util.BackupUtils.listStatus(BackupUtils.java:522)
> at org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupHLogDir(BackupUtils.java:430)
> at org.apache.hadoop.hbase.backup.util.BackupUtils.cleanupBackupData(BackupUtils.java:411)
> at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackup(BackupAdminImpl.java:229)
> at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.deleteBackups(BackupAdminImpl.java:142)
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.executeDeleteListOfBackups(BackupCommands.java:627)
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$DeleteCommand.execute(BackupCommands.java:578)
> at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
> at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
> at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> Caused by: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
> at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:216)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1269)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:845)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:794)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
> at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6432)
> at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6404)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5441)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5397)
> at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$12(S3AFileSystem.java:2715)
> at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
> at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528)
> at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468)
> at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:431)
> at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2706)
> at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:2342)
> at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:87)
> at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
> at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
> at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
> at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
> at com.amazonaws.auth.EnvironmentVariableCredentialsProvider.getCredentials(EnvironmentVariableCredentialsProvider.java:49)
> at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
> ... 28 more
> Delete command FAILED. Please run backup repair tool to restore backup system integrity
> 23/09/20 16:18:46 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.nio.file.AccessDeniedException: s3a://backuprestore-experiments/hbase/backup_1695226626227: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: [same exception and stack trace as logged above by BackupAdminImpl; elided]
> {noformat}
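The AccessDeniedException above bottoms out in the AWS SDK's EnvironmentVariableCredentialsProvider. As an illustration (not part of the original report), a small shell check for this precondition, using the standard AWS variable names quoted in the error message:

```shell
# Check whether the AWS credential environment variables named in the error
# message are set; without them, the S3A filesystem calls made by
# `hbase backup delete` against an s3a:// URL fail with AccessDeniedException.
have_aws_env_credentials() {
  [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]
}

if have_aws_env_credentials; then
  echo "credentials present"
else
  echo "credentials missing"
fi
```

(For temporary credentials, AWS_SESSION_TOKEN must additionally be set, matching the TemporaryAWSCredentialsProvider at the head of the provider chain.)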
> At this point, I cannot start a new backup, because the failed delete
> command is still recorded:
>
>
> {noformat}
> hbase backup \
> -libjars /opt/hadoop/share/hadoop/tools/lib/hadoop-aws-3.3.6-1-lily.jar,/opt/hadoop/share/hadoop/tools/lib/aws-java-sdk-bundle-1.12.367.jar \
> -Dfs.s3a.access.key=... \
> -Dfs.s3a.secret.key=... \
> -Dfs.s3a.session.token=... \
> create incremental s3a://backuprestore-experiments/hbase -t foo:bar
> Found failed backup DELETE coommand.
> Backup system recovery is required.
> 23/09/20 16:31:16 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.io.IOException: Failed backup DELETE found, aborted command execution
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$Command.execute(BackupCommands.java:167)
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$CreateCommand.execute(BackupCommands.java:309)
> at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
> at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
> at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> {noformat}
> However, the backup repair is unable to complete:
>
>
> {noformat}
> hbase backup repair
> REPAIR status: no failed sessions found. Checking failed delete backup operation ...
> Found failed DELETE operation for: backup_1695226626227
> Running DELETE again ...
> 23/09/20 16:34:13 WARN org.apache.hadoop.hbase.backup.impl.BackupSystemTable: Could not restore backup system table. Snapshot snapshot_backup_system does not exists.
> 23/09/20 16:34:13 ERROR org.apache.hadoop.hbase.backup.BackupDriver: Error running command-line tool
> java.io.IOException: There is no active backup exclusive operation
> at org.apache.hadoop.hbase.backup.impl.BackupSystemTable.finishBackupExclusiveOperation(BackupSystemTable.java:645)
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.repairFailedBackupDeletionIfAny(BackupCommands.java:721)
> at org.apache.hadoop.hbase.backup.impl.BackupCommands$RepairCommand.execute(BackupCommands.java:681)
> at org.apache.hadoop.hbase.backup.BackupDriver.parseAndRun(BackupDriver.java:134)
> at org.apache.hadoop.hbase.backup.BackupDriver.doWork(BackupDriver.java:169)
> at org.apache.hadoop.hbase.backup.BackupDriver.run(BackupDriver.java:199)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:82)
> at org.apache.hadoop.hbase.backup.BackupDriver.main(BackupDriver.java:177)
> {noformat}
> The core issue seems to be the assumption that a "backup exclusive
> operation" exists for every failed delete command.
> It would also be useful if the repair command could discard the pending
> delete, though I guess that in some cases this may not result in a reliable
> state if data was already partially deleted.
> I guess the workaround in this case is to remove the failed delete command
> entry from the backup system table?
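A hedged sketch of that workaround (illustration only, not verified: `backup:system` is the default backup system table name, and the exact row key that records the failed DELETE would have to be identified by scanning the table first):

```shell
# Hypothetical workaround sketch - requires a running HBase cluster with the
# backup feature enabled; the marker row key below is a placeholder.
command -v hbase >/dev/null 2>&1 || { echo "hbase CLI not on PATH"; exit 0; }

# Inspect the backup system table to locate the failed-DELETE marker row:
echo "scan 'backup:system'" | hbase shell -n

# Remove the marker row once identified (placeholder row key, left commented):
# echo "deleteall 'backup:system', '<failed-delete-marker-row>'" | hbase shell -n

# Re-run the repair to confirm the backup system state is clean again:
hbase backup repair
```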
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)