endzyme opened a new issue, #475: URL: https://github.com/apache/solr-operator/issues/475
### Summary

I tried backing up to S3 using IAM Role Assumption via Web Identity Tokens on EKS, and am getting errors. I tried with static AWS IAM credentials with the same policy and it works. There is an ominous `WARN` which may indicate why web identity token role assumption is not functioning.

* * *

### Details

I am experiencing some issues using the S3 Backup and Restore configuration using the Solr Operator. I am running the 8.11 Solr image and have configured the pods on our EKS cluster with the appropriate Kubernetes Service Account and the Service Account is annotated in the proper way with the IAM Role ARN.

There is an interesting warning message when attempting to perform a backup. The shard leader will emit the message below before every attempt:

```
WARN (OverseerThreadFactory-34-thread-2-processing-n:dev-8-blue-solrcloud-2.solr:80_solr) [c:test ] s.a.a.a.c.i.WebIdentityCredentialsUtils To use web identity tokens, the 'sts' service module must be on the class path.
```

When I configure the SolrCloud resource with static IAM credentials I can perform the backup, but with the assumed role via web identity token I am receiving a 403 from S3 (see error message below).

```
2022-09-15 15:17:21.941 WARN (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] s.a.a.a.c.i.WebIdentityCredentialsUtils To use web identity tokens, the 'sts' service module must be on the class path.
2022-09-15 15:17:24.447 ERROR (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] o.a.s.s.S3StorageClient An AmazonServiceException was thrown!
[serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null] [message=null]
2022-09-15 15:17:24.449 ERROR (OverseerThreadFactory-34-thread-1-processing-n:dev-8-blue-solrcloud-1.solr:80_solr) [c:test ] o.a.s.c.a.c.OverseerCollectionMessageHandler Collection: test operation: backup failed => org.apache.solr.s3.S3Exception: An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null] [message=null]
	at org.apache.solr.s3.S3StorageClient.handleAmazonException(S3StorageClient.java:598)
org.apache.solr.s3.S3Exception: An AmazonServiceException was thrown! [serviceName=S3] [awsRequestId=SNIP] [httpStatus=403] [s3ErrorCode=null] [message=null]
	at org.apache.solr.s3.S3StorageClient.handleAmazonException(S3StorageClient.java:598) ~[?:?]
	at org.apache.solr.s3.S3StorageClient.pathExists(S3StorageClient.java:314) ~[?:?]
	at org.apache.solr.s3.S3BackupRepository.exists(S3BackupRepository.java:200) ~[?:?]
	at org.apache.solr.cloud.api.collections.BackupCmd.createAndValidateBackupPath(BackupCmd.java:154) ~[?:?]
	at org.apache.solr.cloud.api.collections.BackupCmd.call(BackupCmd.java:94) ~[?:?]
	at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:271) ~[?:?]
	at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:524) ~[?:?]
	at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) ~[?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) ~[?:?]
	at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 403, Request ID: SNIP, Extended Request ID: SNIP)
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) ~[?:?]
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:106) ~[?:?]
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:84) ~[?:?]
	at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:42) ~[?:?]
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:95) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$6(BaseClientHandler.java:232) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:40) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:30) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:73) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:50) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:36) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:80) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[?:?]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) ~[?:?]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:48) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:31) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[?:?]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[?:?]
	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:193) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:167) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:82) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:175) ~[?:?]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76) ~[?:?]
	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) ~[?:?]
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:56) ~[?:?]
	at software.amazon.awssdk.services.s3.DefaultS3Client.headObject(DefaultS3Client.java:5080) ~[?:?]
	at software.amazon.awssdk.services.s3.S3Client.headObject(S3Client.java:9886) ~[?:?]
	at org.apache.solr.s3.S3StorageClient.pathExists(S3StorageClient.java:309) ~[?:?]
	... 9 more
```

Below are the things I've tried and observed:

* Tested the IAM Role itself for access to the target bucket
* Confirmed that the mutating webhook is in fact modifying the SolrCloud pods with the appropriate env vars and projected service account token volume mounts
* Confirmed that I can use those tokens to assume the role and get to the S3 bucket
* Tested with "static" IAM credentials via the `kubectl explain solrcloud.spec.backupRepositories.s3.credentials.credentialsFileSecret` configuration; the IAM user has the same policy as the IAM role, and this works for backups

The warning `the 'sts' service module must be on the class path` makes me think that something else needs to be loaded in the Solr modules before this will work. I have looked through the documentation and everything seems to indicate that, when using EKS, it's a supported use case to use IAM Roles through Service Accounts. The documentation also appears to indicate that I do not need to specify anything extra in modules or plugins for the SolrCloud K8s resource because they are autoloaded when providing backup configurations of S3.

Any help would be appreciated!
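
One way to test the classpath suspicion raised by the `WARN` is to look inside the JARs shipped with the Solr image for the STS classes. The following is a minimal sketch, not a confirmed diagnosis. It assumes the AWS SDK for Java v2 looks up `software.amazon.awssdk.services.sts.internal.StsWebIdentityCredentialsProviderFactory` before logging that warning (an assumption about SDK internals, not something stated in the logs above). The helper name `jars_containing` is hypothetical, and the directory to scan has to be supplied by whoever runs it.

```python
import zipfile
from pathlib import Path

# Assumption: the AWS SDK for Java v2 tries to load this class reflectively
# and logs the "'sts' service module must be on the class path" warning
# when it cannot be found.
STS_FACTORY_CLASS = (
    "software/amazon/awssdk/services/sts/internal/"
    "StsWebIdentityCredentialsProviderFactory.class"
)


def jars_containing(lib_dir, class_entry=STS_FACTORY_CLASS):
    """Return the sorted file names of JARs under lib_dir that contain class_entry."""
    matches = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as archive:
                if class_entry in archive.namelist():
                    matches.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files named *.jar that are not valid archives.
            continue
    return matches
```

Calling `jars_containing(path)` on a copy of the Solr library directories taken from a pod returns the names of any JARs that bundle the class. An empty list would be consistent with the warning, although it would not by itself prove that the 403 has the same cause.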