Hello Team,

Does Spark support role-based authentication and access to Amazon S3 when 
deployed on Kubernetes?
Note: we have deployed our Spark application in a Kubernetes cluster.

Below are the Hadoop-AWS dependencies we are using:
<dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-aws</artifactId>
   <version>3.3.4</version>
</dependency>

We are using the following configuration when creating the SparkSession, but 
it is not working:
org.apache.hadoop.conf.Configuration hadoopConf =
    sparkSession.sparkContext().hadoopConfiguration();

hadoopConf.set("fs.s3a.aws.credentials.provider",
    "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
hadoopConf.set("fs.s3a.assumed.role.arn",
    System.getenv("AWS_ROLE_ARN"));
hadoopConf.set("fs.s3a.assumed.role.credentials.provider",
    "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
hadoopConf.set("fs.s3a.assumed.role.sts.endpoint",
    "s3.eu-central-1.amazonaws.com");
hadoopConf.set("fs.s3a.assumed.role.sts.endpoint.region",
    Regions.EU_CENTRAL_1.getName());
hadoopConf.set("fs.s3a.web.identity.token.file",
    System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));
hadoopConf.set("fs.s3a.assumed.role.session.duration",
    "30m");
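For reference, the same settings can be collected into a single map and applied in one loop, which makes it easier to see the full set of S3A keys at a glance. This is a minimal, self-contained sketch: the class and method names are my own, and the placeholder ARN and token path stand in for the real `AWS_ROLE_ARN` / `AWS_WEB_IDENTITY_TOKEN_FILE` environment values from the pod.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: groups the S3A properties from the snippet above into one map.
// The role ARN and token-file path are parameters here; on the cluster they
// come from the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE env variables.
public class S3aAssumedRoleSettings {

    public static Map<String, String> build(String roleArn, String tokenFile) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("fs.s3a.aws.credentials.provider",
                "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
        conf.put("fs.s3a.assumed.role.arn", roleArn);
        conf.put("fs.s3a.assumed.role.credentials.provider",
                "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");
        conf.put("fs.s3a.assumed.role.sts.endpoint",
                "s3.eu-central-1.amazonaws.com");
        // Same value that Regions.EU_CENTRAL_1.getName() resolves to.
        conf.put("fs.s3a.assumed.role.sts.endpoint.region", "eu-central-1");
        conf.put("fs.s3a.web.identity.token.file", tokenFile);
        conf.put("fs.s3a.assumed.role.session.duration", "30m");
        return conf;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        Map<String, String> conf = build(
                "arn:aws:iam::123456789012:role/example-role",
                "/var/run/secrets/eks.amazonaws.com/serviceaccount/token");
        conf.forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

With a map like this, applying the configuration is one line: `build(roleArn, tokenFile).forEach(hadoopConf::set);`.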

Thank you!

Regards,
Atul
