Hello Team,


Does Spark support role-based authentication and access to Amazon S3 when
deployed on Kubernetes?

*Note: we have deployed our Spark application in a Kubernetes cluster.*



Below is the Hadoop-AWS dependency we are using:

<dependency>
   <groupId>org.apache.hadoop</groupId>
   <artifactId>hadoop-aws</artifactId>
   <version>3.3.4</version>
</dependency>
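
For completeness, hadoop-aws 3.3.4 is built against aws-java-sdk-bundle
1.12.262 (per the hadoop-aws 3.3.4 pom), so the matching bundle should be on
the classpath as well:

<dependency>
   <groupId>com.amazonaws</groupId>
   <artifactId>aws-java-sdk-bundle</artifactId>
   <version>1.12.262</version>
</dependency>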



We are using the following configuration when creating the Spark session,
but it does not work:

import org.apache.hadoop.conf.Configuration;
import com.amazonaws.regions.Regions;

Configuration conf = sparkSession.sparkContext().hadoopConfiguration();

// Use the assumed-role provider for all S3A access, assuming the role
// whose ARN is injected into the pod environment.
conf.set("fs.s3a.aws.credentials.provider",
    "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider");
conf.set("fs.s3a.assumed.role.arn", System.getenv("AWS_ROLE_ARN"));

// Credentials used for the STS AssumeRole call itself; the web-identity
// provider reads the token file named by AWS_WEB_IDENTITY_TOKEN_FILE.
conf.set("fs.s3a.assumed.role.credentials.provider",
    "com.amazonaws.auth.WebIdentityTokenCredentialsProvider");

// Note: this property must point at the regional STS endpoint, not S3.
conf.set("fs.s3a.assumed.role.sts.endpoint", "sts.eu-central-1.amazonaws.com");
conf.set("fs.s3a.assumed.role.sts.endpoint.region",
    Regions.EU_CENTRAL_1.getName());

conf.set("fs.s3a.web.identity.token.file",
    System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"));
conf.set("fs.s3a.assumed.role.session.duration", "30m");



Thank you!



Regards,

Atul
