[ https://issues.apache.org/jira/browse/SPARK-38934 ]
Lily updated SPARK-38934:
-------------------------
    Priority: Major  (was: Critical)

> Provider TemporaryAWSCredentialsProvider has no credentials
> -----------------------------------------------------------
>
>                 Key: SPARK-38934
>                 URL: https://issues.apache.org/jira/browse/SPARK-38934
>             Project: Spark
>          Issue Type: Bug
>          Components: Kubernetes, Spark Core
>    Affects Versions: 3.2.1
>            Reporter: Lily
>            Priority: Major
>
> We are using JupyterHub on K8s as a notebook-based development environment and Spark on K8s as its backend cluster, running Spark 3.2.1 and Hadoop 3.3.1.
> When we run code like the snippet below in JupyterHub,
> {code:java}
> // obtain temporary AWS credentials for an assumed role via AWS STS
> val perm = ...
>
> // set the temporary AWS credentials on the Hadoop configuration
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", perm.credential.accessKeyID)
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", perm.credential.secretAccessKey)
> spark.sparkContext.hadoopConfiguration.set("fs.s3a.session.token", perm.credential.sessionToken)
>
> // execute a simple Spark action
> spark.read.format("parquet").load("s3a://<path>/*").show(1)
> {code}
> the first few executors log a warning like the one below during the first execution, but we still get the correct result thanks to Spark's task retry mechanism.
> {code:java}
> 22/04/18 09:13:50 WARN TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) (10.197.5.15 executor 1): java.nio.file.AccessDeniedException: s3a://<path>/<file>.parquet: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
>     at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:206)
>     at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:117)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:2810)
>     at org.apache.spark.util.HadoopFSUtils$.listLeafFiles(HadoopFSUtils.scala:225)
>     at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$6(HadoopFSUtils.scala:136)
>     at scala.collection.immutable.Stream.map(Stream.scala:418)
>     at org.apache.spark.util.HadoopFSUtils$.$anonfun$parallelListLeafFilesInternal$4(HadoopFSUtils.scala:126)
>     at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2(RDD.scala:863)
>     at org.apache.spark.rdd.RDD.$anonfun$mapPartitions$2$adapted(RDD.scala:863)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>     at org.apache.spark.scheduler.Task.run(Task.scala:131)
>     at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
>     at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>     at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>     at java.base/java.lang.Thread.run(Thread.java:829)
> Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Provider TemporaryAWSCredentialsProvider has no credentials
>     at org.apache.hadoop.fs.s3a.auth.AbstractSessionCredentialsProvider.getCredentials(AbstractSessionCredentialsProvider.java:130)
>     at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:177)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1266)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:842)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:792)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:713)
>     at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:695)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:559)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:539)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5445)
>     at com.amazonaws.services.s3.AmazonS3Client.getBucketRegionViaHeadRequest(AmazonS3Client.java:6420)
>     at com.amazonaws.services.s3.AmazonS3Client.fetchRegionFromCache(AmazonS3Client.java:6393)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5430)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5392)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5386)
>     at com.amazonaws.services.s3.AmazonS3Client.listObjectsV2(AmazonS3Client.java:971)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$listObjects$7(S3AFileSystem.java:2116)
>     at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:489)
>     at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:412)
>     at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:375)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.listObjects(S3AFileSystem.java:2107)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem$ListingOperationCallbacksImpl.lambda$listObjectsAsync$0(S3AFileSystem.java:1750)
>     at org.apache.hadoop.fs.s3a.impl.CallableSupplier.get(CallableSupplier.java:62)
>     at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
>     ... 3 more
> {code}
> Could you explain why this warning occurs and how we can prevent it from happening again?
> Thank you in advance.
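> For reference, below is a minimal sketch of one variant we can think of: supplying the same S3A settings through Spark's spark.hadoop.* configuration prefix when the SparkSession is built, so that they are copied into the Hadoop configuration before any job runs. The sketch reuses the perm STS helper from the snippet above and is untested against this issue:
> {code:java}
> import org.apache.spark.sql.SparkSession
>
> // Build the session with the S3A credentials in the Spark conf itself;
> // Spark copies every "spark.hadoop.*" entry into the Hadoop Configuration.
> val spark = SparkSession.builder()
>   .appName("s3a-temporary-credentials")
>   .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
>   .config("spark.hadoop.fs.s3a.access.key", perm.credential.accessKeyID)       // perm: our STS helper, as above
>   .config("spark.hadoop.fs.s3a.secret.key", perm.credential.secretAccessKey)
>   .config("spark.hadoop.fs.s3a.session.token", perm.credential.sessionToken)
>   .getOrCreate()
> {code}
> (The spark.hadoop.* prefix is standard Spark behavior for forwarding entries into the Hadoop Configuration, and the fs.s3a.* keys are the standard S3A property names from the Hadoop documentation.)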
--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org