[ https://issues.apache.org/jira/browse/SPARK-24271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490474#comment-16490474 ]

Steve Loughran commented on SPARK-24271:
----------------------------------------

Disabling the S3A filesystem cache can be pretty inefficient, as every worker talking to a 
bucket will create a new filesystem instance, each with its own AWS thread pool and other resources.
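
For context, "disabling the cache" here means setting Hadoop's per-scheme cache switch for s3a; a minimal sketch of the setting this issue works around:
{code}
# force a new S3AFileSystem instance on every FileSystem.get() for s3a:// URLs
fs.s3a.impl.disable.cache=true
{code}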

If you are trying to use different credentials for different buckets, you can use 
[per-bucket 
configuration|https://hadoop.apache.org/docs/r3.1.0/hadoop-aws/tools/hadoop-aws/index.html#Configuring_different_S3_buckets_with_Per-Bucket_Configuration] instead:
{code}

fs.s3a.bucket.myfirstbucket.access.key=AAAAA
fs.s3a.bucket.myfirstbucket.secret.key=XXXX

fs.s3a.bucket.backups.access.key=BBBBB
fs.s3a.bucket.backups.secret.key=YYYYY
{code}
The same pattern works for other options, such as the endpoint.
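For example, pointing a single bucket at the endpoint from this issue's report (the bucket name here is just a placeholder):
{code}
fs.s3a.bucket.backups.endpoint=objectstorage:8773
fs.s3a.bucket.backups.connection.ssl.enabled=false
{code}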

These can all coexist in the same configuration file; I'd recommend 
spark-defaults.conf rather than code, as with code it's all too easy to 
accidentally commit your secrets somewhere public like GitHub.
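
In spark-defaults.conf the options need the spark.hadoop. prefix, which tells Spark to copy them into the Hadoop configuration; a sketch with the same placeholder keys as above:
{code}
# spark-defaults.conf: spark.hadoop.* entries are propagated to sc.hadoopConfiguration
spark.hadoop.fs.s3a.bucket.myfirstbucket.access.key=AAAAA
spark.hadoop.fs.s3a.bucket.myfirstbucket.secret.key=XXXX
spark.hadoop.fs.s3a.bucket.backups.access.key=BBBBB
spark.hadoop.fs.s3a.bucket.backups.secret.key=YYYYY
{code}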

> sc.hadoopConfigurations can not be overwritten in the same spark context
> ------------------------------------------------------------------------
>
>                 Key: SPARK-24271
>                 URL: https://issues.apache.org/jira/browse/SPARK-24271
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell
>    Affects Versions: 2.3.0
>            Reporter: Jami Malikzade
>            Priority: Major
>
> If, for example, we pass the following configs to the Spark context:
> sc.hadoopConfiguration.set("fs.s3a.access.key", "correctAK")
> sc.hadoopConfiguration.set("fs.s3a.secret.key", "correctSK")
> sc.hadoopConfiguration.set("fs.s3a.endpoint", "objectstorage:8773")
> sc.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
> sc.hadoopConfiguration.set("fs.s3a.connection.ssl.enabled", "false")
> we can then read from the bucket, so the behavior is as expected.
> If, in the same SparkContext, I change the credentials to wrong ones and try to read 
> from the bucket, it still works; and vice versa: if the credentials were wrong, 
> changing them to working ones does not help.
> sc.hadoopConfiguration.set("fs.s3a.access.key", "wrongAK")
> sc.hadoopConfiguration.set("fs.s3a.secret.key", "wrongSK")


