[ 
https://issues.apache.org/jira/browse/SPARK-13979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen George updated SPARK-13979:
---------------------------------
    Description: 
I'm having a problem where respawning a failed executor during a job that 
reads/writes parquet on S3 causes subsequent tasks to fail because of missing 
AWS keys.

h4. Setup:

I'm using Spark 1.5.2 with Hadoop 2.7 and running experiments on a simple 
standalone cluster:

1 master
2 workers

My application runs on the master machine, while the two workers run on two 
other machines (one worker per machine). All machines are in EC2. I've 
configured my setup so that my application executes its tasks on two 
executors (one executor per worker).

h4. Application:

My application reads and writes parquet files on S3. I set the AWS keys on the 
Hadoop configuration of the {{SparkContext}}:

val sc = new SparkContext()
val hadoopConf = sc.hadoopConfiguration
hadoopConf.set("fs.s3n.awsAccessKeyId", "SOME_KEY")
hadoopConf.set("fs.s3n.awsSecretAccessKey", "SOME_SECRET")

After this I do no further configuration and simply use {{sc}}.
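
For comparison, here is a minimal sketch of another way to supply the same keys. Spark copies any {{SparkConf}} entry whose name starts with {{spark.hadoop.}} into the Hadoop configuration it builds, so the keys can be set on the {{SparkConf}} before the context is created. This is only an assumption about a possible workaround. It has not been verified against the executor-respawn failure described in this issue.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumption: the keys are passed as spark.hadoop.* properties instead of
// being set on sc.hadoopConfiguration after the context exists.
// "SOME_KEY" and "SOME_SECRET" are the same placeholders used above.
val conf = new SparkConf()
  .set("spark.hadoop.fs.s3n.awsAccessKeyId", "SOME_KEY")
  .set("spark.hadoop.fs.s3n.awsSecretAccessKey", "SOME_SECRET")

// Master URL and application name are expected to come from spark-submit.
val sc = new SparkContext(conf)
```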

h4. Issue:

I can read and write parquet files without a problem with this setup. *BUT* if 
an executor dies during a job and is respawned by a worker, tasks fail with the 
following error:

"Caused by: java.lang.IllegalArgumentException: AWS Access Key ID and Secret 
Access Key must be specified as the username or password (respectively) of a 
s3n URL, or by setting the {{fs.s3n.awsAccessKeyId}} or 
{{fs.s3n.awsSecretAccessKey}} properties (respectively)."

h4. Basic analysis:

I think I've traced this down to the following:

{{SparkHadoopUtil}} is initialized with an empty {{SparkConf}}. Later, classes 
like {{DataSourceStrategy}} simply call {{SparkHadoopUtil.get.conf}} and use 
the Hadoop {{Configuration}} built from this empty {{SparkConf}}, which is 
missing the properties I set on {{sc.hadoopConfiguration}}. It's unclear to me 
why this is done; as written, it seems that any caller that uses 
{{SparkHadoopUtil.get.conf}} directly would get broken results.
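
The suspected mechanism can be illustrated without Spark. The sketch below uses two plain Hadoop {{Configuration}} objects as stand-ins (the names {{driverSide}} and {{independent}} are hypothetical): a property set on one instance is not visible on another instance that was constructed separately. Whether this is exactly what happens inside {{SparkHadoopUtil}} is the hypothesis above, not something the sketch proves.

```scala
import org.apache.hadoop.conf.Configuration

// Stand-in for sc.hadoopConfiguration, where the application sets its keys.
val driverSide = new Configuration()
driverSide.set("fs.s3n.awsAccessKeyId", "SOME_KEY")
driverSide.set("fs.s3n.awsSecretAccessKey", "SOME_SECRET")

// Stand-in for a Configuration that is built independently,
// for example from a freshly constructed SparkConf.
val independent = new Configuration()

// The first lookup returns the value set above.
println(driverSide.get("fs.s3n.awsAccessKeyId"))

// The second lookup returns null unless a site file on the classpath
// happens to define the property.
println(independent.get("fs.s3n.awsAccessKeyId"))
```

This can be pasted into {{spark-shell}} or any Scala REPL that has the Hadoop common classes on its classpath.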


> Killed executor is respawned without AWS keys in standalone spark cluster
> -------------------------------------------------------------------------
>
>                 Key: SPARK-13979
>                 URL: https://issues.apache.org/jira/browse/SPARK-13979
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.5.2
>         Environment: I'm using Spark 1.5.2 with Hadoop 2.7 and running 
> experiments on a simple standalone cluster:
> 1 master
> 2 workers
> All ubuntu 14.04 with Java 8/Scala 2.10
>            Reporter: Allen George


