Re: Programmatic Spark 1.2.0 on EMR | S3 filesystem is not working when using

2015-02-02 Thread Aniket Bhatnagar
Alright, I found the issue: I wasn't setting the fs.s3.buffer.dir configuration. Here is the final Spark conf snippet that works: spark.hadoop.fs.s3n.impl: com.amazon.ws.emr.hadoop.fs.EmrFileSystem, spark.hadoop.fs.s3.impl: com.amazon.ws.emr.hadoop.fs.EmrFileSystem, spark.hadoop.fs.s3bfs.impl:
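
For anyone skimming the archive, here is a rough sketch of how the settings named above might be collected in one place. It covers only the keys visible in the excerpt; the message is cut off after spark.hadoop.fs.s3bfs.impl, so that entry is left out. The spark.hadoop.fs.s3.buffer.dir key (fs.s3.buffer.dir behind Spark's spark.hadoop.* prefix), the helper name emr_s3_conf, and the /tmp/s3-buffer path are assumptions for illustration, not values taken from the thread.

```python
# Class name as quoted in the message.
EMR_FS = "com.amazon.ws.emr.hadoop.fs.EmrFileSystem"


def emr_s3_conf(buffer_dir):
    """Return (key, value) pairs for the settings visible in the excerpt.

    Hypothetical helper: the name and the buffer_dir argument are
    illustrative only.
    """
    return [
        ("spark.hadoop.fs.s3n.impl", EMR_FS),
        ("spark.hadoop.fs.s3.impl", EMR_FS),
        # Assumption: fs.s3.buffer.dir is passed through the spark.hadoop.*
        # prefix like the other keys; the thread does not show its value.
        ("spark.hadoop.fs.s3.buffer.dir", buffer_dir),
    ]


# "/tmp/s3-buffer" is a placeholder path, not one from the thread.
pairs = emr_s3_conf("/tmp/s3-buffer")
for key, value in pairs:
    print(f"{key}={value}")
```

In PySpark, a list of pairs like this could presumably be handed to SparkConf().setAll(pairs); the sketch avoids importing pyspark so that it runs anywhere.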

Re: Programmatic Spark 1.2.0 on EMR | S3 filesystem is not working when using

2015-01-30 Thread Sven Krasser
From your stack trace, it appears that the S3 writer first tries to write the data to a temp file on the local file system. My guess is that the local directory doesn't exist or you don't have permissions for it. -Sven On Fri, Jan 30, 2015 at 6:44 AM, Aniket Bhatnagar aniket.bhatna...@gmail.com
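
One quick way to test that guess is to check the local directory directly. The following is a minimal sketch, assuming you already know which directory the writer uses for its temp files; the function name check_buffer_dir and the path in the usage line are hypothetical.

```python
import os


def check_buffer_dir(path):
    """Report whether `path` is usable as a local temp directory.

    Returns "missing" if the directory does not exist, "not writable" if
    the current user cannot create files in it, and "ok" otherwise.
    """
    if not os.path.isdir(path):
        return "missing"
    # Creating files in a directory needs both write and execute permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable"
    return "ok"


# Placeholder path; substitute the directory from your own configuration.
print(check_buffer_dir("/tmp/s3-buffer"))
```

The result depends on the user running the check, so it is worth running it as the same user that executes the failing process rather than as root, for which os.access reports write access almost everywhere.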

Re: Programmatic Spark 1.2.0 on EMR | S3 filesystem is not working when using

2015-01-30 Thread Aniket Bhatnagar
Right, which makes me believe that the directory is configured somewhere and I have missed configuring it. The process that submits jobs (and effectively becomes the driver) runs under sudo, and the executors are run by YARN. The hadoop username is configured as 'hadoop'