Alright, I found the issue: I wasn't setting the fs.s3.buffer.dir
configuration. Here is the final Spark conf snippet that works:
spark.hadoop.fs.s3n.impl: com.amazon.ws.emr.hadoop.fs.EmrFileSystem,
spark.hadoop.fs.s3.impl: com.amazon.ws.emr.hadoop.fs.EmrFileSystem,
spark.hadoop.fs.s3bfs.impl:
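For reference, the fs.s3.buffer.dir entry mentioned above would sit alongside these keys. The path shown is only an illustrative assumption; any local directory that exists on every node and is writable by the executors should work:

```
# Assumed path, not the EMR default; must exist and be writable on each node.
spark.hadoop.fs.s3.buffer.dir: /mnt/s3-buffer
```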
From your stacktrace, it appears that the S3 writer tries to write the data
to a temp file on the local file system first. My guess is that the local
directory either doesn't exist or you don't have write permission for it.
-Sven
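A quick way to sanity-check that guess is to test whether the buffer directory exists and is writable by the user the executors run as. The path below is a stand-in, not the actual configured value:

```shell
# Stand-in for whatever fs.s3.buffer.dir points at on your nodes.
BUFFER_DIR="${TMPDIR:-/tmp}/s3-buffer-check"

# Create it if missing, then confirm the current user can write to it.
mkdir -p "$BUFFER_DIR"
if [ -w "$BUFFER_DIR" ]; then
  echo "writable"
else
  echo "not writable"
fi
```

On the cluster you would run this as the executor user, e.g. via `sudo -u hadoop`.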
On Fri, Jan 30, 2015 at 6:44 AM, Aniket Bhatnagar
<aniket.bhatna...@gmail.com> wrote:
Right. Which makes me believe that the directory is perhaps configured
somewhere and I have missed configuring it. The process that submits jobs
(and thus becomes the driver) is running under sudo, and the executors are
launched by YARN. The Hadoop username is configured as 'hadoop'.
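For completeness: when the submitting process runs as a different OS user, the Hadoop client identity can usually be overridden with the HADOOP_USER_NAME environment variable (this applies to simple authentication only, not Kerberos-secured clusters). A minimal illustration:

```shell
# Force Hadoop client libraries to act as the 'hadoop' user;
# honored by UserGroupInformation under simple authentication.
export HADOOP_USER_NAME=hadoop
echo "$HADOOP_USER_NAME"
```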