Re: Spark, S3A, and 503 SlowDown / rate limit issues

2017-07-12 Thread Steve Loughran
On 10 Jul 2017, at 21:57, Everett Anderson wrote: Hey, Thanks for the responses, guys! On Thu, Jul 6, 2017 at 7:08 AM, Steve Loughran wrote: On 5 Jul 2017, at 14:40, Vadim Semenov

Re: Spark, S3A, and 503 SlowDown / rate limit issues

2017-07-10 Thread Everett Anderson
Hey, Thanks for the responses, guys! On Thu, Jul 6, 2017 at 7:08 AM, Steve Loughran wrote: > On 5 Jul 2017, at 14:40, Vadim Semenov wrote: >> Are you sure that you use S3A? >> Because EMR says that they do not support S3A

Re: Spark, S3A, and 503 SlowDown / rate limit issues

2017-07-06 Thread Steve Loughran
On 5 Jul 2017, at 14:40, Vadim Semenov wrote: Are you sure that you use S3A? Because EMR says that they do not support S3A https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/ > Amazon EMR does not

Re: Spark, S3A, and 503 SlowDown / rate limit issues

2017-07-05 Thread Vadim Semenov
Are you sure that you use S3A? Because EMR says that they do not support S3A https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/ > Amazon EMR does not currently support use of the Apache Hadoop S3A file system. I think that the HEAD requests come from the
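Vadim's distinction matters because on EMR the s3:// scheme is served by Amazon's EMRFS client, while S3A only handles paths addressed as s3a://. A minimal sketch of the Hadoop-side binding that makes s3a:// resolve to the S3A client (the property name and implementation class are the standard Hadoop 2.7 ones; whether your EMR image ships the matching hadoop-aws jar is an assumption you would need to verify):

```xml
<!-- core-site.xml (sketch): bind the s3a:// URI scheme to the Hadoop
     S3A filesystem. On EMR, s3:// remains mapped to EMRFS regardless
     of this setting; only s3a:// paths go through S3A. -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
```

With that binding in place, a job reading `s3://bucket/...` still uses EMRFS, so to confirm S3A is actually in play you would check that the paths your jobs use really carry the `s3a://` scheme.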

Spark, S3A, and 503 SlowDown / rate limit issues

2017-06-29 Thread Everett Anderson
Hi, We're using Spark 2.0.2 + Hadoop 2.7.3 on AWS EMR with S3A for direct I/O from/to S3 from our Spark jobs. We set mapreduce.fileoutputcommitter.algorithm.version=2 and are using encrypted S3 buckets. This has been working fine for us, but perhaps as we've been running more jobs in parallel,
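The setup Everett describes (S3A for direct S3 I/O, the v2 file output committer, encrypted buckets) can be sketched as a Spark properties fragment. The `spark.hadoop.` prefix for passing Hadoop properties through Spark and the `mapreduce.fileoutputcommitter.algorithm.version` key are standard; the encryption setting assumes SSE-S3 (AES256), which the thread does not actually specify, and the bucket name in the usage note is hypothetical:

```properties
# spark-defaults.conf (sketch)

# "Version 2" file output committer, as mentioned in the thread:
# commits task output by renaming directly into the destination,
# avoiding the expensive serial rename at job commit (at the cost of
# visible partial output if a job fails mid-commit).
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version  2

# Server-side encryption for the encrypted buckets. AES256 = SSE-S3;
# this is an assumption -- the thread only says "encrypted S3 buckets".
spark.hadoop.fs.s3a.server-side-encryption-algorithm          AES256
```

Jobs would then read and write with explicit s3a:// paths, e.g. `spark.read.parquet("s3a://some-bucket/input/")`, so that the S3A client (rather than EMRFS) handles the requests that are hitting the 503 SlowDown responses.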