Fatma, the easiest way to create a Spark cluster on AWS is to create an EMR
cluster and select the Spark application (the latest EMR release includes
Spark 1.6.1).

Spark works well with S3 for both reads and writes. However, it's
recommended to set spark.speculation to true: some tasks are expected to
fail when you read a large S3 folder, so speculation should help.
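
One way to pass that setting is through spark-submit's --conf option. Below
is a minimal sketch that only assembles the command line; the helper name
build_submit_args and the script name my_job.py are made up for
illustration, so adjust them to your own job.

```python
# Sketch: build a spark-submit command line that enables speculation.
# build_submit_args and my_job.py are hypothetical names, not part of Spark.

def build_submit_args(app, speculation=True):
    """Return the spark-submit command as a list of arguments."""
    conf = {"spark.speculation": "true" if speculation else "false"}
    args = ["spark-submit"]
    # Each Spark property is passed as: --conf key=value
    for key, value in sorted(conf.items()):
        args += ["--conf", "{}={}".format(key, value)]
    args.append(app)
    return args


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(" ".join(build_submit_args("my_job.py")))
```

The same property could instead go into spark-defaults.conf or be set on a
SparkConf object in the job itself; the command-line form is used here only
because it needs no change to the job code.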



On Thu, Apr 28, 2016 at 2:39 PM, Fatma Ozcan <fatma....@gmail.com> wrote:

> What is your experience using Spark on AWS? Are you setting up your own
> Spark cluster, and using HDFS? Or are you using Spark as a service from
> AWS? In the latter case, what is your experience of using S3 directly,
> without having HDFS in between?
>
> Thanks,
> Fatma
>
