Re: Spark on AWS

2016-04-28 Thread Fatma Ozcan
involved are. But it required lots of tuning work, because we are
> clearly under the recommended requirements. 4 of the 5 machines are
> switched off during the night, only the bridge machine is alive 24/7.
>
> 12$ per month in total.
>
> Renato Perini.
>
>
> Il 28/04/201

Spark on AWS

2016-04-28 Thread Fatma Ozcan
What is your experience using Spark on AWS? Are you setting up your own Spark cluster and using HDFS? Or are you using Spark as a service from AWS? In the latter case, what is your experience of using S3 directly, without HDFS in between? Thanks, Fatma

SparkML pipelines and error recovery

2015-09-18 Thread Fatma Ozcan
I am trying to understand how Spark ML pipelines behave when failures occur. If I have multiple transformers and one of them fails, will the lineage-based recovery of RDDs kick in automatically? Thanks, Fatma

Querying JSON in Spark SQL

2015-03-16 Thread Fatma Ozcan
Is there any documentation that explains how to query JSON documents using Spark SQL? Thanks, Fatma