If your cluster is a dedicated Spark cluster (running only Spark jobs, no
other workloads such as Hive, Pig, or MapReduce), then Spark standalone
would be fine. Otherwise I think YARN would be a better option.
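
To make that rule of thumb concrete, here is a minimal sketch of how the
two choices differ at submit time. The helper name spark_submit_args, the
host name master-host, and the jar name app.jar are placeholders I made up
for illustration; the yarn-cluster master value is the one from your
question, and 7077 is the default port of a standalone master.

```python
# Hypothetical helper (not part of Spark): builds a spark-submit command
# line for either deployment option discussed in this thread.
def spark_submit_args(app_jar, dedicated_spark_cluster, standalone_host="master-host"):
    if dedicated_spark_cluster:
        # Standalone mode: point at the standalone master.
        # 7077 is the default standalone master port; the host is a placeholder.
        master = "spark://%s:7077" % standalone_host
    else:
        # Shared Hadoop cluster: let YARN schedule the job, using the
        # master value quoted in the original question.
        master = "yarn-cluster"
    return ["spark-submit", "--master", master, app_jar]


if __name__ == "__main__":
    # Shared cluster that also runs Hive/Pig/MapReduce: submit through YARN.
    print(" ".join(spark_submit_args("app.jar", dedicated_spark_cluster=False)))
    # Dedicated Spark cluster: submit to the standalone master.
    print(" ".join(spark_submit_args("app.jar", dedicated_spark_cluster=True)))
```

Either way the application code stays the same; only the master passed at
submit time changes.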

On Fri, Nov 27, 2015 at 3:36 PM, cs user <acldstk...@gmail.com> wrote:

> Hi All,
>
> Apologies if this question has been asked before. I'd like to know if
> there are any downsides to running Spark on YARN with the --master
> yarn-cluster option vs. having a separate Spark standalone cluster to
> execute jobs?
>
> We're looking at installing an HDFS/Hadoop cluster with Ambari and
> submitting jobs to the cluster using YARN, or having an Ambari cluster and
> a separate standalone Spark cluster, which will run the Spark jobs on data
> within HDFS.
>
> With YARN, will we still get all the benefits of Spark?
>
> Will it be possible to process streaming data?
>
> Many thanks in advance for any responses.
>
> Cheers!
>



-- 
Best Regards

Jeff Zhang
