Hi Robert,

If you're running Spark against YARN, you don't need to install anything
Spark-specific on the nodes.  For each application, the client copies the
Spark jar up to HDFS, where the Spark processes can fetch it.  For faster
app startup, you can copy the Spark jar to a world-readable location on
HDFS once and point your applications at that path.  The YARN NodeManagers
will cache it on each node after it's been fetched the first time, so
subsequent applications skip the upload.
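
For example, something like this (just a sketch: the jar name, HDFS path,
and application class below are placeholders, and spark.yarn.jar is the
property recent Spark-on-YARN builds read; older builds used the SPARK_JAR
environment variable for the same purpose):

    # Upload the Spark assembly to a world-readable HDFS location, once
    hdfs dfs -mkdir -p /user/spark/share/lib
    hdfs dfs -put spark-assembly.jar /user/spark/share/lib/spark-assembly.jar

    # Point each application at it when submitting...
    spark-submit --master yarn-cluster \
        --conf spark.yarn.jar=hdfs:///user/spark/share/lib/spark-assembly.jar \
        --class com.example.MyApp my-app.jar

    # ...or set it once for everyone in conf/spark-defaults.conf:
    # spark.yarn.jar  hdfs:///user/spark/share/lib/spark-assembly.jar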

-Sandy


On Tue, Jul 8, 2014 at 6:24 PM, Robert James <srobertja...@gmail.com> wrote:

> I have a Spark app which runs well with a local master.  I'm now ready to
> put it on a cluster.  What needs to be installed on the master? What
> needs to be installed on the workers?
>
> If the cluster already has Hadoop or YARN or Cloudera, does it still
> need an install of Spark?
>
