Some computations are done in-core in the front end (the driver). How much
is always method-specific. Beyond what the method itself needs, there are no
additional requirements on top of Spark's own requirements. However, since
many ML methods tend to be more iterative than regular ETL jobs, expect a
higher memory demand as well, to leave room for Spark's lineage graphs.
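
To make the lineage point concrete, here is a toy sketch in plain Python. It
is not Spark's API: the Dataset class, its map/checkpoint methods, and the
iterate helper are all hypothetical names made up for illustration. It only
shows why a loop that keeps deriving a new dataset from the previous one
makes the chain of recorded steps grow with the iteration count, and how
cutting the chain periodically keeps it bounded.

```python
# Toy model only: NOT Spark's API. All names here are hypothetical.
class Dataset:
    def __init__(self, parent=None, op="source"):
        self.parent = parent  # the dataset this one was derived from
        self.op = op          # label of the step that produced it

    def map(self, op):
        # Each transformation records its parent, extending the lineage chain.
        return Dataset(parent=self, op=op)

    def checkpoint(self):
        # Assumption: a checkpoint drops the reference to earlier steps.
        return Dataset(parent=None, op="checkpoint")

    def lineage_depth(self):
        # Count the nodes from this dataset back to its root.
        depth, node = 1, self
        while node.parent is not None:
            depth += 1
            node = node.parent
        return depth


def iterate(steps, checkpoint_every=None):
    """Run `steps` chained transformations and return the final lineage depth."""
    data = Dataset()
    for i in range(1, steps + 1):
        data = data.map(f"step-{i}")
        if checkpoint_every and i % checkpoint_every == 0:
            data = data.checkpoint()
    return data.lineage_depth()


if __name__ == "__main__":
    print(iterate(100))                       # chain grows with every step
    print(iterate(100, checkpoint_every=10))  # chain is cut every 10 steps
```

In this toy model the bookkeeping that has to be held grows linearly with
the number of iterations unless the chain is truncated. That is the kind of
growth an iterative method can cause and a one-pass ETL job usually does not.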
On Jan 22, 2015 8:07 AM, "Pasmanik, Paul" <paul.pasma...@danteinc.com>
wrote:

> I was able to get Spark and Mahout installed on an EMR cluster as bootstrap
> actions and was able to run the spark-itemsimilarity job via an EMR step
> with some modifications to the mahout script (defining SPARK_HOME and making
> sure CLASSPATH is not picked up from the invoking script, which is Amazon's
> script-runner).
>
> I was only able to run this job using yarn-client (yarn-master is not able
> to submit to resource manager).
>
> In yarn-client mode the driver program runs in the client process and
> submits jobs to executors via yarn manager, so my question is how much
> memory does this driver need?
> Will the memory requirement vary based on the size of the input to
> spark-itemsimilarity?
>
> Thanks.
>
>
> -----Original Message-----
> From: Pasmanik, Paul [mailto:paul.pasma...@danteinc.com]
> Sent: Thursday, January 15, 2015 12:46 PM
> To: user@mahout.apache.org
> Subject: mahout 1.0 on EMR with spark
>
> Has anyone tried running mahout 1.0 on EMR with Spark?
> I've used the instructions at
> https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark to get
> an EMR cluster running Spark. I am now able to deploy an EMR cluster with
> Spark using the AWS Java APIs.
> EMR allows running a custom script as a bootstrap action, which I can use
> to install Mahout.
> What I am trying to figure out is whether I would need to build Mahout
> every time I start an EMR cluster, or whether I can have pre-built
> artifacts and develop a script similar to the one awslabs uses to install
> Spark?
>
> Thanks.
