Oh, specifically to item similarity. Not sure.

On Jan 22, 2015 8:42 AM, "Dmitriy Lyubimov" <[email protected]> wrote:
> There are some computations that are done in-core in the front end. This is
> always method specific. Outside the method itself, there are no additional
> requirements on top of Spark requirements. However, since many ML methods
> tend to be more iterative than your regular ETL stuff, expect also higher
> demand for room for Spark graph lineages.
>
> On Jan 22, 2015 8:07 AM, "Pasmanik, Paul" <[email protected]>
> wrote:
>
>> I was able to get Spark and Mahout installed on an EMR cluster as bootstrap
>> actions and was able to run a spark-itemsimilarity job via an EMR step with
>> some modifications to the mahout script (defining SPARK_HOME and making sure
>> CLASSPATH is not picked up from the invoking script, which is Amazon's
>> script-runner).
>>
>> I was only able to run this job using yarn-client (yarn-master is not
>> able to submit to the resource manager).
>>
>> In yarn-client mode the driver program runs in the client process and
>> submits jobs to executors via the YARN manager, so my question is: how much
>> memory does this driver need?
>> Will the memory requirement vary based on the size of the input to
>> spark-itemsimilarity?
>>
>> Thanks.
>>
>>
>> -----Original Message-----
>> From: Pasmanik, Paul [mailto:[email protected]]
>> Sent: Thursday, January 15, 2015 12:46 PM
>> To: [email protected]
>> Subject: mahout 1.0 on EMR with spark
>>
>> Has anyone tried running Mahout 1.0 on EMR with Spark?
>> I've used the instructions at
>> https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark to
>> get an EMR cluster running Spark. I am now able to deploy an EMR cluster with
>> Spark using the AWS Java APIs.
>> EMR allows running a custom script as a bootstrap action, which I can use to
>> install Mahout.
>> What I am trying to figure out is whether I would need to build Mahout
>> every time I start an EMR cluster, or have pre-built artifacts and develop a
>> script similar to what awslabs is using to install Spark?
>>
>> Thanks.
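For anyone trying to reproduce the setup described in the quoted message, here is a minimal sketch of the kind of wrapper it implies: define SPARK_HOME and keep a CLASSPATH inherited from the invoking process out of the mahout script. The install paths, the example bucket, and the function name `itemsimilarity_cmd` are hypothetical and do not come from the thread. The sketch only prints the command, so it can be inspected before being wired into an EMR step.

```shell
# Hypothetical install locations; the thread does not say where the
# bootstrap actions put Spark and Mahout.
export SPARK_HOME="${SPARK_HOME:-/home/hadoop/spark}"
export MAHOUT_HOME="${MAHOUT_HOME:-/home/hadoop/mahout}"

# Do not inherit a CLASSPATH from the invoking process (for example
# Amazon's script-runner), so the mahout script builds its own.
unset CLASSPATH

# Build the command line as a string so it can be checked before anything runs.
# $1 = input path, $2 = output path (placeholders supplied by the caller).
itemsimilarity_cmd() {
  echo "$MAHOUT_HOME/bin/mahout spark-itemsimilarity --master yarn-client --input $1 --output $2"
}

# Dry run: print the command rather than executing it.
itemsimilarity_cmd "s3://example-bucket/input" "s3://example-bucket/output"
```

This does not address how much memory the driver needs; per the reply above, that depends on the method being run.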
