Re: Running ALS on comparatively large RDD

2016-03-11 Thread Deepak Gopalakrishnan
Executor memory: 45g x 4 executors, 1 driver with 45g memory. The data source is S3, and I have logs that tell me the Rating objects are loaded fine.

On Fri, Mar 11, 2016 at 2:13 PM, Nick Pentreath wrote:
> Hmmm, something else is going on there. What data source are

Re: Running ALS on comparatively large RDD

2016-03-11 Thread Nick Pentreath
Hmmm, something else is going on there. What data source are you reading from? How much driver and executor memory have you provided to Spark?

On Fri, 11 Mar 2016 at 09:21 Deepak Gopalakrishnan wrote:
> 1. I'm using about 1 million users against few thousand products. I
>

Re: Running ALS on comparatively large RDD

2016-03-10 Thread Deepak Gopalakrishnan
1. I'm using about 1 million users against a few thousand products. I basically have around a million ratings.
2. Spark 1.6 on Amazon EMR

On Fri, Mar 11, 2016 at 12:46 PM, Nick Pentreath wrote:
> Could you provide more details about:
> 1. Data set size (# ratings, # users

Re: Running ALS on comparatively large RDD

2016-03-10 Thread Nick Pentreath
Could you provide more details about:
1. Data set size (# ratings, # users and # products)
2. Spark cluster set up and version

Thanks

On Fri, 11 Mar 2016 at 05:53 Deepak Gopalakrishnan wrote:
> Hello All,
>
> I've been running Spark's ALS on a dataset of users and rated
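
Editorial note: a rough size estimate for the figures quoted in this thread (about a million ratings, about a million users, a few thousand products) may help explain why the 45g executors look ample. The sketch below is a back-of-envelope calculation only, not a measurement. The product count of 5,000, the ALS rank of 10, the per-record byte sizes, and the function name `estimate_bytes` are all assumptions chosen for illustration; none of them come from the thread.

```python
# Back-of-envelope estimate of raw data sizes for the ALS job described above.
# All constants are illustrative assumptions, not values from the thread,
# except the rough counts of ratings and users.

NUM_RATINGS = 1_000_000   # "around a million ratings" (from the thread)
NUM_USERS = 1_000_000     # "about 1 million users" (from the thread)
NUM_PRODUCTS = 5_000      # assumed stand-in for "a few thousand products"
RANK = 10                 # assumed ALS rank; the thread does not state it

BYTES_PER_RATING = 4 + 4 + 8   # assumed: int user id, int product id, double rating
BYTES_PER_FACTOR = 8           # assumed: one double per latent factor


def estimate_bytes(num_ratings, num_users, num_products, rank):
    """Return (ratings_bytes, factor_bytes), ignoring JVM object overhead."""
    ratings_bytes = num_ratings * BYTES_PER_RATING
    factor_bytes = (num_users + num_products) * rank * BYTES_PER_FACTOR
    return ratings_bytes, factor_bytes


ratings_bytes, factor_bytes = estimate_bytes(
    NUM_RATINGS, NUM_USERS, NUM_PRODUCTS, RANK
)
total_mb = (ratings_bytes + factor_bytes) / 1e6
print(f"ratings: {ratings_bytes / 1e6:.1f} MB")
print(f"factors: {factor_bytes / 1e6:.1f} MB")
print(f"total (raw, no JVM overhead): {total_mb:.1f} MB")
```

Under these assumptions the raw payload is on the order of a hundred megabytes. Even with a generous multiplier for JVM object overhead and shuffle buffers, that stays far below 45g per executor, which fits Nick's remark that something else is going on.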