1. I have about 1 million users against a few thousand products, with around a million ratings in total.
2. Spark 1.6 on Amazon EMR.
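
For reference, the ID encoding step mentioned in the original message below can be sketched roughly like this. This is only a plain-Python illustration of the idea, not the actual job: the helper `encode_ids` and the sample ratings are hypothetical, and in Spark the same mapping would be built on RDDs (for example with zipWithIndex) rather than with in-memory dictionaries.

```python
def encode_ids(raw_ids):
    """Map each distinct raw ID to a consecutive integer,
    numbered in order of first appearance (hypothetical helper)."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping


# Hypothetical sample data: (user, item, rating) triples.
raw_ratings = [
    ("alice", "book-1", 4.0),
    ("bob", "book-2", 3.5),
    ("alice", "book-2", 5.0),
]

# Users and items get separate, independent index spaces.
user_index = encode_ids(u for u, _, _ in raw_ratings)
item_index = encode_ids(i for _, i, _ in raw_ratings)

# Integer-encoded triples, the shape that is then handed to ALS.
encoded = [(user_index[u], item_index[i], r) for u, i, r in raw_ratings]
print(encoded)
```

Keeping the two mappings around makes it possible to translate the integer IDs in the recommendations back to the original user and item identifiers.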

On Fri, Mar 11, 2016 at 12:46 PM, Nick Pentreath <nick.pentre...@gmail.com> wrote:

> Could you provide more details about:
> 1. Data set size (# ratings, # users and # products)
> 2. Spark cluster set up and version
>
> Thanks
>
> On Fri, 11 Mar 2016 at 05:53 Deepak Gopalakrishnan <dgk...@gmail.com> wrote:
>
>> Hello All,
>>
>> I've been running Spark's ALS on a dataset of users and rated items. I
>> first encode my users to integers by using an auto increment function
>> (just like zipWithIndex), I do the same for my items. I then create an RDD
>> of the ratings and feed it to ALS.
>>
>> My issue is that the ALS algorithm never completes. Attached is a
>> screenshot of the stages window.
>>
>> Any help will be greatly appreciated

--
Regards,
*Deepak Gopalakrishnan*
*Mobile*: +918891509774
*Skype*: deepakgk87
http://myexps.blogspot.com