Oh, this is the Netflix dataset, right? I recognize it from the number
of users/items. It's not fast on a laptop and takes plenty of memory,
but it succeeds. I haven't run it recently, but it worked in Spark 1.x.

On Fri, Sep 16, 2016 at 5:13 PM, Roshani Nagmote
<roshaninagmo...@gmail.com> wrote:
> I am also surprised that I face these problems with a fairly small dataset on
> 14 M4.2xlarge machines. Could you please let me know on which dataset you can
> run 100 iterations of rank 30 on your laptop?
>
> I am currently just trying to run the default example code shipped with Spark
> to run ALS on the MovieLens dataset. I did not change anything in the code.
> However, I am running this example on the Netflix dataset (1.5 GB).
>
> Thanks,
> Roshani
>
>
> On Friday, September 16, 2016, Sean Owen <so...@cloudera.com> wrote:
>>
>> You may have to decrease the checkpoint interval to, say, 5 if you're
>> getting StackOverflowError. A particularly deep lineage may be building
>> up across the iterations.
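>>
>> A rough sketch of what I mean with the MLlib ALS builder (the checkpoint
>> directory and the "::" parsing are placeholders; adjust them to your setup):
>>
>>   import org.apache.spark.mllib.recommendation.{ALS, Rating}
>>
>>   // sc is the SparkContext (e.g. the one predefined in spark-shell).
>>   // Checkpointing truncates the lineage that grows with every ALS iteration.
>>   sc.setCheckpointDir("hdfs:///tmp/als-checkpoints")  // any path the cluster can write to
>>
>>   // Assumes MovieLens-style "user::item::rating" lines.
>>   val ratings = sc.textFile("s3://dataset/input_dataset").map { line =>
>>     val f = line.split("::")
>>     Rating(f(0).toInt, f(1).toInt, f(2).toDouble)
>>   }
>>
>>   val model = new ALS()
>>     .setRank(30)
>>     .setIterations(100)
>>     .setCheckpointInterval(5)  // checkpoint every 5 iterations instead of the default 10
>>     .run(ratings)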
>>
>> "No space left on device" means you don't have enough local disk to
>> accommodate the big shuffles in some stage. You can add more disk, or
>> look at tuning shuffle params to do more in memory and avoid spilling
>> to disk as much.
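>>
>> As a sketch of the kind of knobs I mean (values and directories are
>> illustrative; on YARN the NodeManager's local dirs usually override
>> spark.local.dir, and the same keys can be passed as --conf to spark-submit):
>>
>>   import org.apache.spark.SparkConf
>>
>>   val conf = new SparkConf()
>>     // Spread shuffle spill files across the larger instance-store volumes.
>>     .set("spark.local.dir", "/mnt/spark,/mnt1/spark")
>>     // Keep shuffle output and spills compressed so they use less local disk
>>     // (both default to true; shown here for completeness).
>>     .set("spark.shuffle.compress", "true")
>>     .set("spark.shuffle.spill.compress", "true")
>>     // More executor memory means less spilling in the first place.
>>     .set("spark.executor.memory", "20g")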
>>
>> However, given the small data size, I'm surprised that you see either
>> problem.
>>
>> 10-20 iterations is usually where the model stops improving much anyway.
>>
>> I can run 100 iterations of rank 30 on my *laptop* so something is
>> fairly wrong in your setup or maybe in other parts of your user code.
>>
>> On Thu, Sep 15, 2016 at 10:00 PM, Roshani Nagmote
>> <roshaninagmo...@gmail.com> wrote:
>> > Hi,
>> >
>> > I need help to run matrix factorization ALS algorithm in Spark MLlib.
>> >
>> > I am using a dataset (1.5 GB) with 480189 users and 17770 items, formatted
>> > in the same way as the MovieLens dataset.
>> > I am trying to run the MovieLensALS example jar on this dataset on an AWS
>> > Spark EMR cluster with 14 M4.2xlarge slaves.
>> >
>> > Command run:
>> > /usr/lib/spark/bin/spark-submit --deploy-mode cluster --master yarn \
>> >   --class org.apache.spark.examples.mllib.MovieLensALS \
>> >   --jars /usr/lib/spark/examples/jars/scopt_2.11-3.3.0.jar \
>> >   /usr/lib/spark/examples/jars/spark-examples_2.11-2.0.0.jar \
>> >   --rank 32 --numIterations 50 --kryo s3://dataset/input_dataset
>> >
>> > Issues I get:
>> > If I increase the rank to 70 or more and numIterations to 15 or more, I get
>> > the following errors:
>> > 1) StackOverflowError
>> > 2) No space left on device (during the shuffle phase)
>> >
>> > Could you please let me know if there are any parameters I should tune to
>> > make this algorithm work on this dataset?
>> >
>> > For a better RMSE, I want to increase the number of iterations. Am I missing
>> > something very trivial? Could anyone help me run this algorithm on this
>> > specific dataset with more iterations?
>> >
>> > Was anyone able to run ALS on Spark with more than 100 iterations and a rank
>> > of more than 30?
>> >
>> > Any help will be greatly appreciated.
>> >
>> > Thanks and Regards,
>> > Roshani
