Re: Issues while running MLlib matrix factorization ALS algorithm

Roshani Nagmote Mon, 19 Sep 2016 11:01:35 -0700

Hello Sean,

Can you please tell me how to set the checkpoint interval? I did set
checkpointDir("hdfs:/"), but how do I reduce the checkpoint interval
from its default value of 10?


Sorry if this is a very basic question. I am a novice in Spark.

Thanks,
Roshani
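
For readers of this thread, here is a minimal sketch of one way the
checkpoint interval might be lowered with the RDD-based MLlib API. It assumes
you build the ALS instance yourself (as the MovieLensALS example does
internally) rather than passing a command-line flag. The object name, the
checkpoint path, and the tiny in-memory ratings set are hypothetical
placeholders; the interval value 5 is only an illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Hypothetical driver object, for illustration only.
object ALSCheckpointSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("ALSCheckpointSketch"))

    // Assumed checkpoint location; the interval below only takes effect
    // when a checkpoint directory has been set on the SparkContext.
    sc.setCheckpointDir("hdfs:///tmp/als-checkpoints")

    // Tiny stand-in for the real ratings RDD (user, product, rating).
    val ratings = sc.parallelize(Seq(
      Rating(1, 10, 5.0), Rating(1, 20, 1.0),
      Rating(2, 10, 4.0), Rating(2, 30, 2.0)
    ))

    val model = new ALS()
      .setRank(32)
      .setIterations(100)
      .setCheckpointInterval(5) // checkpoint every 5 iterations instead of the default 10
      .run(ratings)

    println(s"Trained a model of rank ${model.rank}")
    sc.stop()
  }
}
```

Checkpointing more often truncates the RDD lineage sooner, which may let
Spark clean up older shuffle files earlier. Whether that is enough to keep
/mnt from filling up depends on the cluster and the data.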

On Fri, Sep 16, 2016 at 11:14 AM, Roshani Nagmote <roshaninagmo...@gmail.com> wrote:

> Hello,
>
> Thanks for your reply.
>
> Yes, it's the Netflix dataset. When I get the "no space left on device"
> error, my ‘/mnt’ directory is full. I checked.
>
> /usr/lib/spark/bin/spark-submit --deploy-mode cluster --master yarn
> --class org.apache.spark.examples.mllib.MovieLensALS --jars
> /usr/lib/spark/examples/jars/scopt_2.11-3.3.0.jar
> /usr/lib/spark/examples/jars/spark-examples_2.11-2.0.0.jar --rank 32
> --numIterations 100 --kryo s3://dataset_netflix
>
> When I run the above command, I get the following error:
>
> Job aborted due to stage failure: Task 221 in stage 53.0 failed 4 times,
> most recent failure: Lost task 221.3 in stage 53.0 (TID 9817, ):
> java.io.FileNotFoundException: /mnt/yarn/usercache/hadoop/
> appcache/application_1473786456609_0042/blockmgr-045c2dec-7765-4954-9c9a-
> c7452f7bd3b7/08/shuffle_168_221_0.data.b17d39a6-4d3c-4198-9e25-e19ca2b4d368
> (No space left on device)
>
> I don't think I should need to increase the disk space, as the data is not
> that big. Is there any way I can set parameters so that the job does not
> use so much disk space? I don’t know much about tuning parameters.
>
> It would be great if anyone could help me with this.
>
> Thanks,
> Roshani
>
> On Sep 16, 2016, at 9:18 AM, Sean Owen <so...@cloudera.com> wrote:
