Re: Issues while running MLlib matrix factorization ALS algorithm

Roshani Nagmote Fri, 16 Sep 2016 11:14:58 -0700

Hello,

Thanks for your reply.

Yes, it's the Netflix dataset. And when I get the "No space left on device" 
error, my ‘/mnt’ directory is full. I checked. 

/usr/lib/spark/bin/spark-submit --deploy-mode cluster --master yarn \
  --class org.apache.spark.examples.mllib.MovieLensALS \
  --jars /usr/lib/spark/examples/jars/scopt_2.11-3.3.0.jar \
  /usr/lib/spark/examples/jars/spark-examples_2.11-2.0.0.jar \
  --rank 32 --numIterations 100 --kryo s3://dataset_netflix

When I run the above command, I get the following error:

Job aborted due to stage failure: Task 221 in stage 53.0 failed 4 times, most 
recent failure: Lost task 221.3 in stage 53.0 (TID 9817, ): 
java.io.FileNotFoundException: 
/mnt/yarn/usercache/hadoop/appcache/application_1473786456609_0042/blockmgr-045c2dec-7765-4954-9c9a-c7452f7bd3b7/08/shuffle_168_221_0.data.b17d39a6-4d3c-4198-9e25-e19ca2b4d368
 (No space left on device)
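
The failing path is under /mnt, which matches the directory that fills up. 
A minimal sketch of how the usage can be inspected on a node while the job 
runs, assuming GNU coreutils (for `sort -h`) and that the YARN application 
cache lives under /mnt/yarn/usercache/hadoop/appcache as in the path above; 
`report_usage` is only a hypothetical helper name, not part of Spark or YARN:

```shell
# Print overall usage of the filesystem holding a directory, then its
# five largest immediate children (e.g. one entry per YARN application).
report_usage() {
    dir="$1"
    # Free and used space on the filesystem that contains $dir.
    df -h "$dir"
    # Size of each entry directly under $dir, largest first.
    # Errors (unreadable or vanished files) are ignored.
    du -sh "$dir"/* 2>/dev/null | sort -rh | head -n 5
}

# Example usage (paths assumed from the error message):
#   report_usage /mnt
#   report_usage /mnt/yarn/usercache/hadoop/appcache
```

Running it a few times during the job should show whether the blockmgr-* 
shuffle directories keep growing across iterations.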

I don't think I should need to increase the disk space, as the data is not 
that big. Is there any way I can set parameters so that the job does not use 
so much disk space? I don’t know much about tuning parameters. 

It would be great if anyone could help me with this.

Thanks,
Roshani

> On Sep 16, 2016, at 9:18 AM, Sean Owen <so...@cloudera.com> wrote:
> 
>
