Thanks for the quick responses! I used your final -Dspark.local.dir suggestion, but I see this during the initialization of the application:
14/07/16 06:56:08 INFO storage.DiskBlockManager: Created local directory at /vol/spark-local-20140716065608-7b2a

I would have expected something in /mnt/spark/.

Thanks,
Chris

On Tue, Jul 15, 2014 at 11:44 PM, Chris Gore <cdg...@cdgore.com> wrote:
> Hi Chris,
>
> I've encountered this error when running Spark's ALS methods too. In my
> case, it was because I set spark.local.dir improperly, and every time there
> was a shuffle, it would spill many GB of data onto the local drive. What
> fixed it was setting it to use the /mnt directory, where a network drive is
> mounted. For example, setting an environment variable:
>
> export SPACE=$(mount | grep mnt | awk '{print $3"/spark/"}' | xargs | sed 's/ /,/g')
>
> Then adding -Dspark.local.dir=$SPACE, or simply
> -Dspark.local.dir=/mnt/spark/,/mnt2/spark/, when you run your driver
> application.
>
> Chris
>
> On Jul 15, 2014, at 11:39 PM, Xiangrui Meng <men...@gmail.com> wrote:
>
> > Check the number of inodes (df -i). The assembly build may create many
> > small files. -Xiangrui
> >
> > On Tue, Jul 15, 2014 at 11:35 PM, Chris DuBois <chris.dub...@gmail.com> wrote:
> >> Hi all,
> >>
> >> I am encountering the following error:
> >>
> >> INFO scheduler.TaskSetManager: Loss was due to java.io.IOException: No space
> >> left on device [duplicate 4]
> >>
> >> For each slave, df -h looks roughly like this, which makes the above
> >> error surprising.
> >>
> >> Filesystem      Size  Used Avail Use% Mounted on
> >> /dev/xvda1      7.9G  4.4G  3.5G  57% /
> >> tmpfs           7.4G  4.0K  7.4G   1% /dev/shm
> >> /dev/xvdb        37G  3.3G   32G  10% /mnt
> >> /dev/xvdf        37G  2.0G   34G   6% /mnt2
> >> /dev/xvdv       500G   33M  500G   1% /vol
> >>
> >> I'm on an EC2 cluster (c3.xlarge + 5 x m3) that I launched using the
> >> spark-ec2 scripts and a clone of Spark from today. The job I am running
> >> closely resembles the collaborative filtering example. This issue happens
> >> with the 1M-rating version as well as the 10M-rating version of the
> >> MovieLens dataset.
> >>
> >> I have seen previous questions, but they haven't helped yet. For example,
> >> I tried setting the Spark tmp directory to the EBS volume at /vol/, both
> >> by editing the Spark conf file (and copy-dir'ing it to the slaves) as
> >> well as through the SparkConf. Yet I still get the above error. Here is
> >> my current Spark config below. Note that I'm launching via
> >> ~/spark/bin/spark-submit.
> >>
> >> conf = SparkConf()
> >> conf.setAppName("RecommendALS").set("spark.local.dir",
> >>     "/vol/").set("spark.executor.memory", "7g").set("spark.akka.frameSize",
> >>     "100").setExecutorEnv("SPARK_JAVA_OPTS", " -Dspark.akka.frameSize=100")
> >> sc = SparkContext(conf=conf)
> >>
> >> Thanks for any advice,
> >> Chris
> >>
> >
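[Editor's note] The mount/awk/sed pipeline quoted in Chris Gore's reply can be sanity-checked offline against canned `mount` output; the sketch below does exactly that. The two sample mount lines are assumptions for illustration, not output from the cluster in this thread.

```shell
#!/bin/sh
# Canned stand-in for `mount` output (hypothetical sample lines).
sample_mount_output='/dev/xvdb on /mnt type ext3 (rw)
/dev/xvdf on /mnt2 type ext3 (rw)'

# Same transformation as the thread's `export SPACE=$(mount | ...)`:
# keep lines mentioning "mnt", take field 3 (the mount point), append
# "/spark/", flatten to one space-separated line with xargs, then turn
# the spaces into commas to build a spark.local.dir value.
SPACE=$(printf '%s\n' "$sample_mount_output" \
  | grep mnt \
  | awk '{print $3"/spark/"}' \
  | xargs \
  | sed 's/ /,/g')

echo "$SPACE"   # prints: /mnt/spark/,/mnt2/spark/
```

The comma-separated result is what spark.local.dir expects when spreading shuffle spill across several local disks, e.g. `-Dspark.local.dir=$SPACE`.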