Yes, I broadcast the spark-env.sh file to all worker nodes before running my program, and then execute bin/stop-all.sh and bin/start-all.sh. I have also checked the data2 directory on each worker node, and it also has about 800G available. Thanks!
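One more thing worth ruling out: "No space left on device" can be raised when the filesystem is out of inodes rather than bytes, and a shuffle creates very many small files, so `df -h` can show hundreds of gigabytes free while file creation still fails. A minimal pure-Python check (run it on each worker; the path below is illustrative, substitute your actual spark.local.dir, e.g. /data2/tmp):

```python
import os

def check_fs(path):
    """Report free bytes and free inodes for the filesystem holding `path` (POSIX only)."""
    st = os.statvfs(path)
    free_bytes = st.f_bavail * st.f_frsize  # bytes available to non-root users
    free_inodes = st.f_favail               # inodes available to non-root users
    return free_bytes, free_inodes

# On a real worker this would be the spark.local.dir, e.g. "/data2/tmp".
free_bytes, free_inodes = check_fs("/")
print("free: %.1f GiB, %d inodes" % (free_bytes / 2.0**30, free_inodes))
```

If free_inodes is at or near zero while free_bytes is large, the ENOSPC is inode exhaustion, and the fix is a filesystem with more inodes (or spreading spark.local.dir across several directories) rather than more disk space.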
2013/10/29 Matei Zaharia <matei.zaha...@gmail.com>

> The error is from a worker node -- did you check that /data2 is set up
> properly on the worker nodes too? In general that should be the only
> directory used.
>
> Matei
>
> On Oct 28, 2013, at 6:52 PM, Shangyu Luo <lsy...@gmail.com> wrote:
>
> Hello,
> I have some questions about the files that Spark creates and uses while
> it is running.
>
> (1) I am running a Python program on Spark with an EC2 cluster. The data
> comes from HDFS. I have met the following error in the console of the
> master node:
>
> java.io.FileNotFoundException:
> /data2/tmp/spark-local-20131029003412-c340/1b/shuffle_1_527_79 (No space
> left on device)
>     at java.io.FileOutputStream.openAppend(Native Method)
>     at java.io.FileOutputStream.<init>(FileOutputStream.java:207)
>     at org.apache.spark.storage.DiskStore$DiskBlockObjectWriter.open(DiskStore.scala:58)
>     at org.apache.spark.storage.DiskStore$DiskBlockObjectWriter.write(DiskStore.scala:107)
>     at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$run$1.apply(ShuffleMapTask.scala:152)
>     at org.apache.spark.scheduler.ShuffleMapTask$$anonfun$run$1.apply(ShuffleMapTask.scala:149)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:772)
>     at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
>     at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:149)
>     at org.apache.spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:88)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:679)
>
> I set spark.local.dir=/data2/tmp in spark-env.sh, and there is about
> 800G of space in the data2 directory. I have checked the usage of data2,
> and only about 3G is used.
> So why does Spark think that there is no space left on the device?
>
> (2) Moreover, I am wondering whether Spark will create files under
> directories other than spark.local.dir. Presently I use
> a = b.map(...).persist(storage.disk_only) in part of my program; where
> will the persisted data be stored?
>
> (3) Lastly, I have also sometimes seen the error "Removing BlockManager
> xxx with no recent heart beats: xxxxxms exceeds 45000ms". I have set the
> corresponding parameters in spark-env.sh:
>
> SPARK_JAVA_OPTS+="-Dspark.akka.timeout=300000 "
> SPARK_JAVA_OPTS+="-Dspark.worker.timeout=300000 "
> SPARK_JAVA_OPTS+="-Dspark.akka.askTimeout=3000 "
> SPARK_JAVA_OPTS+="-Dspark.storage.blockManagerHeartBeatMs=300000 "
> SPARK_JAVA_OPTS+="-Dspark.akka.retry.wait=300000 "
>
> But it does not help. Can someone give me some suggestions for solving
> this problem?
>
> Any help will be appreciated!
> Thanks!
>
> Best,
> Shangyu
>
> --
> Shangyu Luo
> Department of Computer Science
> Rice University
> --
> Not Just Think About It, But Do It!
> --
> Success is never final.
> --
> Losers always whine about their best