Re: spark there is no space on the disk

2015-03-31 Thread Peng Xia
Yes, we have just modified the configuration, and everything works fine.
Thanks very much for the help.

On Thu, Mar 19, 2015 at 5:24 PM, Ted Yu yuzhih...@gmail.com wrote:

 For YARN, possibly this one?

 <property>
   <name>yarn.nodemanager.local-dirs</name>
   <value>/hadoop/yarn/local</value>
 </property>

 Cheers

 [...]

Re: spark there is no space on the disk

2015-03-19 Thread Davies Liu
Is it possible that `spark.local.dir` is overridden by something else? The docs say:

NOTE: In Spark 1.0 and later this will be overridden by
SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN)
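
A minimal sketch of how to check this from the application side (assuming a
PySpark app; the app name is illustrative). Note that even if the driver-side
value looks right, workers started with SPARK_LOCAL_DIRS (standalone, Mesos)
or LOCAL_DIRS (YARN) in their environment will ignore it:

from pyspark import SparkConf, SparkContext

# Request a specific scratch directory from the application side.
conf = SparkConf().setAppName("local-dir-check").set("spark.local.dir", "C:\\tmp")
sc = SparkContext(conf=conf)

# Print what the driver-side configuration holds; executors may still be
# using the directories from their own environment variables instead.
print(sc.getConf().get("spark.local.dir", "<not set>"))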

On Sat, Mar 14, 2015 at 5:29 PM, Peng Xia sparkpeng...@gmail.com wrote:
 Hi Sean,

 Thanks very much for your reply.
 I tried to configure it with the code below:

 sf = SparkConf().setAppName("test").set("spark.executor.memory", "45g") \
     .set("spark.cores.max", "62").set("spark.local.dir", "C:\\tmp")

 But I still get the error.
 Do you know how I can configure it?


 Thanks,
 Best,
 Peng


 [...]

Re: spark there is no space on the disk

2015-03-19 Thread Marcelo Vanzin
IIRC you have to set that configuration on the Worker processes (for
standalone). The app can't override it (only for a client-mode
driver). YARN has a similar configuration, but I don't know the name
(shouldn't be hard to find, though).
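
For standalone, that would mean putting the setting in each worker's
environment before the worker starts, e.g. in conf/spark-env.sh on every
worker machine, then restarting the workers. A sketch (the path is
illustrative; on Windows the analogous file would be conf\spark-env.cmd,
using `set` instead of `export`):

# conf/spark-env.sh on each standalone worker: point local scratch space
# at a volume with plenty of room.
export SPARK_LOCAL_DIRS=/mnt/bigdisk/spark-tmp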

On Thu, Mar 19, 2015 at 11:56 AM, Davies Liu dav...@databricks.com wrote:
 Is it possible that `spark.local.dir` is overridden by something else? The docs say:

 NOTE: In Spark 1.0 and later this will be overridden by
 SPARK_LOCAL_DIRS (Standalone, Mesos) or LOCAL_DIRS (YARN)

 [...]

Re: spark there is no space on the disk

2015-03-19 Thread Ted Yu
For YARN, possibly this one?

<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/hadoop/yarn/local</value>
</property>
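
For reference, that property belongs in yarn-site.xml on each NodeManager
host, and it accepts a comma-separated list, so spills can be spread across
several volumes (the paths below are illustrative):

<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data1/yarn/local,/data2/yarn/local</value>
</property>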

Cheers

On Thu, Mar 19, 2015 at 2:21 PM, Marcelo Vanzin van...@cloudera.com wrote:

 IIRC you have to set that configuration on the Worker processes (for
 standalone). The app can't override it (only for a client-mode
 driver). YARN has a similar configuration, but I don't know the name
 (shouldn't be hard to find, though).

 [...]

Re: spark there is no space on the disk

2015-03-14 Thread Peng Xia
Hi Sean,

Thanks very much for your reply.
I tried to configure it with the code below:

sf = SparkConf().setAppName("test").set("spark.executor.memory", "45g") \
    .set("spark.cores.max", "62").set("spark.local.dir", "C:\\tmp")

But I still get the error.
Do you know how I can configure it?


Thanks,
Best,
Peng


On Sat, Mar 14, 2015 at 3:41 AM, Sean Owen so...@cloudera.com wrote:

 It means pretty much what it says. You ran out of space on an executor
 (not driver), because the dir used for serialization temp files is
 full (not all volumes). Set spark.local.dir to something more
 appropriate and larger.

 [...]

Re: spark there is no space on the disk

2015-03-14 Thread Peng Xia
And I have 2 TB of free space on the C: drive.

On Sat, Mar 14, 2015 at 8:29 PM, Peng Xia sparkpeng...@gmail.com wrote:

 [...]

Re: spark there is no space on the disk

2015-03-14 Thread Sean Owen
It means pretty much what it says. You ran out of space on an executor
(not driver), because the dir used for serialization temp files is
full (not all volumes). Set spark.local.dir to something more
appropriate and larger.
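
One way to apply that advice is to pass the property at submit time, e.g.
(a sketch only; the path and script name are illustrative, and on standalone
or YARN the environment-variable override noted elsewhere in this thread can
still take precedence):

spark-submit --conf spark.local.dir=D:\spark\tmp train_lr.py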

On Sat, Mar 14, 2015 at 2:10 AM, Peng Xia sparkpeng...@gmail.com wrote:
 [...]

spark there is no space on the disk

2015-03-13 Thread Peng Xia
Hi


I was running a logistic regression algorithm on an 8-node Spark cluster;
each node has 8 cores and 56 GB of RAM (each node runs Windows), and the
drive Spark is installed on has 1.9 TB of capacity. The dataset I was
training on has around 40 million records with around 6,600 features. But
I always get this error during the training process:
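
For context, a minimal sketch of the kind of job that produces this call (the
file path, parsing logic, and iteration count are assumptions, not the actual
code):

from pyspark import SparkContext
from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext(appName="lr-train")

def parse_line(line):
    # Label first, then the feature values, comma-separated.
    parts = [float(x) for x in line.split(",")]
    return LabeledPoint(parts[0], parts[1:])

points = sc.textFile("/data/train.csv").map(parse_line).cache()
# train() invokes trainLogisticRegressionModelWithLBFGS on the JVM side,
# which is the Py4J target named in the traceback below.
model = LogisticRegressionWithLBFGS.train(points, iterations=100)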

Py4JJavaError: An error occurred while calling o70.trainLogisticRegressionModelWithLBFGS.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2709
in stage 3.0 failed 4 times, most recent failure: Lost task 2709.3 in stage 3.0
(TID 2766, workernode0.rbaHdInsightCluster5.b6.internal.cloudapp.net):
java.io.IOException: There is not enough space on the disk
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:345)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
at org.xerial.snappy.SnappyOutputStream.dumpOutput(SnappyOutputStream.java:300)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:247)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:107)
at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
at java.io.ObjectOutputStream$BlockDataOutputStream.writeByte(ObjectOutputStream.java:1914)
at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1575)
at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:350)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42)
at org.apache.spark.serializer.SerializationStream.writeAll(Serializer.scala:110)
at org.apache.spark.storage.BlockManager.dataSerializeStream(BlockManager.scala:1177)
at org.apache.spark.storage.DiskStore.putIterator(DiskStore.scala:78)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:145)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:70)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:243)
at org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:34)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:278)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:245)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:56)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:200)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1203)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1202)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1202)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:696)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:696)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1420)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.aroundReceive(DAGScheduler.scala:1375)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)