Re: java.io.IOException: No space left on device--regd.

2015-07-06 Thread Akhil Das
You can also set these in the spark-env.sh file:

export SPARK_WORKER_DIR="/mnt/spark/"
export SPARK_LOCAL_DIRS="/mnt/spark/"
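
If you prefer to keep it in the application instead, the corresponding
property is spark.local.dir (note that on standalone/Mesos/YARN the cluster
manager's SPARK_LOCAL_DIRS / LOCAL_DIRS settings take precedence). A minimal
sketch, with placeholder paths and app name:

import org.apache.spark.{SparkConf, SparkContext}

// Point Spark's scratch space (shuffle spill files, block files) at one or
// more large volumes before the SparkContext is created. Paths are made up.
val conf = new SparkConf()
  .setAppName("etl-self-join")
  // Comma-separated list; spreading across several disks helps shuffle-heavy jobs.
  .set("spark.local.dir", "/mnt/spark1,/mnt/spark2")
val sc = new SparkContext(conf)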



Thanks
Best Regards

On Mon, Jul 6, 2015 at 12:29 PM, Akhil Das 
wrote:

> While the job is running, just look in the directory and see what the
> root cause is (is it the logs? is it the shuffle? etc.). Here are a few
> configuration options you can try:
>
> - Disable shuffle spill: spark.shuffle.spill=false (it might end up in an OOM)
> - Enable log rotation:
>
> sparkConf.set("spark.executor.logs.rolling.strategy", "size")
> .set("spark.executor.logs.rolling.size.maxBytes", "1024")
> .set("spark.executor.logs.rolling.maxRetainedFiles", "3")
>
>
> Thanks
> Best Regards
>
> On Mon, Jul 6, 2015 at 10:44 AM, Devarajan Srinivasan <
> devathecool1...@gmail.com> wrote:
>
>> Hi,
>>
>> I am trying to run an ETL job on Spark which involves an expensive shuffle
>> operation. Basically, I need a self-join to be performed on a Spark
>> DataFrame. The job runs fine for around 15 hours, and when the stage that
>> performs the self-join is about to complete, I get a *"java.io.IOException:
>> No space left on device"*. I initially thought this was because
>> *spark.local.dir* pointed to the */tmp* directory, which was configured
>> with only *2GB* of space; since this job requires expensive shuffles,
>> Spark needs more space to write the shuffle files. Hence I configured
>> *spark.local.dir* to point to a different directory which has *1TB* of
>> space, but I still get the same *no space left* exception. What could be
>> the root cause of this issue?
>>
>>
>> Thanks in advance.
>>
>> *Exception stacktrace:* [snipped; see the original message at the bottom of the thread]
>>
>>
>>
>


Re: java.io.IOException: No space left on device--regd.

2015-07-05 Thread Akhil Das
While the job is running, just look in the directory and see what the root
cause is (is it the logs? is it the shuffle? etc.). Here are a few
configuration options you can try:

- Disable shuffle spill: spark.shuffle.spill=false (it might end up in an OOM)
- Enable log rotation:

sparkConf.set("spark.executor.logs.rolling.strategy", "size")
.set("spark.executor.logs.rolling.size.maxBytes", "1024")
.set("spark.executor.logs.rolling.maxRetainedFiles", "3")


Thanks
Best Regards

On Mon, Jul 6, 2015 at 10:44 AM, Devarajan Srinivasan <
devathecool1...@gmail.com> wrote:

> Hi,
>
> I am trying to run an ETL job on Spark which involves an expensive shuffle
> operation. Basically, I need a self-join to be performed on a Spark
> DataFrame. The job runs fine for around 15 hours, and when the stage that
> performs the self-join is about to complete, I get a *"java.io.IOException:
> No space left on device"*. I initially thought this was because
> *spark.local.dir* pointed to the */tmp* directory, which was configured with
> only *2GB* of space; since this job requires expensive shuffles, Spark needs
> more space to write the shuffle files. Hence I configured *spark.local.dir*
> to point to a different directory which has *1TB* of space, but I still get
> the same *no space left* exception. What could be the root cause of this
> issue?
>
>
> Thanks in advance.
>
> *Exception stacktrace:* [snipped; see the original message below]
>
>
>


java.io.IOException: No space left on device--regd.

2015-07-05 Thread Devarajan Srinivasan
Hi,

I am trying to run an ETL job on Spark which involves an expensive shuffle
operation. Basically, I need a self-join to be performed on a Spark
DataFrame. The job runs fine for around 15 hours, and when the stage that
performs the self-join is about to complete, I get a *"java.io.IOException:
No space left on device"*. I initially thought this was because
*spark.local.dir* pointed to the */tmp* directory, which was configured with
only *2GB* of space; since this job requires expensive shuffles, Spark needs
more space to write the shuffle files. Hence I configured *spark.local.dir*
to point to a different directory which has *1TB* of space, but I still get
the same *no space left* exception. What could be the root cause of this
issue?
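
Purely for illustration, a sketch of the kind of self-join involved (the
input path, key column, and schema below are made up; it assumes an existing
SparkContext sc):

import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.col

// A self-join forces a full shuffle of both sides of the join, and the
// sort-based shuffle spills its partition files under spark.local.dir.
val sqlContext = new SQLContext(sc)
val df = sqlContext.read.parquet("/data/events")  // hypothetical input
val joined = df.as("a")
  .join(df.as("b"), col("a.userId") === col("b.userId"))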


Thanks in advance.

*Exception stacktrace:*

java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:345)
    at org.apache.spark.storage.DiskBlockObjectWriter$TimeTrackingOutputStream$$anonfun$write$3.apply$mcV$sp(BlockObjectWriter.scala:87)
    at org.apache.spark.storage.DiskBlockObjectWriter.org$apache$spark$storage$DiskBlockObjectWriter$$callWithTiming(BlockObjectWriter.scala:229)
    at org.apache.spark.storage.DiskBlockObjectWriter$TimeTrackingOutputStream.write(BlockObjectWriter.scala:87)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
    at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
    at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
    at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
    at java.io.ObjectOutputStream.writeNonProxyDesc(ObjectOutputStream.java:1285)
    at java.io.ObjectOutputStream.writeClassDesc(ObjectOutputStream.java:1230)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1426)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1576)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:350)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:204)
    at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:370)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)