Hi,

I am trying to run an ETL job on Spark that involves an expensive shuffle
operation; essentially, it requires a self-join on a Spark DataFrame. The
job runs fine for around 15 hours, but when the stage that performs the
self-join is about to complete, I get a *"java.io.IOException: No space
left on device"*. I initially thought this was because *spark.local.dir*
pointed to the */tmp* directory, which had only *2GB* of space; since this
job involves expensive shuffles, Spark needs more room to write its
shuffle files. I therefore reconfigured *spark.local.dir* to point to a
different directory with *1TB* of space, but I still get the same *no
space left* exception. What could be the root cause of this issue?
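
For reference, this is roughly how I set the property (a minimal sketch;
the app name and path below are placeholders for my actual values):

    import org.apache.spark.{SparkConf, SparkContext}

    // Point shuffle spill files at the 1TB volume instead of /tmp.
    // Assumes a standalone deployment; on YARN, spark.local.dir is
    // overridden by yarn.nodemanager.local-dirs on each node.
    val conf = new SparkConf()
      .setAppName("etl-self-join")                  // placeholder name
      .set("spark.local.dir", "/mnt/bigdisk/spark") // illustrative 1TB mount

    val sc = new SparkContext(conf)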


Thanks in advance.

*Exception stacktrace:*

*java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(FileOutputStream.java:345)
        at org.apache.spark.storage.DiskBlockObjectWriter$TimeTrackingOutputStream$$anonfun$write$3.apply$mcV$sp(BlockObjectWriter.scala:87)
        at org.apache.spark.storage.DiskBlockObjectWriter.org$apache$spark$storage$DiskBlockObjectWriter$$callWithTiming(BlockObjectWriter.scala:229)
        at org.apache.spark.storage.DiskBlockObjectWriter$TimeTrackingOutputStream.write(BlockObjectWriter.scala:87)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
        at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
        at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
        at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
        at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1785)
        at java.io.ObjectOutputStream.writeNonProxyDesc(ObjectOutputStream.java:1285)
        at java.io.ObjectOutputStream.writeClassDesc(ObjectOutputStream.java:1230)
        at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1426)
        at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177)
        at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1576)
        at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:350)
        at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:44)
        at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:204)
        at org.apache.spark.util.collection.ExternalSorter.spillToPartitionFiles(ExternalSorter.scala:370)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:211)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:63)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)*
