Hello, I came across a strange, hard-to-reproduce OOM in an executor during unrolling. The standalone cluster uses default memory settings on Spark 1.5.2.
What strikes me is that the OOM happens when Spark tries to allocate a direct ByteBuffer for the FileChannel while dropping blocks from memory and writing them to disk because of the storage level. To be fair, the cluster is under a lot of memory pressure. I was thinking of decreasing spark.storage.safetyFraction to give Spark more breathing room. Is that the right way to think about this?

The stack trace:

java.lang.OutOfMemoryError
	at sun.misc.Unsafe.allocateMemory(Native Method)
	at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)
	at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
	at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:174)
	at sun.nio.ch.IOUtil.write(IOUtil.java:58)
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:205)
	at org.apache.spark.storage.DiskStore$$anonfun$putBytes$1.apply$mcV$sp(DiskStore.scala:50)
	at org.apache.spark.storage.DiskStore$$anonfun$putBytes$1.apply(DiskStore.scala:49)
	at org.apache.spark.storage.DiskStore$$anonfun$putBytes$1.apply(DiskStore.scala:49)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1206)
	at org.apache.spark.storage.DiskStore.putBytes(DiskStore.scala:52)
	at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1043)
	at org.apache.spark.storage.BlockManager.dropFromMemory(BlockManager.scala:1002)
	at org.apache.spark.storage.MemoryStore$$anonfun$ensureFreeSpace$4.apply(MemoryStore.scala:468)
	at org.apache.spark.storage.MemoryStore$$anonfun$ensureFreeSpace$4.apply(MemoryStore.scala:457)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at org.apache.spark.storage.MemoryStore.ensureFreeSpace(MemoryStore.scala:457)
	at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:292)
	at org.apache.spark.CacheManager.putInBlockManager(CacheManager.scala:171)
	at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:78)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:262)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
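For context, here is a minimal standalone sketch of what I believe is happening at the bottom of that trace (plain JDK code, nothing Spark-specific; the path and buffer size are made up). When a heap ByteBuffer goes through FileChannel.write, IOUtil.write first copies it into a temporary *direct* buffer via Util.getTemporaryDirectBuffer, and as far as I can tell that temporary buffer is sized to the full remaining bytes of the source buffer, so the write allocates native memory even though the block bytes themselves live on the heap:

    import java.io.RandomAccessFile
    import java.nio.ByteBuffer

    object TempDirectBufferSketch {
      def main(args: Array[String]): Unit = {
        // Path and size are made up for illustration.
        val raf = new RandomAccessFile("/tmp/block-demo.bin", "rw")
        val channel = raf.getChannel
        try {
          // A heap buffer, like the bytes DiskStore.putBytes receives.
          val heapBuf = ByteBuffer.allocate(64 * 1024 * 1024) // 64 MB on-heap

          // FileChannelImpl.write -> IOUtil.write -> Util.getTemporaryDirectBuffer:
          // the JDK copies heapBuf into a temporary *direct* buffer before the
          // actual write, so this call allocates ~64 MB of native memory.
          while (heapBuf.hasRemaining) {
            channel.write(heapBuf)
          }
        } finally {
          channel.close()
          raf.close()
        }
      }
    }

Running something like this with a low cap (e.g. -XX:MaxDirectMemorySize=16m) produces the same OutOfMemoryError at sun.misc.Unsafe.allocateMemory as my trace, which makes me think it is the native/direct side of the executor that is exhausted here, not the heap.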
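And for completeness, this is how I would wire in the change I am considering. The app name, input path, and the 0.8 value are placeholders; the 1.5.2 defaults are spark.storage.memoryFraction = 0.6 and spark.storage.safetyFraction = 0.9, i.e. the MemoryStore is capped at roughly 54% of the executor heap. I am using a serialized storage level in the sketch because, as far as I can tell, that is what routes dropped blocks through DiskStore.putBytes as raw bytes, matching the trace:

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.storage.StorageLevel

    object SafetyFractionSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("unroll-oom-sketch") // placeholder
          // Default 0.6: fraction of the heap reserved for the MemoryStore.
          .set("spark.storage.memoryFraction", "0.6")
          // Default 0.9. Lowering it to 0.8 shrinks the storage cap from
          // ~0.54 to ~0.48 of the heap, leaving more slack for everything else.
          .set("spark.storage.safetyFraction", "0.8")
        val sc = new SparkContext(conf)

        // A serialized memory-and-disk level keeps blocks as bytes, so when
        // the MemoryStore evicts them under pressure they go through
        // BlockManager.dropFromMemory -> DiskStore.putBytes, as in the trace.
        val cached = sc.textFile("hdfs:///some/input") // placeholder path
          .map(_.length)
          .persist(StorageLevel.MEMORY_AND_DISK_SER)
        println(cached.count())

        sc.stop()
      }
    }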
Thanks,

Thomas Gerber
Director of Data Engineering
http://radius.com/