It is hard to guess why the OOM happens without knowing your application's logic
and the data size.
Without knowing that, I can only guess based on common experience:
1) Increase "spark.default.parallelism".
2) Increase your executor memory; maybe 6g is still not enough.
3) Your environment is somewhat unbalanced between CPU cores and available
memory (8 cores vs. 12G). Each core should have about 3G for Spark.
4) If you cache RDDs, use "MEMORY_ONLY_SER" instead of "MEMORY_ONLY".
5) Since you have many cores relative to your available memory, lower the
cores per executor by setting "-Dspark.deploy.defaultCores=". When you do not
have enough memory, reducing the concurrency of your executors lowers the
memory requirement, at the cost of running more slowly.
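As a starting point for suggestions 1, 2 and 5, a minimal spark-defaults.conf sketch is below. The specific values are only assumptions for an 8-core / 12G node, not tested settings; tune them against your actual data size:

```
# Assumed starting values for an 8-core / 12G node -- adjust for your workload
spark.default.parallelism   200   # 1) more, smaller partitions per stage
spark.executor.memory       6g    # 2) raise further if 6g is still not enough
spark.executor.cores        2     # 5) fewer concurrent tasks per executor
```

For suggestion 4, in PySpark that would be something like rdd.persist(StorageLevel.MEMORY_ONLY_SER) instead of rdd.cache().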
Yong

Date: Wed, 8 Apr 2015 04:57:22 +0800
Subject: Re: 'Java heap space' error occurred when querying 4G data file from HDFS
From: lidali...@gmail.com
To: user@spark.apache.org

Any help, please?
Help me get the configuration right.

李铖 <lidali...@gmail.com> wrote on Tuesday, 7 April 2015:
In my dev/test environment I have 3 virtual machines; each machine has 12G of
memory and 8 CPU cores.
Here are spark-defaults.conf and spark-env.sh. Maybe some config is not right.
I run this command: spark-submit --master yarn-client --driver-memory 7g
--executor-memory 6g /home/hadoop/spark/main.py
and the exception is raised.
spark-defaults.conf
spark.master                         spark://cloud1:7077
spark.default.parallelism            100
spark.eventLog.enabled               true
spark.serializer                     org.apache.spark.serializer.KryoSerializer
spark.driver.memory                  5g
spark.driver.maxResultSize           6g
spark.kryoserializer.buffer.mb       256
spark.kryoserializer.buffer.max.mb   512
spark.executor.memory                4g
spark.rdd.compress                   true
spark.storage.memoryFraction         0
spark.akka.frameSize                 50
spark.shuffle.compress               true
spark.shuffle.spill.compress         false
spark.local.dir                      /home/hadoop/tmp
spark-env.sh
export SCALA=/home/hadoop/softsetup/scala
export JAVA_HOME=/home/hadoop/softsetup/jdk1.7.0_71
export SPARK_WORKER_CORES=1
export SPARK_WORKER_MEMORY=4g
export HADOOP_CONF_DIR=/opt/cloud/hadoop/etc/hadoop
export SPARK_EXECUTOR_MEMORY=4g
export SPARK_DRIVER_MEMORY=4g
Exception:
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_28 on disk on cloud3:38109 (size: 162.7 MB)
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_28 on disk on cloud3:38109 (size: 162.7 MB)
15/04/07 18:11:03 INFO TaskSetManager: Starting task 31.0 in stage 1.0 (TID 31, cloud3, NODE_LOCAL, 1296 bytes)
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_29 on disk on cloud2:49451 (size: 163.7 MB)
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_29 on disk on cloud2:49451 (size: 163.7 MB)
15/04/07 18:11:03 INFO TaskSetManager: Starting task 30.0 in stage 1.0 (TID 32, cloud2, NODE_LOCAL, 1296 bytes)
15/04/07 18:11:03 ERROR Utils: Uncaught exception in thread task-result-getter-0
java.lang.OutOfMemoryError: Java heap space
    at org.apache.spark.scheduler.DirectTaskResult$$anonfun$readExternal$1.apply$mcV$sp(TaskResult.scala:61)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:985)
    at org.apache.spark.scheduler.DirectTaskResult.readExternal(TaskResult.scala:58)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:81)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:73)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:48)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Exception in thread "task-result-getter-0" java.lang.OutOfMemoryError: Java heap space
    at org.apache.spark.scheduler.DirectTaskResult$$anonfun$readExternal$1.apply$mcV$sp(TaskResult.scala:61)
    at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:985)
    at org.apache.spark.scheduler.DirectTaskResult.readExternal(TaskResult.scala:58)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:81)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply$mcV$sp(TaskResultGetter.scala:73)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2$$anonfun$run$1.apply(TaskResultGetter.scala:49)
    at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
    at org.apache.spark.scheduler.TaskResultGetter$$anon$2.run(TaskResultGetter.scala:48)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_28 on disk on cloud3:38109 (size: 162.7 MB)
15/04/07 18:11:03 INFO BlockManagerInfo: Added taskresult_29 on disk on cloud2:49451 (size: 163.7 MB)
15/04/07 18:11:05 ERROR Utils: Uncaught exception in thread task-result-getter-4
java.lang.OutOfMemoryError: Java heap space
Exception in thread "task-result-getter-4" java.lang.OutOfMemoryError: Java heap space
15/04/07 18:11:07 INFO BlockManagerInfo: Added taskresult_31 on disk on cloud3:38109 (size: 87.9 MB)
15/04/07 18:11:07 INFO BlockManagerInfo: Added taskresult_31 on disk on cloud3:38109 (size: 87.9 MB)
15/04/07 18:11:08 WARN TransportChannelHandler: Exception in connection from cloud3/192.168.0.95:38109
java.lang.OutOfMemoryError: Java heap space
15/04/07 18:11:08 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from cloud3/192.168.0.95:38109 is closed
15/04/07 18:11:08 ERROR OneForOneBlockFetcher: Failed while starting block fetches
java.lang.OutOfMemoryError: Java heap space
15/04/07 18:11:08 ERROR RetryingBlockFetcher: Failed to fetch block taskresult_31, and will not retry (0 retries)
java.lang.OutOfMemoryError: Java heap space
15/04/07 18:11:08 ERROR TransportClient: Failed to send RPC 7722440433247749491 to cloud3/192.168.0.95:38109: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
15/04/07 18:11:08 ERROR OneForOneBlockFetcher: Failed while starting block fetches
java.io.IOException: Failed to send RPC 7722440433247749491 to cloud3/192.168.0.95:38109: java.nio.channels.ClosedChannelException
    at org.apache.spark.network.client.TransportClient$2.operationComplete(TransportClient.java:158)
    at org.apache.spark.network.client.TransportClient$2.operationComplete(TransportClient.java:145)
    at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
    at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
    at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:745)
    at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:646)
    at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1054)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:658)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:716)
    at io.netty.channel.AbstractChannelHandlerContext.write(AbstractChannelHandlerContext.java:651)
    at io.netty.handler.codec.MessageToMessageEncoder.write(MessageToMessageEncoder.java:112)
    at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:658)
    at io.netty.channel.AbstractChannelHandlerContext.access$2000(AbstractChannelHandlerContext.java:32)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:939)
    at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:991)
    at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:924)
    at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:380)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
15/04/07 18:11:08 INFO BlockManagerInfo: Added taskresult_30 on disk on cloud1:44029 (size: 163.5 MB)
15/04/07 18:11:08 INFO BlockManagerInfo: Added taskresult_30 on disk on cloud1:44029 (size: 163.5 MB)
15/04/07 18:11:08 ERROR Utils: Uncaught exception in thread task-result-getter-6
java.lang.OutOfMemoryError: Java heap space
Exception in thread "task-result-getter-6" java.lang.OutOfMemoryError: Java heap space
15/04/07 18:11:08 ERROR TaskResultGetter: Exception while getting task result
java.util.concurrent.ExecutionException: Boxed Error
    at scala.concurrent.impl.Promise$.resolver(Promise.scala:55)
    at scala.concurrent.impl.Promise$.scala$concurrent$impl$Promise$$resolveTry(Promise.scala:47)
    at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:244)
    at scala.concurrent.Promise$class.complete(Promise.scala:55)
    at scala.concurrent.impl.Promise$DefaultPromise.complete(Promise.scala:153)
    at scala.concurrent.Promise$class.failure(Promise.scala:107)
    at scala.concurrent.impl.Promise$DefaultPromise.failure(Promise.scala:153)
    at org.apache.spark.network.BlockTransferService$$anon$1.onBlockFetchFailure(BlockTransferService.scala:92)
    at org.apache.spark.network.shuffle.RetryingBlockFetcher$RetryingBlockFetchListener.onBlockFetchFailure(RetryingBlockFetcher.java:230)
    at org.apache.spark.network.shuffle.OneForOneBlockFetcher.failRemainingBlocks(OneForOneBlockFetcher.java:123)
    at org.apache.spark.network.shuffle.OneForOneBlockFetcher.access$300(OneForOneBlockFetcher.java:43)
    at org.apache.spark.network.shuffle.OneForOneBlockFetcher$1.onFailure(OneForOneBlockFetcher.java:114)
    at org.apache.spark.network.client.TransportResponseHandler.failOutstandingRequests(TransportResponseHandler.java:84)
    at org.apache.spark.network.client.TransportResponseHandler.exceptionCaught(TransportResponseHandler.java:108)
    at org.apache.spark.network.server.TransportChannelHandler.exceptionCaught(TransportChannelHandler.java:69)
    at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:271)
    at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:768)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:335)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:163)
    at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
    at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
    at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
    at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.OutOfMemoryError: Java heap space
