Naresh,

We've configured our Spark JVMs to shut down if there is an OutOfMemoryError. Otherwise, the error will bring down a random thread and cause trouble like the IllegalStateException you hit. It is best to let Spark recover by replacing the executor or failing the job.
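For reference, one way to get that shut-down-on-OOM behavior is through the driver and executor JVM options. This is a minimal sketch, not necessarily our exact production settings; the config keys are standard Spark properties, and the JVM flags shown are one common choice:

```shell
# Sketch: make Spark JVMs exit immediately on OutOfMemoryError so the
# cluster manager replaces the container instead of leaving a half-broken
# JVM running with one dead thread.
spark-submit \
  --conf "spark.executor.extraJavaOptions=-XX:OnOutOfMemoryError='kill -9 %p'" \
  --conf "spark.driver.extraJavaOptions=-XX:OnOutOfMemoryError='kill -9 %p'" \
  ...
```

On JDK 8u92 and later, `-XX:+ExitOnOutOfMemoryError` is a simpler alternative that avoids shelling out to `kill`.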
rb

On Wed, Feb 15, 2017 at 1:58 PM, naresh gundla <nareshgun...@gmail.com> wrote:
> Hi,
>
> I am running a Spark application and getting out-of-memory errors in the
> YARN NodeManager logs, and the container gets killed. Please find the error
> details below. Has anyone faced this issue?
>
> *Enabled Spark dynamic allocation and YARN shuffle*
>
> 2017-02-15 14:50:48,047 WARN io.netty.util.concurrent.DefaultPromise: An
> exception was thrown by org.apache.spark.network.server.TransportRequestHandler$2.operationComplete()
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> 2017-02-15 15:21:09,506 ERROR org.apache.spark.network.server.TransportRequestHandler:
> Error opening block StreamChunkId{streamId=1374579274227, chunkIndex=241}
> for request from /10.154.16.83:50042
> java.lang.IllegalStateException: Received out-of-order chunk index 241 (expected 114)
>         at org.apache.spark.network.server.OneForOneStreamManager.getChunk(OneForOneStreamManager.java:81)
>         at org.apache.spark.network.server.TransportRequestHandler.processFetchRequest(TransportRequestHandler.java:121)
>         at org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:100)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:104)
>         at org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:51)
>         at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:254)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:333)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:319)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:787)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:130)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:116)
>         at java.lang.Thread.run(Thread.java:745)
>
> 2017-02-15 14:50:14,692 WARN org.apache.spark.network.server.TransportChannelHandler:
> Exception in connection from /10.154.16.74:58547
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>
> Thanks
> Naresh

--
Ryan Blue
Software Engineer
Netflix