[ https://issues.apache.org/jira/browse/FLINK-2773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14939027#comment-14939027 ]
Maximilian Michels edited comment on FLINK-2773 at 9/30/15 11:09 PM:
---------------------------------------------------------------------

Problem should be resolved with {{011cbbf}}.

was (Author: mxm):
Problem should be resolved with {011cbbf}.

> OutOfMemoryError on YARN Session
> --------------------------------
>
>                 Key: FLINK-2773
>                 URL: https://issues.apache.org/jira/browse/FLINK-2773
>             Project: Flink
>          Issue Type: Bug
>          Components: YARN Client
>    Affects Versions: 0.10
>            Reporter: Fabian Hueske
>            Assignee: Maximilian Michels
>            Priority: Blocker
>             Fix For: 0.10
>
>
> When running a Flink program on a detached YARN session using the latest master (commit {{0b3ca57b41e09937b9e63f2f443834c8ad1cf497}}), I observed this {{OutOfMemoryError}}:
> {code}
> java.lang.Exception: The data preparation for task 'CoGroup (coGroup-A68B765B7BAB4E29BF6816965A994776)' , caused an error: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:464)
>     at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:354)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:579)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Error obtaining the sorted input: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.sort.UnilateralSortMerger.getIterator(UnilateralSortMerger.java:607)
>     at org.apache.flink.runtime.operators.RegularPactTask.getInput(RegularPactTask.java:1089)
>     at org.apache.flink.runtime.operators.CoGroupDriver.prepare(CoGroupDriver.java:97)
>     at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:459)
>     ... 3 more
> Caused by: java.io.IOException: Thread 'SortMerger Reading Thread' terminated due to an exception: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.operators.sort.UnilateralSortMerger$ThreadBase.run(UnilateralSortMerger.java:787)
> Caused by: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: java.lang.OutOfMemoryError: Direct buffer memory
>     at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.exceptionCaught(PartitionRequestClientHandler.java:153)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
>     at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:224)
>     at io.netty.channel.ChannelInboundHandlerAdapter.exceptionCaught(ChannelInboundHandlerAdapter.java:131)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:246)
>     at io.netty.channel.AbstractChannelHandlerContext.notifyHandlerException(AbstractChannelHandlerContext.java:737)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:310)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>     at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: io.netty.handler.codec.DecoderException: java.lang.OutOfMemoryError: Direct buffer memory
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:234)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>     ... 9 more
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>     at java.nio.Bits.reserveMemory(Bits.java:658)
>     at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>     at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306)
>     at io.netty.buffer.UnpooledUnsafeDirectByteBuf.allocateDirect(UnpooledUnsafeDirectByteBuf.java:108)
>     at io.netty.buffer.UnpooledUnsafeDirectByteBuf.capacity(UnpooledUnsafeDirectByteBuf.java:157)
>     at io.netty.buffer.AbstractByteBuf.ensureWritable(AbstractByteBuf.java:251)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:849)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:841)
>     at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:831)
>     at io.netty.handler.codec.ByteToMessageDecoder$1.cumulate(ByteToMessageDecoder.java:92)
>     at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:228)
>     ... 10 more
> {code}
> Since I know that this feature was working properly recently, I reverted to commit {{8ca853e0f6c18be8e6b066c6ec0f23badb797323}} and the problem was gone.
> The problem might have been introduced when adding offheap memory support for YARN (commit {{93c95b6a6f150a2c55dc387e4ef1d603b3ef3f22}}).

-- This message was sent by Atlassian JIRA (v6.3.4#6332)