[ 
https://issues.apache.org/jira/browse/FLINK-11379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haibo Suen updated FLINK-11379:
-------------------------------
    Description: 
When the TM loads a large offloaded TDD, it may throw a 
"java.lang.OutOfMemoryError: Direct buffer memory" error. The loading code uses 
nio's _Files.readAllBytes()_ to read the serialized TDD. In the call stack of 
_Files.readAllBytes()_, a direct memory buffer is allocated whose size 
equals the length of the file. This causes an OutOfMemoryError when 
there is not enough direct memory.

If the length of the file is larger than a maximum buffer size, a direct 
buffer of that maximum size should be used to read the file's bytes in chunks, 
to avoid the direct memory OutOfMemoryError. The maximum buffer size can be 8K 
or some other value.
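
A minimal sketch of the chunked read proposed above, not the actual patch. The 
names BoundedFileReader, readAllBytesBounded and MAX_CHUNK_SIZE are 
hypothetical, and 8K is only the example size mentioned in this description. 
It assumes the stream returned by _Files.newInputStream()_ takes the same 
ChannelInputStream path shown in the stack trace, where the temporary direct 
buffer is sized by the length of each individual read request.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class BoundedFileReader {

    // Hypothetical cap on the size of a single read request (8K as in the description).
    static final int MAX_CHUNK_SIZE = 8 * 1024;

    // Reads the whole file into a heap array, never asking the stream
    // for more than MAX_CHUNK_SIZE bytes in one call.
    static byte[] readAllBytesBounded(Path path) throws IOException {
        long size = Files.size(path);
        if (size > Integer.MAX_VALUE - 8) {
            throw new IOException("File too large to fit in a byte array: " + path);
        }

        byte[] result = new byte[(int) size];
        try (InputStream in = Files.newInputStream(path)) {
            int offset = 0;
            while (offset < result.length) {
                // Each request is bounded, so any temporary buffer sized by the
                // request length is bounded as well.
                int len = Math.min(MAX_CHUNK_SIZE, result.length - offset);
                int n = in.read(result, offset, len);
                if (n < 0) {
                    throw new EOFException("File was truncated while reading: " + path);
                }
                offset += n;
            }
        }
        return result;
    }
}
```

The result array still lives on the heap and is as large as the file. Only the 
size of each read request, and with it the direct memory needed per read, is 
capped.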

The exception stack is as follows (this exception stack is from an old Flink 
version, but the master branch has the same problem).

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
   at java.nio.Bits.reserveMemory(Bits.java:706)
   at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
   at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
   at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)
   at sun.nio.ch.IOUtil.read(IOUtil.java:195)
   at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:182)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
   at java.nio.file.Files.read(Files.java:3105)
   at java.nio.file.Files.readAllBytes(Files.java:3158)
   at org.apache.flink.runtime.deployment.TaskDeploymentDescriptor.loadBigData(TaskDeploymentDescriptor.java:338)
   at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:397)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:211)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:155)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:133)
   at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
   at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
   at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
   ... 9 more

  was:
When the TM loads a large offloaded TDD, it may throw a 
"java.lang.OutOfMemoryError: Direct buffer memory" error. The loading code uses 
nio's _Files.readAllBytes()_ to read the serialized TDD. In the call stack of 
_Files.readAllBytes()_, a direct memory buffer is allocated whose size 
equals the length of the file. This causes an OutOfMemoryError when 
there is not enough direct memory.

A fixed-size direct buffer, such as an 8K buffer, should be used to read all 
bytes of the file to avoid the direct memory OutOfMemoryError.

The exception stack is as follows (this exception stack is from an old Flink 
version, but the master branch has the same problem).

Caused by: java.lang.OutOfMemoryError: Direct buffer memory
   at java.nio.Bits.reserveMemory(Bits.java:706)
   at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
   at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
   at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)
   at sun.nio.ch.IOUtil.read(IOUtil.java:195)
   at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:182)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
   at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
   at java.nio.file.Files.read(Files.java:3105)
   at java.nio.file.Files.readAllBytes(Files.java:3158)
   at org.apache.flink.runtime.deployment.TaskDeploymentDescriptor.loadBigData(TaskDeploymentDescriptor.java:338)
   at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:397)
   at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:498)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:211)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:155)
   at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:133)
   at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
   at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
   at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
   ... 9 more


> "java.lang.OutOfMemoryError: Direct buffer memory" when TM loads a large size 
> TDD
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-11379
>                 URL: https://issues.apache.org/jira/browse/FLINK-11379
>             Project: Flink
>          Issue Type: Bug
>          Components: TaskManager
>    Affects Versions: 1.7.0, 1.7.1
>            Reporter: Haibo Suen
>            Assignee: Haibo Suen
>            Priority: Major
>
> When the TM loads a large offloaded TDD, it may throw a 
> "java.lang.OutOfMemoryError: Direct buffer memory" error. The loading code uses 
> nio's _Files.readAllBytes()_ to read the serialized TDD. In the call stack of 
> _Files.readAllBytes()_, a direct memory buffer is allocated whose size 
> equals the length of the file. This causes an OutOfMemoryError when 
> there is not enough direct memory.
> If the length of the file is larger than a maximum buffer size, a direct 
> buffer of that maximum size should be used to read the file's bytes in chunks, 
> to avoid the direct memory OutOfMemoryError. The maximum buffer size can be 8K 
> or some other value.
> The exception stack is as follows (this exception stack is from an old Flink 
> version, but the master branch has the same problem).
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>    at java.nio.Bits.reserveMemory(Bits.java:706)
>    at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123)
>    at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)
>    at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:241)
>    at sun.nio.ch.IOUtil.read(IOUtil.java:195)
>    at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:182)
>    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:65)
>    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:109)
>    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>    at java.nio.file.Files.read(Files.java:3105)
>    at java.nio.file.Files.readAllBytes(Files.java:3158)
>    at org.apache.flink.runtime.deployment.TaskDeploymentDescriptor.loadBigData(TaskDeploymentDescriptor.java:338)
>    at org.apache.flink.runtime.taskexecutor.TaskExecutor.submitTask(TaskExecutor.java:397)
>    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:498)
>    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:211)
>    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:155)
>    at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$onReceive$1(AkkaRpcActor.java:133)
>    at akka.actor.ActorCell$$anonfun$become$1.applyOrElse(ActorCell.scala:544)
>    at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>    at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>    ... 9 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)