Re: akka.framesize configuration does not affect runtime execution
Hi Yuval,

I'm also wondering why you have such a big metadata file. You could probably reduce it by decreasing "state.backend.fs.memory-threshold" (if you haven't done so already) [1].

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/checkpointing.html#state-backend-fs-memory-threshold

Regards,
Roman

On Mon, Oct 19, 2020 at 7:26 AM Yun Tang wrote:

> Hi Yuval,
>
> First of all, large savepoint metadata does not necessarily require a very
> large akka frame size. Writing the metadata to the external file system
> calls an IO write method [1] instead of sending an RPC message.
>
> Secondly, a savepoint does not store any configuration; it only stores
> checkpointed state.
>
> BTW, why do you have such a large RPC message, over 1GB?
>
> [1]
> https://github.com/apache/flink/blob/f705f0af6ba50f6e68c22484d1daeda842518d27/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java#L313
>
> Best,
> Yun Tang
> --
> *From:* Yuval Itzchakov
> *Sent:* Thursday, October 15, 2020 21:22
> *To:* user
> *Subject:* akka.framesize configuration does not affect runtime execution
>
> Hi,
>
> Due to a very large savepoint metadata file (3GB+), I've set the required
> akka.framesize to 5GB. I set this via the `akka.framesize` property in
> flink.conf.
>
> When trying to recover from the savepoint, the JM emits the following
> error message:
>
> "thread":"flink-akka.actor.default-dispatcher-30"
> "level":"ERROR"
> "loggerName":"akka.remote.EndpointWriter"
> "message":"Transient "Discarding oversized payload sent to
> Actor[akka.tcp://flink@XXX:XXX/user/taskmanager_0#369979612]: max allowed
> size 1073741824 bytes, actual size of encoded class
> org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation was 1610683118
> "name":"akka.remote.OversizedPayloadException"
>
> As I recall, while taking the savepoint the maximum framesize was indeed
> defined as 1GB.
>
> Could it be that akka.framesize is being recovered from the stored
> savepoint, thus not allowing me to reconfigure the maximum size of the
> payload?
>
> --
> Best Regards,
> Yuval Itzchakov.
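
To illustrate Roman's suggestion above: as far as the option's documentation describes it, state chunks smaller than "state.backend.fs.memory-threshold" are stored inline in the checkpoint metadata file, so a lower threshold should leave fewer bytes inline. The following is only a toy model of that rule, not Flink code. The function name `split_by_threshold`, the chunk sizes, and the strict `<` comparison at the boundary are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def split_by_threshold(state_sizes: Iterable[int], threshold: int) -> Tuple[int, int]:
    """Return (inline_bytes, file_bytes) for a collection of state chunk sizes.

    Toy model: chunks smaller than `threshold` are counted as inlined into
    the metadata file; all other chunks are counted as separate state files.
    The strict '<' at the boundary is an assumption.
    """
    inline_bytes = 0
    file_bytes = 0
    for size in state_sizes:
        if size < threshold:
            inline_bytes += size
        else:
            file_bytes += size
    return inline_bytes, file_bytes


# Hypothetical chunk sizes in bytes, for illustration only.
chunks = [200, 800, 5_000, 50_000, 2_000_000]

for threshold in (1_024, 102_400):
    inline, files = split_by_threshold(chunks, threshold)
    print(f"threshold={threshold}: inline={inline} B, files={files} B")
```

In this model, lowering the threshold from 102400 to 1024 bytes moves the 5000-byte and 50000-byte chunks out of the metadata and into separate files. Whether that helps in practice depends on how many small chunks the job actually produces.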
Re: akka.framesize configuration does not affect runtime execution
Hi Yuval,

First of all, large savepoint metadata does not necessarily require a very large akka frame size. Writing the metadata to the external file system calls an IO write method [1] instead of sending an RPC message.

Secondly, a savepoint does not store any configuration; it only stores checkpointed state.

BTW, why do you have such a large RPC message, over 1GB?

[1] https://github.com/apache/flink/blob/f705f0af6ba50f6e68c22484d1daeda842518d27/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/PendingCheckpoint.java#L313

Best,
Yun Tang

From: Yuval Itzchakov
Sent: Thursday, October 15, 2020 21:22
To: user
Subject: akka.framesize configuration does not affect runtime execution

Hi,

Due to a very large savepoint metadata file (3GB+), I've set the required akka.framesize to 5GB. I set this via the `akka.framesize` property in flink.conf.

When trying to recover from the savepoint, the JM emits the following error message:

"thread":"flink-akka.actor.default-dispatcher-30"
"level":"ERROR"
"loggerName":"akka.remote.EndpointWriter"
"message":"Transient "Discarding oversized payload sent to Actor[akka.tcp://flink@XXX:XXX/user/taskmanager_0#369979612]: max allowed size 1073741824 bytes, actual size of encoded class org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation was 1610683118
"name":"akka.remote.OversizedPayloadException"

As I recall, while taking the savepoint the maximum framesize was indeed defined as 1GB.

Could it be that akka.framesize is being recovered from the stored savepoint, thus not allowing me to reconfigure the maximum size of the payload?

--
Best Regards,
Yuval Itzchakov.
akka.framesize configuration does not affect runtime execution
Hi,

Due to a very large savepoint metadata file (3GB+), I've set the required akka.framesize to 5GB. I set this via the `akka.framesize` property in flink.conf.

When trying to recover from the savepoint, the JM emits the following error message:

"thread":"flink-akka.actor.default-dispatcher-30"
"level":"ERROR"
"loggerName":"akka.remote.EndpointWriter"
"message":"Transient "Discarding oversized payload sent to Actor[akka.tcp://flink@XXX:XXX/user/taskmanager_0#369979612]: max allowed size 1073741824 bytes, actual size of encoded class org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation was 1610683118
"name":"akka.remote.OversizedPayloadException"

As I recall, while taking the savepoint the maximum framesize was indeed defined as 1GB.

Could it be that akka.framesize is being recovered from the stored savepoint, thus not allowing me to reconfigure the maximum size of the payload?

--
Best Regards,
Yuval Itzchakov.
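
For reference, the two byte counts in the error message above can be pulled out and compared directly. The sketch below is only a reading aid: the helper names `parse_oversized_payload` and `to_gib` are hypothetical, and the regular expression assumes the message wording shown in the log. It shows that the limit in effect was exactly 1 GiB, not the configured 5GB, and that the rejected `RemoteRpcInvocation` was about 1.5 GiB.

```python
import re
from typing import Tuple

# Matches the two byte counts in the "Discarding oversized payload" message.
# The wording is taken from the log quoted in this thread.
_OVERSIZED = re.compile(
    r"max allowed size (\d+) bytes, actual size of encoded class \S+ was (\d+)"
)


def parse_oversized_payload(message: str) -> Tuple[int, int]:
    """Return (max_allowed_bytes, actual_bytes) extracted from the log message."""
    match = _OVERSIZED.search(message)
    if match is None:
        raise ValueError("not an oversized-payload message")
    return int(match.group(1)), int(match.group(2))


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


message = (
    "Discarding oversized payload sent to "
    "Actor[akka.tcp://flink@XXX:XXX/user/taskmanager_0#369979612]: "
    "max allowed size 1073741824 bytes, actual size of encoded class "
    "org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation was 1610683118"
)

limit, actual = parse_oversized_payload(message)
print(f"limit  = {limit} bytes ({to_gib(limit):.2f} GiB)")
print(f"actual = {actual} bytes ({to_gib(actual):.2f} GiB)")
print(f"over by {actual - limit} bytes")
```

The payload exceeds the limit by 536941294 bytes, so the question is why the 1 GiB limit is still in effect for this RPC endpoint even though `akka.framesize` was set higher.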