Re: Flink On Yarn HA 部署模式下Flink程序无法启动
您好,我的版本是1.13.1 -- Original -- From: "Yang Wang"https://issues.apache.org/jira/browse/FLINK-19212 Best, Yang 周瑞
Re: Flink On Yarn HA 部署模式下Flink程序无法启动
看报错应该是个已知问题[1]并且已经在1.11.2中修复 [1]. https://issues.apache.org/jira/browse/FLINK-19212 Best, Yang 周瑞 于2021年8月17日周二 上午11:04写道: > 您好:Flink程序部署在Yran上以Appliation Mode 模式启动的,在没有采用HA > 模式的时候可以正常启动,配置了HA之后,启动异常,麻烦帮忙看下是什么原因导致的. > > > HA 配置如下: > high-availability: zookeeper high-availability.storageDir: > hdfs://mycluster/flink/ha high-availability.zookeeper.quorum: > zk-1:2181,zk-2:2181,zk-3:2181 high-availability.zookeeper.path.root: /flink > high-availability.cluster-id: /flink_cluster > > > 异常如下: > 2021-08-17 10:24:18,938 INFO > org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - > Starting DefaultLeaderElectionService with > ZooKeeperLeaderElectionDriver{leaderPath='/leader/resource_manager_lock'}. > 2021-08-17 10:25:09,706 ERROR > org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler > [] - Unhandled exception. > org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to > serialize the result for RPC call : requestTaskManagerDetailsInfo. > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:404) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$0(AkkaRpcActor.java:360) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) > ~[?:1.8.0_292] > at > java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848) > ~[?:1.8.0_292] > at > java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168) > ~[?:1.8.0_292] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:352) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:319) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > scala.PartialFunction.applyOrElse(PartialFunction.scala:123) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at akka.actor.Actor.aroundReceive(Actor.scala:517) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.actor.Actor.aroundReceive$(Actor.scala:515) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.actor.ActorCell.invoke(ActorCell.scala:561) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at akka.dispatch.Mailbox.run(Mailbox.scala:225) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at akka.dispatch.Mailbox.exec(Mailbox.scala:235) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > [flink-dist_2.12-1.13.1.jar:1.13.1] > at > akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > [flink-dist_2.12-1.13.1.jar:1.13.1] > Caused by: java.io.NotSerializableException: > org.apache.flink.runtime.resourcemanager.TaskManagerInfoWithSlots > at > java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) > ~[?:1.8.0_292] > at > java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) > ~[?:1.8.0_292] > at > org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:624) > ~[flink-dist_2.12-1.13.1.jar:1.13.1] > at > org.
Flink On Yarn HA 部署模式下Flink程序无法启动
您好:Flink程序部署在Yran上以Appliation Mode 模式启动的,在没有采用HA 模式的时候可以正常启动,配置了HA之后,启动异常,麻烦帮忙看下是什么原因导致的. HA 配置如下: high-availability: zookeeper high-availability.storageDir: hdfs://mycluster/flink/ha high-availability.zookeeper.quorum: zk-1:2181,zk-2:2181,zk-3:2181 high-availability.zookeeper.path.root: /flink high-availability.cluster-id: /flink_cluster 异常如下: 2021-08-17 10:24:18,938 INFO org.apache.flink.runtime.leaderelection.DefaultLeaderElectionService [] - Starting DefaultLeaderElectionService with ZooKeeperLeaderElectionDriver{leaderPath='/leader/resource_manager_lock'}. 2021-08-17 10:25:09,706 ERROR org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerDetailsHandler [] - Unhandled exception. org.apache.flink.runtime.rpc.akka.exceptions.AkkaRpcException: Failed to serialize the result for RPC call : requestTaskManagerDetailsInfo. at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:404) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$sendAsyncResponse$0(AkkaRpcActor.java:360) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at java.util.concurrent.CompletableFuture.uniHandle(CompletableFuture.java:836) ~[?:1.8.0_292] at java.util.concurrent.CompletableFuture.uniHandleStage(CompletableFuture.java:848) ~[?:1.8.0_292] at java.util.concurrent.CompletableFuture.handle(CompletableFuture.java:2168) ~[?:1.8.0_292] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.sendAsyncResponse(AkkaRpcActor.java:352) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:319) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at scala.PartialFunction.applyOrElse(PartialFunction.scala:123) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.actor.Actor.aroundReceive(Actor.scala:517) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.actor.Actor.aroundReceive$(Actor.scala:515) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.actor.ActorCell.invoke(ActorCell.scala:561) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.Mailbox.run(Mailbox.scala:225) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.Mailbox.exec(Mailbox.scala:235) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.12-1.13.1.jar:1.13.1] at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.12-1.13.1.jar:1.13.1] Caused by: java.io.NotSerializableException: org.apache.flink.runtime.resourcemanager.TaskManagerInfoWithSlots at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[?:1.8.0_292] at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) ~[?:1.8.0_292] at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:624) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcSerializedValue.valueOf(AkkaRpcSerializedValue.java:66) ~[flink-dist_2.12-1.13.1.jar:1.13.1] at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.serializeRemoteResultAndVerifySize(AkkaRpcActor.java:387) ~[flink-dist_2.12-1.13.1.jar:1.13.1] ... 29 mo