Hi Victor,

Several related issues have been reported before [1] [2] [3], and I guess you have hit the same problem. It has been fixed in 1.8 [4]. Could you try a later version (1.8+)?

[1] https://issues.apache.org/jira/browse/FLINK-11137
[2] https://issues.apache.org/jira/browse/FLINK-11215
[3] https://issues.apache.org/jira/browse/FLINK-11708
[4] https://issues.apache.org/jira/browse/FLINK-11718

Thanks,
Biao /'bɪ.aʊ/


On Fri, Aug 9, 2019 at 4:01 PM Victor Wong <jiasheng.w...@outlook.com> wrote:

> Hi,
>
> I’m using Flink version *1.7.1*, and I encountered this exception which
> was a little weird from my point of view;
>
> TaskManager successfully registered at resource manager, however after 5
> minutes (which is the default value of taskmanager.registration.timeout
> config) it threw out RegistrationTimeoutException;
>
> Here is the related logs of TM:
>
> 2019-08-09 01:30:24,061 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting
> to ResourceManager akka.tcp://flink@xxx
> /user/resourcemanager(00000000000000000000000000000000).
>
> 2019-08-09 01:30:24,296 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor - Resolved
> ResourceManager address, beginning registration
>
> 2019-08-09 01:30:24,296 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor -
> Registration at ResourceManager attempt 1 (timeout=100ms)
>
> 2019-08-09 01:30:24,379 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor - *Successful
> registration at resource manager* akka.tcp://flink@xxx/user/resourcemanager
> under registration id 4535dea14648f6de68f32fb1a375806e.
>
> 2019-08-09 01:30:24,404 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor - Receive
> slot request AllocationID{372d1e10019c93c6c41d52b449cea5f2} for job
> e7b86795178efe43d7cac107c6cb8c33 from resource manager with leader id
> 00000000000000000000000000000000.
>
> …
>
> 2019-08-09 01:30:33,590 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor -
> Un-registering task and sending final execution state FINISHED to
> JobManager for task Source: xxxx ; // *I don’t know if this is related,
> so I add it here in case; This Flink Kafka source just finished because it
> consumed no Kafka partitions (Flink Kafka parallelism > Kafka topic
> partitions)*
>
> …
>
> 2019-08-09 01:35:24,753 ERROR
> org.apache.flink.runtime.taskexecutor.TaskExecutor - Fatal error
> occurred in TaskExecutor akka.tcp://flink@xxx/user/taskmanager_0.
>
> org.apache.flink.runtime.taskexecutor.exceptions.RegistrationTimeoutException:
> Could not register at the ResourceManager within the specified maximum
> registration duration 300000 ms. This indicates a problem with this
> instance. Terminating now.
>
>     at org.apache.flink.runtime.taskexecutor.TaskExecutor.registrationTimeout(TaskExecutor.java:1037)
>     at org.apache.flink.runtime.taskexecutor.TaskExecutor.lambda$startRegistrationTimeout$3(TaskExecutor.java:1023)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:332)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:158)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.onReceive(AkkaRpcActor.java:142)
>     at akka.actor.UntypedActor$$anonfun$receive$1.applyOrElse(UntypedActor.scala:165)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:502)
>     at akka.actor.UntypedActor.aroundReceive(UntypedActor.scala:95)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:495)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:224)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:234)
>     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Thanks,
>
> Victor
>
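
As a quick sanity check on the timing in the quoted TaskManager log, the sketch below computes how long after the "Connecting to ResourceManager" and "Successful registration" entries the fatal error was logged. The three timestamps and the 300000 ms value are copied from the log above. The variable and function names are only illustrative, and the script says nothing about the root cause. It only shows that the gap lines up with the registration timeout.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the TaskManager log lines quoted above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"

# Timestamps copied from the quoted log (names are illustrative).
connecting = datetime.strptime("2019-08-09 01:30:24,061", LOG_FORMAT)
registered = datetime.strptime("2019-08-09 01:30:24,379", LOG_FORMAT)
fatal_error = datetime.strptime("2019-08-09 01:35:24,753", LOG_FORMAT)

# Maximum registration duration reported in the exception message.
REGISTRATION_TIMEOUT_MS = 300_000


def elapsed_ms(start: datetime, end: datetime) -> int:
    """Whole milliseconds between two log timestamps."""
    return (end - start) // timedelta(milliseconds=1)


since_connecting_ms = elapsed_ms(connecting, fatal_error)
since_registered_ms = elapsed_ms(registered, fatal_error)

print(f"fatal error {since_connecting_ms} ms after connecting")
print(f"fatal error {since_registered_ms} ms after successful registration")
print(f"configured timeout: {REGISTRATION_TIMEOUT_MS} ms")
```

Both gaps are just over the 300000 ms timeout, which matches the description that the exception was thrown about 5 minutes after the TaskManager had already registered.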