[ https://issues.apache.org/jira/browse/FLINK-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772172#comment-16772172 ]
William Cummings commented on FLINK-11552:
------------------------------------------

This is a "standalone" cluster on top of ec2, no k8s layer.

> Akka association issues in 1.7.x
> --------------------------------
>
> Key: FLINK-11552
> URL: https://issues.apache.org/jira/browse/FLINK-11552
> Project: Flink
> Issue Type: Bug
> Components: Cluster Management
> Affects Versions: 1.7.0, 1.7.1
> Reporter: William Cummings
> Priority: Critical
>
> When testing our application on 1.7.0 and 1.7.1, taskmanagers associate
> correctly, but when a job is submitted it enters the RUNNING state, but no
> work is ever done. In the jobmanager logs (w/ akka logging turned up to DEBUG
> & "akka.log.lifecycle.events: true") I can observe some akka errors.
> Eventually a taskmanager is lost, and the task fails.
> Please let me know if there is any additional information I can collect to
> help diagnose. If someone can point me in the right direction I'd be happy to
> implement a fix.
> I've attached the relevant logs below:
> {noformat}
> 2019-02-07 17:45:58,543 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:45:58,548 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:45:58,563 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:10,538 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:10,548 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:10,568 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:24,204 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:24,210 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:24,211 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:35,304 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:35,309 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:35,317 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:48,152 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:48,178 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:48,180 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:00,615 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:00,634 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:00,635 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:20,366 WARN
> org.apache.flink.runtime.rest.handler.legacy.metrics.MetricFetcher -
> Requesting TaskManager's path for query services failed.
> akka.pattern.AskTimeoutException: Ask timed out on
> [Actor[akka://flink/user/dispatcher#1335681479]] after [10000 ms].
> Sender[null] sent message of type
> "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
> at
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
> at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
> at
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
> at
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
> at
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
> at
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
> at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:47:20,367 INFO
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to
> Actor[akka://flink/deadLetters] was not delivered. [38] dead letters
> encountered. This logging can be turned off or adjusted with configuration
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:20,367 INFO
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to
> Actor[akka://flink/deadLetters] was not delivered. [39] dead letters
> encountered. This logging can be turned off or adjusted with configuration
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:21,058 INFO akka.actor.EmptyLocalActorRef
> - Message [org.apache.flink.types.SerializableOptional] from
> Actor[akka://flink/deadLetters] to Actor[akka://flink/temp/$8b] was not
> delivered. [40] dead letters encountered. This logging can be turned off or
> adjusted with configuration settings 'akka.log-dead-letters' and
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:26,910 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:26,927 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:26,952 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:36,616 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440]
> 2019-02-07 17:47:36,617 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:47:36,617 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed]
> 2019-02-07 17:47:49,921 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:49,933 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:49,949 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:05,517 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:05,520 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:05,520 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:16,266 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:16,273 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:16,280 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:26,316 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:26,317 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:26,342 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:36,736 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:36,739 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:36,771 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:48,014 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:48,018 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:48,038 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:57,129 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed]
> 2019-02-07 17:48:57,129 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440]
> 2019-02-07 17:48:57,129 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:49:12,894 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:12,899 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:13,005 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:25,532 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:25,532 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:25,559 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:39,784 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]]
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:39,786 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]]
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:39,790 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:54,111 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed,
> address is now gated for [50] ms. Reason: [Disassociated]
> 2019-02-07 17:49:54,976 WARN
> org.apache.flink.runtime.rest.handler.legacy.metrics.MetricFetcher -
> Requesting TaskManager's path for query services failed.
> akka.pattern.AskTimeoutException: Ask timed out on
> [Actor[akka://flink/user/dispatcher#1335681479]] after [10000 ms].
> Sender[null] sent message of type
> "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
> at
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
> at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
> at
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
> at
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
> at
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
> at
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
> at
> akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
> at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:49:54,977 INFO
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to
> Actor[akka://flink/deadLetters] was not delivered. [41] dead letters
> encountered. This logging can be turned off or adjusted with configuration
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:54,977 INFO
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef - Message
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to
> Actor[akka://flink/deadLetters] was not delivered. [42] dead letters
> encountered. This logging can be turned off or adjusted with configuration
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:55,037 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed,
> address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122]] Caused by:
> [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:55,037 INFO akka.remote.RemoteActorRef
> - Message
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation] from
> Actor[akka://flink/temp/$Mc] to
> Actor[akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122/user/taskmanager_0#1564625337]
> was not delivered. [43] dead letters encountered. This logging can be turned
> off or adjusted with configuration settings 'akka.log-dead-letters' and
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:55,668 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed,
> address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122]] Caused by:
> [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:49:55,669 INFO akka.remote.RemoteActorRef
> - Message
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation] from
> Actor[akka://flink/deadLetters] to
> Actor[akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122/user/taskmanager_0#1564625337]
> was not delivered. [44] dead letters encountered. This logging can be turned
> off or adjusted with configuration settings 'akka.log-dead-letters' and
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:57,343 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph -
> Item-DedupEvents-0 -> Item-TransformEventMapper-0 -> CheckUniqueMapper-0 ->
> CustomerToItemFlatMapper-0 (5/6) (684b98348707f04b1c1270874cd1d02c) switched
> from RUNNING to FAILED.
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException:
> Connection unexpectedly closed by remote task manager
> 'app-flink-taskmanager-0056cc1c18d1cff79/10.100.23.120:6125'. This might
> indicate that the remote task manager was lost.
> at
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:136)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
> at
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
> at
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
> at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:49:57,343 INFO
> org.apache.flink.runtime.executiongraph.ExecutionGraph - Job
> Mode-Item-AppJob (be9764bbcb7f535b0d0c8bc767c8f651) switched from state
> RUNNING to FAILING.
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException:
> Connection unexpectedly closed by remote task manager
> 'app-flink-taskmanager-0056cc1c18d1cff79/10.100.23.120:6125'. This might
> indicate that the remote task manager was lost.
> at
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:136)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
> at
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
> at
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
> at
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
> at
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}