[ 
https://issues.apache.org/jira/browse/FLINK-11552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772172#comment-16772172
 ] 

William Cummings commented on FLINK-11552:
------------------------------------------

This is a "standalone" cluster running directly on EC2; there is no Kubernetes layer.
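
To see which remote addresses the jobmanager fails to associate with, the quoted log can be tallied per address. The following is a minimal Python sketch, assuming the log is available as text with or without the "> " quote markers. The function name count_association_failures and the sample lines are illustrative only and are not part of Flink or Akka.

```python
import re
from collections import Counter

# Matches the first address in an Akka "Association with remote system [...] has failed"
# warning. Whitespace is matched loosely because the mail archive wraps long lines.
ASSOC_FAILED = re.compile(
    r"Association with remote system\s+\[(akka\.tcp://[^\]\s]+)\]\s+has\s+failed"
)


def count_association_failures(log_text):
    """Return a Counter mapping remote Akka address -> number of failed associations."""
    # Drop the "> " email quote marker (if present) and rejoin the wrapped lines.
    lines = [re.sub(r"^>\s?", "", line) for line in log_text.splitlines()]
    flat = " ".join(lines)
    return Counter(ASSOC_FAILED.findall(flat))


# Two entries copied from the quoted log, kept in their wrapped form.
sample = """\
> 2019-02-07 17:49:39,790 WARN  akka.remote.ReliableDeliverySupervisor
>               - Association with remote system
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]]
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:54,111 WARN  akka.remote.ReliableDeliverySupervisor
>               - Association with remote system
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed,
> address is now gated for [50] ms. Reason: [Disassociated]
"""

counts = count_association_failures(sample)
for address, n in counts.most_common():
    print(n, address)
```

Run against the full jobmanager log, a tally like this separates failures against the "flink-metrics" actor system from failures against the main "flink" actor system on port 6122.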

> Akka association issues in 1.7.x
> --------------------------------
>
>                 Key: FLINK-11552
>                 URL: https://issues.apache.org/jira/browse/FLINK-11552
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management
>    Affects Versions: 1.7.0, 1.7.1
>            Reporter: William Cummings
>            Priority: Critical
>
> When testing our application on 1.7.0 and 1.7.1, the taskmanagers associate 
> correctly and a submitted job enters the RUNNING state, but no work is ever 
> done. In the jobmanager logs (with Akka logging turned up to DEBUG and 
> "akka.log.lifecycle.events: true") I can observe some Akka errors. 
> Eventually a taskmanager is lost, and the task fails.
> Please let me know if there is any additional information I can collect to 
> help diagnose this. If someone can point me in the right direction, I'd be 
> happy to implement a fix.
> I've attached the relevant logs below:
> {noformat}
> 2019-02-07 17:45:58,543 WARN  akka.remote.ReliableDeliverySupervisor           
>              - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:45:58,548 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:45:58,563 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:10,538 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:10,548 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:10,568 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:24,204 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:24,210 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:24,211 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:35,304 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:35,309 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:35,317 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:46:48,152 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:46:48,178 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:46:48,180 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:00,615 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:00,634 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:00,635 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:20,366 WARN  
> org.apache.flink.runtime.rest.handler.legacy.metrics.MetricFetcher  - 
> Requesting TaskManager's path for query services failed.
> akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka://flink/user/dispatcher#1335681479]] after [10000 ms]. 
> Sender[null] sent message of type 
> "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
>     at 
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
>     at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
>     at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
>     at 
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
>     at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
>     at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
>     at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:47:20,367 INFO  
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef   - Message 
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to 
> Actor[akka://flink/deadLetters] was not delivered. [38] dead letters 
> encountered. This logging can be turned off or adjusted with configuration 
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:20,367 INFO  
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef   - Message 
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to 
> Actor[akka://flink/deadLetters] was not delivered. [39] dead letters 
> encountered. This logging can be turned off or adjusted with configuration 
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:21,058 INFO  akka.actor.EmptyLocalActorRef                   
>               - Message [org.apache.flink.types.SerializableOptional] from 
> Actor[akka://flink/deadLetters] to Actor[akka://flink/temp/$8b] was not 
> delivered. [40] dead letters encountered. This logging can be turned off or 
> adjusted with configuration settings 'akka.log-dead-letters' and 
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:47:26,910 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:47:26,927 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:26,952 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:36,616 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440]
> 2019-02-07 17:47:36,617 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:47:36,617 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed]
> 2019-02-07 17:47:49,921 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:47:49,933 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:47:49,949 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:05,517 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:05,520 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:05,520 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:16,266 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:16,273 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:16,280 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:26,316 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:26,317 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:26,342 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:36,736 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:36,739 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:36,771 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:48,014 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:48:48,018 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:48:48,038 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:48:57,129 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed]
> 2019-02-07 17:48:57,129 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440]
> 2019-02-07 17:48:57,129 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:49:12,894 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:12,899 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:13,005 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:25,532 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:25,532 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:25,559 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:39,784 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-065199ce40199b440:39271]] 
> Caused by: [app-flink-taskmanager-065199ce40199b440: unknown error]
> 2019-02-07 17:49:39,786 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-034a6a653e17966ed:46051]] 
> Caused by: [app-flink-taskmanager-034a6a653e17966ed: unknown error]
> 2019-02-07 17:49:39,790 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833] has 
> failed, address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink-metrics@app-flink-taskmanager-0056cc1c18d1cff79:41833]] 
> Caused by: [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:54,111 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated]
> 2019-02-07 17:49:54,976 WARN  
> org.apache.flink.runtime.rest.handler.legacy.metrics.MetricFetcher  - 
> Requesting TaskManager's path for query services failed.
> akka.pattern.AskTimeoutException: Ask timed out on 
> [Actor[akka://flink/user/dispatcher#1335681479]] after [10000 ms]. 
> Sender[null] sent message of type 
> "org.apache.flink.runtime.rpc.messages.LocalFencedMessage".
>     at 
> akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:604)
>     at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
>     at 
> scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:601)
>     at 
> scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
>     at 
> scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:599)
>     at 
> akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
>     at 
> akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
>     at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:49:54,977 INFO  
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef   - Message 
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to 
> Actor[akka://flink/deadLetters] was not delivered. [41] dead letters 
> encountered. This logging can be turned off or adjusted with configuration 
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:54,977 INFO  
> akka.remote.RemoteActorRefProvider$RemoteDeadLetterActorRef   - Message 
> [akka.actor.Status$Failure] from Actor[akka://flink/deadLetters] to 
> Actor[akka://flink/deadLetters] was not delivered. [42] dead letters 
> encountered. This logging can be turned off or adjusted with configuration 
> settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:55,037 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed, 
> address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122]] Caused by: 
> [app-flink-taskmanager-0056cc1c18d1cff79: unknown error]
> 2019-02-07 17:49:55,037 INFO  akka.remote.RemoteActorRef                      
>               - Message 
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation] from 
> Actor[akka://flink/temp/$Mc] to 
> Actor[akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122/user/taskmanager_0#1564625337]
>  was not delivered. [43] dead letters encountered. This logging can be turned 
> off or adjusted with configuration settings 'akka.log-dead-letters' and 
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:55,668 WARN  akka.remote.ReliableDeliverySupervisor          
>               - Association with remote system 
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122] has failed, 
> address is now gated for [50] ms. Reason: [Association failed with 
> [akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122]] Caused by: 
> [app-flink-taskmanager-0056cc1c18d1cff79]
> 2019-02-07 17:49:55,669 INFO  akka.remote.RemoteActorRef                      
>               - Message 
> [org.apache.flink.runtime.rpc.messages.RemoteRpcInvocation] from 
> Actor[akka://flink/deadLetters] to 
> Actor[akka.tcp://flink@app-flink-taskmanager-0056cc1c18d1cff79:6122/user/taskmanager_0#1564625337]
>  was not delivered. [44] dead letters encountered. This logging can be turned 
> off or adjusted with configuration settings 'akka.log-dead-letters' and 
> 'akka.log-dead-letters-during-shutdown'.
> 2019-02-07 17:49:57,343 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph        - 
> Item-DedupEvents-0 -> Item-TransformEventMapper-0 -> CheckUniqueMapper-0 -> 
> CustomerToItemFlatMapper-0 (5/6) (684b98348707f04b1c1270874cd1d02c) switched 
> from RUNNING to FAILED.
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connection unexpectedly closed by remote task manager 
> 'app-flink-taskmanager-0056cc1c18d1cff79/10.100.23.120:6125'. This might 
> indicate that the remote task manager was lost.
>     at 
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:136)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
>     at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
>     at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>     at java.lang.Thread.run(Thread.java:745)
> 2019-02-07 17:49:57,343 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job 
> Mode-Item-AppJob (be9764bbcb7f535b0d0c8bc767c8f651) switched from state 
> RUNNING to FAILING.
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connection unexpectedly closed by remote task manager 
> 'app-flink-taskmanager-0056cc1c18d1cff79/10.100.23.120:6125'. This might 
> indicate that the remote task manager was lost.
>     at 
> org.apache.flink.runtime.io.network.netty.CreditBasedPartitionRequestClientHandler.channelInactive(CreditBasedPartitionRequestClientHandler.java:136)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
>     at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:377)
>     at 
> org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:342)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1429)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:947)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:404)
>     at 
> org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
>     at 
> org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:884)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
