[ 
https://issues.apache.org/jira/browse/FLINK-22663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346599#comment-17346599
 ] 

Jinhong Liu edited comment on FLINK-22663 at 5/18/21, 5:53 AM:
---------------------------------------------------------------

[~fly_in_gis] 

Firstly, this issue occurs only when at least one TaskManager is running on 
the dead NodeManager.

Secondly, when the issue occurs, none of the containers, including the 
AppMaster, can exit, not only the containers on the dead NodeManager.

Finally, I found a configuration option that helps the containers exit 
quickly: _taskmanager.registration.timeout_, whose default value is 5 min. If 
I set it to 1 min, the containers exit after one minute, but the AppMaster 
still needs about 10 minutes to exit.
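
For reference, a minimal sketch of that setting, assuming it is placed in 
flink-conf.yaml (the file location is an assumption; the 1 min value is the 
one from the test above, not a recommendation):

{code:yaml}
# Assumed location: conf/flink-conf.yaml of the Flink distribution used to submit the job.
# Default is 5 min; a TaskManager that cannot register within this time exits.
taskmanager.registration.timeout: 1 min
{code}

This only shortens how long the TaskManager containers linger. It does not 
change the roughly 10 minutes the AppMaster takes to exit.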


> Release YARN resource very slow when cancel the job after some NodeManagers 
> shutdown
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-22663
>                 URL: https://issues.apache.org/jira/browse/FLINK-22663
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.12.2
>            Reporter: Jinhong Liu
>            Priority: Major
>              Labels: YARN
>
> While testing Flink on YARN, I found a case that may cause problems.
> Hadoop Version: 2.7.3
> Flink Version: 1.12.2
> I deploy a Flink job on YARN and, while the job is running, stop one 
> NodeManager. After one or two minutes, the job recovers automatically. But in 
> this situation, if I cancel the job, the containers are not released 
> immediately; some containers, including the app master, are still running. 
> About 5 minutes later these containers exit, and about 10 minutes later the 
> app master exits.
> I checked the app master log; it seems to be trying to stop the containers on 
> the NodeManager that I had already stopped.
> {code:java}
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph       [] - Job class 
> tv.freewheel.reporting.fastlane.Fastlane$ (da883ab39a7a82e4d45a3803bc77dd6f) 
> switched from state CANCELLING to CANCELED.
> 2021-05-14 06:15:17,389 INFO  
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator    [] - Stopping 
> checkpoint coordinator for job da883ab39a7a82e4d45a3803bc77dd6f.
> 2021-05-14 06:15:17,390 INFO  
> org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore [] - 
> Shutting down
> 2021-05-14 06:15:17,408 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher           [] - Job 
> da883ab39a7a82e4d45a3803bc77dd6f reached globally terminal state CANCELED.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher           [] - Shutting 
> down cluster with state CANCELED, jobCancelled: true, executionMode: DETACHED
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - Shutting 
> YarnJobClusterEntrypoint down with application status CANCELED. Diagnostics 
> null.
> 2021-05-14 06:15:17,409 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shutting 
> down rest endpoint.
> 2021-05-14 06:15:17,420 INFO  org.apache.flink.runtime.jobmaster.JobMaster    
>              [] - Stopping the JobMaster for job class 
> tv.freewheel.reporting.fastlane.Fastlane$(da883ab39a7a82e4d45a3803bc77dd6f).
> 2021-05-14 06:15:17,422 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Removing 
> cache directory 
> /tmp/flink-web-af72a00c-0ddd-4e5e-a62c-8244d6caa552/flink-web-ui
> 2021-05-14 06:15:17,432 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - 
> http://ip-10-23-19-197.ec2.internal:43811 lost leadership
> 2021-05-14 06:15:17,432 INFO  
> org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint [] - Shut down 
> complete.
> 2021-05-14 06:15:17,436 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Shut down cluster because application is in CANCELED, diagnostics null.
> 2021-05-14 06:15:17,436 INFO  org.apache.flink.yarn.YarnResourceManagerDriver 
>              [] - Unregister application from the YARN Resource Manager with 
> final status KILLED.
> 2021-05-14 06:15:17,458 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl     [] - Suspending 
> SlotPool.
> 2021-05-14 06:15:17,458 INFO  org.apache.flink.runtime.jobmaster.JobMaster    
>              [] - Close ResourceManager connection 
> 493862ba148679a4f16f7de5ffaef665: Stopping JobMaster for job class 
> tv.freewheel.reporting.fastlane.Fastlane$(da883ab39a7a82e4d45a3803bc77dd6f)..
> 2021-05-14 06:15:17,458 INFO  
> org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl     [] - Stopping 
> SlotPool.
> 2021-05-14 06:15:17,482 INFO  
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl        [] - Waiting for 
> application to be successfully unregistered.
> 2021-05-14 06:15:17,566 INFO  org.apache.flink.runtime.history.FsJobArchivist 
>              [] - Job da883ab39a7a82e4d45a3803bc77dd6f has been archived at 
> hdfs:/realtime/flink-archive/da883ab39a7a82e4d45a3803bc77dd6f.
> 2021-05-14 06:15:17,589 INFO  
> org.apache.flink.runtime.entrypoint.component.DispatcherResourceManagerComponent
>  [] - Closing components.
> 2021-05-14 06:15:17,590 INFO  
> org.apache.flink.runtime.dispatcher.runner.JobDispatcherLeaderProcess [] - 
> Stopping JobDispatcherLeaderProcess.
> 2021-05-14 06:15:17,590 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher           [] - Stopping 
> dispatcher 
> akka.tcp://flink@ip-10-23-19-197.ec2.internal:40340/user/rpc/dispatcher_1.
> 2021-05-14 06:15:17,590 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher           [] - Stopping 
> all currently running jobs of dispatcher 
> akka.tcp://flink@ip-10-23-19-197.ec2.internal:40340/user/rpc/dispatcher_1.
> 2021-05-14 06:15:17,591 INFO  
> org.apache.flink.runtime.rest.handler.legacy.backpressure.BackPressureRequestCoordinator
>  [] - Shutting down back pressure request coordinator.
> 2021-05-14 06:15:17,591 INFO  
> org.apache.flink.runtime.dispatcher.MiniDispatcher           [] - Stopped 
> dispatcher 
> akka.tcp://flink@ip-10-23-19-197.ec2.internal:40340/user/rpc/dispatcher_1.
> 2021-05-14 06:15:17,594 INFO  
> org.apache.flink.runtime.resourcemanager.active.ActiveResourceManager [] - 
> Disconnect job manager 
> 00000000000000000000000000000...@akka.tcp://flink@ip-10-23-19-197.ec2.internal:40340/user/rpc/jobmanager_2
>  for job da883ab39a7a82e4d45a3803bc77dd6f from the resource manager.
> 2021-05-14 06:15:17,600 INFO  
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl [] - 
> Interrupted while waiting for queue
> java.lang.InterruptedException: null
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2014)
>  ~[?:1.8.0_161]
>       at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2048)
>  ~[?:1.8.0_161]
>       at 
> java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442) 
> ~[?:1.8.0_161]
>       at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$CallbackHandlerThread.run(AMRMClientAsyncImpl.java:287)
>  [flink-shaded-hadoop-2-uber-2.7.5-7.0.jar:2.7.5-7.0]
> 2021-05-14 06:15:17,648 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:15:17,699 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:15:17,741 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:15:17,796 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:15:17,809 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:15:17,813 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:15:17,817 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:15:17,822 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:15:17,879 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:15:17,889 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-197.ec2.internal:44443
> 2021-05-14 06:15:17,898 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:15:17,903 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:15:17,907 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:15:17,911 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:15:17,960 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-25-241.ec2.internal:42723
> 2021-05-14 06:15:17,964 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-27-242.ec2.internal:43814] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:17,964 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-27-242.ec2.internal:38826] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,016 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-28-67.ec2.internal:45022] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,016 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-28-67.ec2.internal:40808] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,061 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-139.ec2.internal:33912] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,061 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-139.ec2.internal:44652] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,120 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-71.ec2.internal:42454] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,120 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-71.ec2.internal:41756] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,125 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-27-242.ec2.internal:36652] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,125 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-27-242.ec2.internal:37709] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,126 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-28-67.ec2.internal:40308] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,126 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-28-67.ec2.internal:34524] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,141 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-27-242.ec2.internal:44435] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,141 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-27-242.ec2.internal:37224] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,143 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-71.ec2.internal:38940] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,143 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-71.ec2.internal:33014] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,202 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-19-86.ec2.internal:39939] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,202 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-86.ec2.internal:35165] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,204 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-197.ec2.internal:45913] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,204 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-19-197.ec2.internal:36333] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,220 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-28-67.ec2.internal:35366] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,220 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-28-67.ec2.internal:45411] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,223 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-139.ec2.internal:34759] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,223 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-139.ec2.internal:42621] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,228 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-86.ec2.internal:40782] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,228 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-19-86.ec2.internal:36612] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,251 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-139.ec2.internal:38342] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:15:18,251 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-139.ec2.internal:36176] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:18:18,220 ERROR 
> org.apache.hadoop.yarn.client.api.impl.NMClientImpl          [] - Failed to 
> stop Container container_1620970870707_0001_01_000057when stopping 
> NMClientImpl
> 2021-05-14 06:18:18,240 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:18:18,295 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:18:18,299 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-25-241.ec2.internal:42723
> 2021-05-14 06:18:18,557 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-86.ec2.internal:44399] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:18:18,557 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-19-86.ec2.internal:33246] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:18:18,611 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-71.ec2.internal:39100] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:18:18,611 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-71.ec2.internal:35428] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,684 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-27-242.ec2.internal:41510] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,730 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-197.ec2.internal:39595] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,741 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-24-71.ec2.internal:46788] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,754 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-71.ec2.internal:46748] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,754 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink-metrics@ip-10-23-24-71.ec2.internal:34218] has failed, 
> address is now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:01,761 WARN  akka.remote.ReliableDeliverySupervisor          
>              [] - Association with remote system 
> [akka.tcp://flink@ip-10-23-19-86.ec2.internal:42730] has failed, address is 
> now gated for [50] ms. Reason: [Disassociated] 
> 2021-05-14 06:21:18,522 ERROR 
> org.apache.hadoop.yarn.client.api.impl.NMClientImpl          [] - Failed to 
> stop Container container_1620970870707_0001_01_000078when stopping 
> NMClientImpl
> 2021-05-14 06:21:18,567 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:21:18,571 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-197.ec2.internal:44443
> 2021-05-14 06:21:18,605 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:21:18,657 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:21:18,698 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:21:18,702 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:21:18,705 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-197.ec2.internal:44443
> 2021-05-14 06:21:18,707 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-25-241.ec2.internal:42723
> 2021-05-14 06:24:18,934 ERROR 
> org.apache.hadoop.yarn.client.api.impl.NMClientImpl          [] - Failed to 
> stop Container container_1620970870707_0001_01_000008when stopping 
> NMClientImpl
> 2021-05-14 06:24:18,986 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:24:19,036 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:24:19,077 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:24:19,080 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:24:19,083 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-25-241.ec2.internal:42723
> 2021-05-14 06:27:19,303 ERROR 
> org.apache.hadoop.yarn.client.api.impl.NMClientImpl          [] - Failed to 
> stop Container container_1620970870707_0001_01_000029when stopping 
> NMClientImpl
> 2021-05-14 06:27:19,349 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:27:19,353 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-28-67.ec2.internal:38916
> 2021-05-14 06:27:19,402 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:27:19,466 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:27:19,470 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:27:19,504 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:27:19,508 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-27-242.ec2.internal:41435
> 2021-05-14 06:27:19,510 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-197.ec2.internal:44443
> 2021-05-14 06:27:19,545 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:27:19,548 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-71.ec2.internal:44804
> 2021-05-14 06:27:19,551 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-86.ec2.internal:45099
> 2021-05-14 06:27:19,554 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-24-139.ec2.internal:42226
> 2021-05-14 06:27:19,557 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-19-197.ec2.internal:44443
> 2021-05-14 06:27:19,559 INFO  
> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy [] - 
> Opening proxy : ip-10-23-25-241.ec2.internal:42723
> 2021-05-14 06:27:50,793 INFO  
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint        [] - RECEIVED 
> SIGNAL 15: SIGTERM. Shutting down as requested.
> 2021-05-14 06:27:50,794 INFO  org.apache.flink.runtime.blob.BlobServer        
>              [] - Stopped BLOB server at 0.0.0.0:44447
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
