[ https://issues.apache.org/jira/browse/YARN-11284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17597483#comment-17597483 ]
ASF GitHub Bot commented on YARN-11284:
---------------------------------------

slfan1989 commented on code in PR #4814:
URL: https://github.com/apache/hadoop/pull/4814#discussion_r957953877


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedAMPoolManager.java:
##########
@@ -501,4 +472,51 @@ public Map<String, FinishApplicationMasterResponse> batchFinishApplicationMaster

     return responseMap;
   }
+
+  Runnable createForceFinishApplicationThread() {
+    return () -> {
+
+      ExecutorCompletionService<KillApplicationResponse> completionService =
+          new ExecutorCompletionService<>(threadpool);
+
+      // Save a local copy of the key set so that it won't change with the map
+      Set<String> addressList = new HashSet<>(unmanagedAppMasterMap.keySet());
+
+      LOG.warn("Abnormal shutdown of UAMPoolManager, still {} UAMs in map", addressList.size());
+
+      for (final String uamId : addressList) {
+        completionService.submit(() -> {
+          try {
+            ApplicationId appId = appIdMap.get(uamId);
+            LOG.info("Force-killing UAM id {} for application {}", uamId, appId);
+            return unmanagedAppMasterMap.remove(uamId).forceKillApplication();
+          } catch (Exception e) {
+            LOG.error("Failed to kill unmanaged application master", e);
+            return null;
+          }
+        });
+      }
+
+      for (int i = 0; i < addressList.size(); ++i) {
+        try {
+          Future<KillApplicationResponse> future = completionService.take();
+          future.get();

Review Comment:
   This part is unchanged from the original trunk code. Its job is to force-kill the applications, and it deliberately ignores whether each kill succeeds: every application has a timeout and will be killed once that timeout expires, so a failed force kill should have no lasting effect. That said, from a code-implementation perspective, we should still identify the status of each force kill.

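   As a rough illustration of what identifying the status of each force kill could look like, here is a minimal, self-contained sketch. It is not the code in this PR: `ForceKillStatusSketch`, `KillResult`, and `forceKillAll` are hypothetical names, and a plain `Callable<Boolean>` stands in for the real `forceKillApplication()` call so the example runs without Hadoop on the classpath. Only the `ExecutorCompletionService` pattern is taken from the diff above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch; a stand-in for the force-kill loop, not the PR's code. */
public class ForceKillStatusSketch {

  /** Outcome of one force-kill attempt (hypothetical type). */
  static final class KillResult {
    final String uamId;
    final boolean succeeded;

    KillResult(String uamId, boolean succeeded) {
      this.uamId = uamId;
      this.succeeded = succeeded;
    }
  }

  /**
   * Submits one kill task per UAM id, waits for all of them, and returns
   * the sorted ids whose kill returned false or threw an exception.
   */
  static List<String> forceKillAll(Map<String, Callable<Boolean>> killers,
      ExecutorService pool) throws InterruptedException {
    ExecutorCompletionService<KillResult> completionService =
        new ExecutorCompletionService<>(pool);

    for (Map.Entry<String, Callable<Boolean>> entry : killers.entrySet()) {
      final String uamId = entry.getKey();
      final Callable<Boolean> killer = entry.getValue();
      completionService.submit(() -> {
        try {
          // Carry the id with the result instead of returning null on failure.
          return new KillResult(uamId, Boolean.TRUE.equals(killer.call()));
        } catch (Exception e) {
          return new KillResult(uamId, false);
        }
      });
    }

    List<String> failed = new ArrayList<>();
    for (int i = 0; i < killers.size(); ++i) {
      try {
        KillResult result = completionService.take().get();
        if (!result.succeeded) {
          failed.add(result.uamId);
        }
      } catch (ExecutionException e) {
        // Only reachable if a task fails with an Error; the id is lost here.
        failed.add("<unknown>");
      }
    }
    Collections.sort(failed);
    return failed;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    Map<String, Callable<Boolean>> killers = new LinkedHashMap<>();
    killers.put("uam-1", () -> true);
    killers.put("uam-2", () -> {
      throw new IllegalStateException("RM unreachable");
    });

    // Wait on a separate thread so that the caller is not blocked.
    Thread reaper = new Thread(() -> {
      try {
        System.out.println("Failed force kills: " + forceKillAll(killers, pool));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    });
    reaper.start();
    reaper.join(); // joined here only so that the demo exits cleanly
    pool.shutdown();
  }
}
```

   Returning a small result object rather than `null` lets the waiting loop report which UAM ids failed, while the kill itself stays best-effort as described above.
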
> [Federation] Improve UnmanagedAMPoolManager WithoutBlock ServiceStop
> --------------------------------------------------------------------
>
>                 Key: YARN-11284
>                 URL: https://issues.apache.org/jira/browse/YARN-11284
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: federation
>    Affects Versions: 3.4.0
>            Reporter: fanshilun
>            Assignee: fanshilun
>            Priority: Major
>              Labels: pull-request-available
>
> There is a TODO in UnmanagedAMPoolManager#serviceStop:
> {code:java}
> TODO: move waiting for the kill to finish into a separate thread, without
> blocking the serviceStop. {code}
> I moved this work into a separate thread so that it no longer blocks
> serviceStop.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)