[jira] [Commented] (YARN-11196) NUMA Awareness support in DefaultContainerExecutor

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584658#comment-17584658
 ] 

ASF GitHub Bot commented on YARN-11196:
---

PrabhuJoseph commented on code in PR #4742:
URL: https://github.com/apache/hadoop/pull/4742#discussion_r954564319


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java:
##
@@ -372,16 +409,19 @@ public int relaunchContainer(ContainerStartContext ctx)
* as the current working directory for the command. If null,
* the current working directory is not modified.
* @param environment the container environment
+   * @param numaCommands list of prefix numa commands
* @return the new {@link ShellCommandExecutor}
* @see ShellCommandExecutor
*/
-  protected CommandExecutor buildCommandExecutor(String wrapperScriptPath, 
-  String containerIdStr, String user, Path pidFile, Resource resource,
-  File workDir, Map environment) {
-
+  protected CommandExecutor buildCommandExecutor(String wrapperScriptPath,
+String containerIdStr, String user, Path pidFile, Resource resource,
+File workDir, Map environment, String[] numaCommands) {
+
 String[] command = getRunCommand(wrapperScriptPath,
 containerIdStr, user, pidFile, this.getConf(), resource);
 
+command = concatStringCommands(command, numaCommands);

Review Comment:
   Shall we skip calling this when numaCommands is not passed?
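
   One way to read this suggestion is to make the concatenation a no-op when no NUMA commands are supplied, so the run command is returned untouched. The following is a minimal, self-contained sketch and not the PR's implementation: the class `NumaCommandConcat`, the body of `concatStringCommands`, and the sample command strings are all hypothetical, and treating an empty array like `null` is an assumption.

```java
import java.util.Arrays;

public class NumaCommandConcat {

  /**
   * Hypothetical helper: returns {@code first} followed by {@code second}.
   * A null or empty array on either side means "nothing to add", so the
   * other array is returned as-is and no copy is made.
   */
  public static String[] concatStringCommands(String[] first, String[] second) {
    if (second == null || second.length == 0) {
      return first;
    }
    if (first == null || first.length == 0) {
      return second;
    }
    String[] merged = Arrays.copyOf(first, first.length + second.length);
    System.arraycopy(second, 0, merged, first.length, second.length);
    return merged;
  }

  public static void main(String[] args) {
    // Sample values only; real commands come from getRunCommand / getNumaCommands.
    String[] run = {"bash", "/tmp/wrapper.sh"};
    String[] numa = {"/usr/bin/numactl", "--interleave=0", "--cpunodebind=0"};
    System.out.println(Arrays.toString(concatStringCommands(run, numa)));
    System.out.println(Arrays.toString(concatStringCommands(run, null)));
  }
}
```

   With a guard like this inside the helper, the call site can stay unconditional; the alternative is an explicit `if (numaCommands != null)` around the call.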





> NUMA Awareness support in DefaultContainerExecutor
> --
>
> Key: YARN-11196
> URL: https://issues.apache.org/jira/browse/YARN-11196
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.3.3
>Reporter: Prabhu Joseph
>Assignee: Samrat Deb
>Priority: Major
>  Labels: pull-request-available
>
> [YARN-5764|https://issues.apache.org/jira/browse/YARN-5764] added NUMA 
> awareness support for containers launched through LinuxContainerExecutor. 
> This feature would be useful in DefaultContainerExecutor as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-11196) NUMA Awareness support in DefaultContainerExecutor

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584657#comment-17584657
 ] 

ASF GitHub Bot commented on YARN-11196:
---

PrabhuJoseph commented on code in PR #4742:
URL: https://github.com/apache/hadoop/pull/4742#discussion_r954562550


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/TestDefaultContainerExecutor.java:
##
@@ -736,4 +755,197 @@ public void testPickDirectory() throws Exception {
 //new FsPermission(ApplicationLocalizer.LOGDIR_PERM), true);
 //  }
 
+  @Before
+  public void setUp() throws IOException, YarnException {
+yarnConfiguration = new YarnConfiguration();
+setNumaConfig();
+Context mockContext = createAndGetMockContext();
+NMStateStoreService nmStateStoreService =
+mock(NMStateStoreService.class);
+when(mockContext.getNMStateStore()).thenReturn(nmStateStoreService);
+numaResourceAllocator = new NumaResourceAllocator(mockContext) {
+  @Override
+  public String executeNGetCmdOutput(Configuration config)
+  throws YarnRuntimeException {
+return getNumaCmdOutput();
+  }
+};
+
+numaResourceAllocator.init(yarnConfiguration);
+FileContext lfs = FileContext.getLocalFSFileContext();
+containerExecutor = new DefaultContainerExecutor(lfs) {
+  @Override
+  public Configuration getConf() {
+return yarnConfiguration;
+  }
+};
+containerExecutor.setNumaResourceAllocator(numaResourceAllocator);
+mockContainer = mock(Container.class);
+  }
+
+  private void setNumaConfig() {
+yarnConfiguration.set(YarnConfiguration.NM_NUMA_AWARENESS_ENABLED, "true");
+yarnConfiguration.set(YarnConfiguration.NM_NUMA_AWARENESS_READ_TOPOLOGY, "true");
+yarnConfiguration.set(YarnConfiguration.NM_NUMA_AWARENESS_NUMACTL_CMD, "/usr/bin/numactl");
+  }
+
+
+  private String getNumaCmdOutput() {
+// architecture of 8 cpu cores
+// randomly picked size of memory
+return "available: 2 nodes (0-1)\n\t"
++ "node 0 cpus: 0 2 4 6\n\t"
++ "node 0 size: 73717 MB\n\t"
++ "node 0 free: 73717 MB\n\t"
++ "node 1 cpus: 1 3 5 7\n\t"
++ "node 1 size: 73717 MB\n\t"
++ "node 1 free: 73717 MB\n\t"
++ "node distances:\n\t"
++ "node 0 1\n\t"
++ "0: 10 20\n\t"
++ "1: 20 10";
+  }
+
+  private Context createAndGetMockContext() {
+Context mockContext = mock(Context.class);
+@SuppressWarnings("unchecked")
+ConcurrentHashMap mockContainers = mock(
+ConcurrentHashMap.class);
+mockContainer = mock(Container.class);
+when(mockContainer.getResourceMappings())
+.thenReturn(new ResourceMappings());
+when(mockContainers.get(any())).thenReturn(mockContainer);
+when(mockContext.getContainers()).thenReturn(mockContainers);
+when(mockContainer.getResource()).thenReturn(Resource.newInstance(2048, 2));
+return mockContext;
+  }
+
+  private void testAllocateNumaResource(String containerId, Resource resource,
+String memNodes, String cpuNodes) throws Exception {
+when(mockContainer.getContainerId())
+.thenReturn(ContainerId.fromString(containerId));
+when(mockContainer.getResource()).thenReturn(resource);
+NumaResourceAllocation numaResourceAllocation =
+numaResourceAllocator.allocateNumaNodes(mockContainer);
+String[] commands = containerExecutor.getNumaCommands(numaResourceAllocation);
+assertEquals(Arrays.asList(commands), Arrays.asList("/usr/bin/numactl",
+"--interleave=" + memNodes, "--cpunodebind=" + cpuNodes));
+  }
+
+  @Test
+  public void testAllocateNumaMemoryResource() throws Exception {
+// keeping cores constant for testing memory resources
+
+// allocates node 0 for memory and cpu
+testAllocateNumaResource("container_1481156246874_0001_01_01",
+Resource.newInstance(2048, 2), "0", "0");
+
+// allocates node 1 for memory and cpu since allocator uses round robin assignment
+testAllocateNumaResource("container_1481156246874_0001_01_02",
+Resource.newInstance(6, 2), "1", "1");
+
+// allocates node 0,1 for memory since there is no sufficient memory in any one node
+testAllocateNumaResource("container_1481156246874_0001_01_03",
+Resource.newInstance(8, 2), "0,1", "0");
+
+// returns null since there are no sufficient resources available for the request
+when(mockContainer.getContainerId()).thenReturn(
+ContainerId.fromString("container_1481156246874_0001_01_04"));
+when(mockContainer.getResource())
+.thenReturn(Resource.newInstance(8, 2));
+Assert.assertNull(numa

[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584653#comment-17584653
 ] 

ASF GitHub Bot commented on YARN-9708:
--

slfan1989 commented on code in PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#discussion_r954557599


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/store/impl/MemoryFederationStateStore.java:
##
@@ -395,4 +535,17 @@ public DeleteReservationHomeSubClusterResponse deleteReservationHomeSubCluster(
 reservations.remove(reservationId);
 return DeleteReservationHomeSubClusterResponse.newInstance();
   }
-}
+
+  /**
+   * Get DelegationKey By based on MasterKey.
+   *
+   * @param masterKey masterKey
+   * @return DelegationKey
+   */
+  private DelegationKey getDelegationKeyByMasterKey(RouterMasterKey masterKey) {

Review Comment:
   I will fix it.





> Yarn Router Support DelegationToken
> ---
>
> Key: YARN-9708
> URL: https://issues.apache.org/jira/browse/YARN-9708
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: router
>Affects Versions: 3.1.1
>Reporter: Xie YiFan
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Attachments: Add_getDelegationToken_and_SecureLogin_in_router.patch, 
> RMDelegationTokenSecretManager_storeNewMasterKey.svg, 
> RouterDelegationTokenSecretManager_storeNewMasterKey.svg
>
>
> 1. We use the router as a proxy to manage multiple clusters that are 
> independent of each other, in order to provide a unified client. Thus, we 
> implemented a customized AMRMProxyPolicy that doesn't broadcast 
> ResourceRequests to other clusters.
> 2. Our production environment needs Kerberos, but the router doesn't support 
> SecureLogin for now.
> https://issues.apache.org/jira/browse/YARN-6539 doesn't work, so we 
> improved it.
> 3. Some frameworks, like Oozie, get a token via yarnclient#getDelegationToken, 
> which the router doesn't support. Our solution is to add homeCluster to 
> ApplicationSubmissionContextProto & GetDelegationTokenRequestProto. A job is 
> submitted with a specified clusterid so that the router knows which cluster to 
> submit it to. When a client calls getDelegation, the router gets a token from 
> one RM according to the specified clusterid, and applies some mechanism to 
> save this token in memory.
>  






[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584652#comment-17584652
 ] 

ASF GitHub Bot commented on YARN-9708:
--

slfan1989 commented on code in PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#discussion_r954556809


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/impl/pb/package-info.java:
##
@@ -0,0 +1,20 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+@InterfaceAudience.Public
+package org.apache.hadoop.yarn.security.client.impl.pb;
+import org.apache.hadoop.classification.InterfaceAudience;

Review Comment:
   I will modify the code.








[jira] [Commented] (YARN-11196) NUMA Awareness support in DefaultContainerExecutor

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584651#comment-17584651
 ] 

ASF GitHub Bot commented on YARN-11196:
---

PrabhuJoseph commented on code in PR #4742:
URL: https://github.com/apache/hadoop/pull/4742#discussion_r954556722


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java:
##
@@ -1040,4 +1080,92 @@ public void updateYarnSysFS(Context ctx, String user,
   String appId, String spec) throws IOException {
 throw new ServiceStateException("Implementation unavailable");
   }
+
+  @Override
+  public int reacquireContainer(ContainerReacquisitionContext ctx)
+  throws IOException, InterruptedException {
+try {
+  if (numaResourceAllocator != null) {
+numaResourceAllocator.recoverNumaResource(ctx.getContainerId());
+  }
+  return super.reacquireContainer(ctx);
+} finally {
+  postComplete(ctx.getContainerId());
+}
+  }
+
+  /**
+   * clean up and release of resources.
+   *
+   * @param containerId containerId of running container
+   */
+  public void postComplete(final ContainerId containerId) {
+if (numaResourceAllocator != null) {
+  try {
+numaResourceAllocator.releaseNumaResource(containerId);
+  } catch (ResourceHandlerException e) {
+LOG.warn("NumaResource release failed for " +
+"containerId: {}. Exception: ", containerId, e);
+  }
+}
+  }
+
+  /**
+   * @param resourceAllocation NonNull NumaResourceAllocation object reference
+   * @return Array of numa specific commands
+   */
+  String[] getNumaCommands(NumaResourceAllocation resourceAllocation) {
+String[] numaCommand = new String[3];
+numaCommand[0] = this.getConf().get(YarnConfiguration.NM_NUMA_AWARENESS_NUMACTL_CMD,

Review Comment:
   It would be better to read this config once and initialize it in init.
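
   The idea can be sketched as follows: the numactl path is looked up once at initialization and cached in a field, and the command prefix is then built from the cached value. This is only an illustration under stated assumptions. A plain `Map` stands in for Hadoop's `Configuration`, the class `NumaCommandBuilder` and the key `example.numactl.cmd` are hypothetical, and the fallback path is simply the value used by the test setup in this PR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class NumaCommandBuilder {

  // Hypothetical key; the real constant lives in YarnConfiguration.
  public static final String NUMACTL_CMD_KEY = "example.numactl.cmd";
  // Assumed fallback, taken from the test configuration in this PR.
  public static final String NUMACTL_CMD_DEFAULT = "/usr/bin/numactl";

  private String numactl;

  /** Reads the numactl path once, at initialization time. */
  public void init(Map<String, String> conf) {
    this.numactl = conf.getOrDefault(NUMACTL_CMD_KEY, NUMACTL_CMD_DEFAULT);
  }

  /** Builds the three-element command prefix from the cached path. */
  public String[] getNumaCommands(String memNodes, String cpuNodes) {
    return new String[] {
        numactl, "--interleave=" + memNodes, "--cpunodebind=" + cpuNodes};
  }

  public static void main(String[] args) {
    NumaCommandBuilder builder = new NumaCommandBuilder();
    builder.init(new HashMap<>());
    System.out.println(Arrays.toString(builder.getNumaCommands("0,1", "0")));
  }
}
```

   Caching the value avoids a configuration lookup on every container launch; the trade-off is that a later configuration change is not picked up without re-initialization.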








[jira] [Commented] (YARN-11253) Add Configuration to delegationToken RemoverScanInterval

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584643#comment-17584643
 ] 

ASF GitHub Bot commented on YARN-11253:
---

slfan1989 commented on code in PR #4751:
URL: https://github.com/apache/hadoop/pull/4751#discussion_r954542677


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml:
##
@@ -1077,6 +1077,14 @@
     <value>8640</value>
   </property>
 
+  <property>
+    <description>
+      RM delegation token remove-scan interval in ms
+    </description>
+    <name>yarn.resourcemanager.delegation.token.remove-scan-interval</name>
+    <value>360</value>

Review Comment:
   I will modify the code.





> Add Configuration to delegationToken RemoverScanInterval
> 
>
> Key: YARN-11253
> URL: https://issues.apache.org/jira/browse/YARN-11253
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0, 3.3.4
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>
> While reading the code, I found a hard-coded value; I think the 
> parameter should be extracted into the configuration.
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService#createRMDelegationTokenSecretManager
> {code:java}
> protected RMDelegationTokenSecretManager createRMDelegationTokenSecretManager(
>     Configuration conf, RMContext rmContext) {
>   // . 360 This hard-coded value should be extracted
>   return new RMDelegationTokenSecretManager(secretKeyInterval,
>       tokenMaxLifetime, tokenRenewInterval, 360, rmContext);
> }
> {code}






[jira] [Commented] (YARN-11253) Add Configuration to delegationToken RemoverScanInterval

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584642#comment-17584642
 ] 

ASF GitHub Bot commented on YARN-11253:
---

slfan1989 commented on code in PR #4751:
URL: https://github.com/apache/hadoop/pull/4751#discussion_r954542145


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMSecretManagerService.java:
##
@@ -135,9 +135,11 @@ protected RMDelegationTokenSecretManager createRMDelegationTokenSecretManager(
 long tokenRenewInterval =
 conf.getLong(YarnConfiguration.RM_DELEGATION_TOKEN_RENEW_INTERVAL_KEY,
 YarnConfiguration.RM_DELEGATION_TOKEN_RENEW_INTERVAL_DEFAULT);
-
+long removeScanInterval =
+conf.getLong(YarnConfiguration.RM_DELEGATION_TOKEN_REMOVE_SCAN_INTERVAL_KEY,

Review Comment:
   Thanks for your suggestion, I will modify the code.
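
   For readers following the thread, the pattern in the hunk above (a configuration lookup with a default, replacing a hard-coded literal) can be sketched without Hadoop on the classpath. This is an illustration only. `java.util.Properties` stands in for `Configuration`, the class `RemoveScanIntervalConfig` is hypothetical, and `DEFAULT_MS` is a placeholder rather than the default proposed in the PR. The key string is the one shown in the yarn-default.xml hunk of this PR.

```java
import java.util.Properties;

public class RemoveScanIntervalConfig {

  // Key as it appears in the yarn-default.xml hunk of this PR.
  public static final String KEY =
      "yarn.resourcemanager.delegation.token.remove-scan-interval";
  // Placeholder default for illustration; not the value chosen in the PR.
  public static final long DEFAULT_MS = 60_000L;

  /** Returns the configured interval in ms, or the default when unset or blank. */
  public static long getRemoveScanInterval(Properties conf) {
    String raw = conf.getProperty(KEY);
    if (raw == null || raw.trim().isEmpty()) {
      return DEFAULT_MS;
    }
    return Long.parseLong(raw.trim());
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    System.out.println(getRemoveScanInterval(conf));
    conf.setProperty(KEY, "120000");
    System.out.println(getRemoveScanInterval(conf));
  }
}
```

   The value returned by such a lookup would then be passed to the secret manager constructor in place of the literal.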








[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584641#comment-17584641
 ] 

ASF GitHub Bot commented on YARN-11277:
---

hadoop-yetus commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1226818007

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 46s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 39s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 48s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 40s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 12s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 33s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 33s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 37s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 13s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m  7s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 31s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 48s |  |  hadoop-yarn-project/hadoop-yarn: The patch generated 0 new + 187 unchanged - 34 fixed = 187 total (was 221)  |
   | +1 :green_heart: |  mvnsite  |   3m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 19s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m  3s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 19s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 58s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 21s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 11s |  |  hadoop-yarn-common in the patch passed.  |
   | +1 :green_heart: |  unit  |  24m 34s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 198m 23s |  |  |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4797 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 98eddf566c8d 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6ac41d5e94d64cf61e7a95f84c0ab1bee7d25997 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
 

[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584636#comment-17584636
 ] 

ASF GitHub Bot commented on YARN-11277:
---

hadoop-yetus commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1226812566

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  1s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 27s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 11s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 46s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 37s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  4s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 11s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 45s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 24s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 55s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 32s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  4s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m  6s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m  6s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 32s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 32s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 53s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/3/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 187 unchanged - 34 fixed = 189 total (was 221)  |
   | +1 :green_heart: |  mvnsite  |   3m 23s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 11s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 58s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 38s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 37s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 19s |  |  hadoop-yarn-common in the patch passed.  |
   | +1 :green_heart: |  unit  |  24m 37s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 14s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 196m 57s |  |  |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4797 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux c866605e3a73 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6a8abd25903dd1290fb5c816a457d13d21240211 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584632#comment-17584632
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954528905


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/clientrm/TestFederationClientInterceptor.java:
##
@@ -1254,4 +1275,250 @@ public void testNodesToAttributes() throws Exception {
 NodeAttributeType.STRING, "nvida");
 Assert.assertTrue(nodeAttributeMap.get("0-host1").contains(gpu));
   }
+
+  @Test
+  public void testGetNewReservation() throws Exception {
+LOG.info("Test FederationClientInterceptor : Get NewReservation request.");
+
+// null request
+LambdaTestUtils.intercept(YarnException.class,
+"Missing getNewReservation request.", () -> interceptor.getNewReservation(null));
+
+// normal request
+GetNewReservationRequest request = GetNewReservationRequest.newInstance();
+GetNewReservationResponse response = interceptor.getNewReservation(request);
+Assert.assertNotNull(response);
+
+ReservationId reservationId = response.getReservationId();
+Assert.assertNotNull(reservationId);
+Assert.assertTrue(reservationId.toString().contains("reservation"));
+Assert.assertEquals(reservationId.getClusterTimestamp(), ResourceManager.getClusterTimeStamp());
+  }
+
+  @Test
+  public void testSubmitReservation() throws Exception {
+LOG.info("Test FederationClientInterceptor : SubmitReservation request.");
+
+// get new reservationId
+GetNewReservationRequest request = GetNewReservationRequest.newInstance();
+GetNewReservationResponse response = interceptor.getNewReservation(request);
+Assert.assertNotNull(response);
+
+// allow plan follower to synchronize, manually trigger an assignment
+Map mockRMs = interceptor.getMockRMs();
+for (MockRM mockRM : mockRMs.values()) {
+  ReservationSystem reservationSystem = mockRM.getReservationSystem();
+  reservationSystem.synchronizePlan("root.decided", true);
+}
+
+// Submit Reservation
+ReservationId reservationId = response.getReservationId();
+ReservationDefinition rDefinition = createReservationDefinition(1024, 1);
+ReservationSubmissionRequest rSubmissionRequest = ReservationSubmissionRequest.newInstance(
+rDefinition, "decided", reservationId);
+
+ReservationSubmissionResponse submissionResponse =
+interceptor.submitReservation(rSubmissionRequest);
+Assert.assertNotNull(submissionResponse);
+
+SubClusterId subClusterId = stateStoreUtil.queryReservationHomeSC(reservationId);
+Assert.assertNotNull(subClusterId);
+Assert.assertTrue(subClusters.contains(subClusterId));
+  }
+
+  @Test
+  public void testSubmitReservationEmptyRequest() throws Exception {
+LOG.info("Test FederationClientInterceptor : SubmitReservation request empty.");
+
+// null request1
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(null));
+
+// null request2
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(
+ReservationSubmissionRequest.newInstance(null, null, null)));
+
+// null request3
+ReservationSubmissionRequest request3 =
+ReservationSubmissionRequest.newInstance(null, "q1", null);
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(request3));
+
+// null request4
+ReservationId reservationId = ReservationId.newInstance(Time.now(), 1);
+ReservationSubmissionRequest request4 =
+ReservationSubmissionRequest.newInstance(null, null,  reservationId);
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(request4));
+
+// null request5
+long defaultDuration = 60;
+long arrival = Time.now();
+long deadline = arrival + (int)(defaultDuration * 1.1);
+
+ReservationRequest rRequest = ReservationRequest.newInstance(
+Resource.newInstance(1024, 1), 1, 1, defaultDuration);
+ReservationRequest[] rRequests = new ReservationRequest[] {rRequest};
+ReservationDefinition rDefinition = createReservationDefinition(arrival, deadline, rRequests,
+ReservationRequestInterpreter.R_ALL, "u1");
+ReservationSubmissionRequest reque

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584628#comment-17584628
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954527995


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/clientrm/TestFederationClientInterceptor.java:
##
@@ -1254,4 +1275,250 @@ public void testNodesToAttributes() throws Exception {
 NodeAttributeType.STRING, "nvida");
 Assert.assertTrue(nodeAttributeMap.get("0-host1").contains(gpu));
   }
+
+  @Test
+  public void testGetNewReservation() throws Exception {
+LOG.info("Test FederationClientInterceptor : Get NewReservation request.");
+
+// null request
+LambdaTestUtils.intercept(YarnException.class,
+"Missing getNewReservation request.", () -> interceptor.getNewReservation(null));
+
+// normal request
+GetNewReservationRequest request = GetNewReservationRequest.newInstance();
+GetNewReservationResponse response = interceptor.getNewReservation(request);
+Assert.assertNotNull(response);
+
+ReservationId reservationId = response.getReservationId();
+Assert.assertNotNull(reservationId);
+Assert.assertTrue(reservationId.toString().contains("reservation"));
+Assert.assertEquals(reservationId.getClusterTimestamp(), ResourceManager.getClusterTimeStamp());
+  }
+
+  @Test
+  public void testSubmitReservation() throws Exception {
+LOG.info("Test FederationClientInterceptor : SubmitReservation request.");
+
+// get new reservationId
+GetNewReservationRequest request = GetNewReservationRequest.newInstance();
+GetNewReservationResponse response = interceptor.getNewReservation(request);
+Assert.assertNotNull(response);
+
+// allow plan follower to synchronize, manually trigger an assignment
+Map mockRMs = interceptor.getMockRMs();
+for (MockRM mockRM : mockRMs.values()) {
+  ReservationSystem reservationSystem = mockRM.getReservationSystem();
+  reservationSystem.synchronizePlan("root.decided", true);
+}
+
+// Submit Reservation
+ReservationId reservationId = response.getReservationId();
+ReservationDefinition rDefinition = createReservationDefinition(1024, 1);
+ReservationSubmissionRequest rSubmissionRequest = ReservationSubmissionRequest.newInstance(
+rDefinition, "decided", reservationId);
+
+ReservationSubmissionResponse submissionResponse =
+interceptor.submitReservation(rSubmissionRequest);
+Assert.assertNotNull(submissionResponse);
+
+SubClusterId subClusterId = stateStoreUtil.queryReservationHomeSC(reservationId);
+Assert.assertNotNull(subClusterId);
+Assert.assertTrue(subClusters.contains(subClusterId));
+  }
+
+  @Test
+  public void testSubmitReservationEmptyRequest() throws Exception {
+LOG.info("Test FederationClientInterceptor : SubmitReservation request empty.");
+
+// null request1
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(null));
+
+// null request2
+LambdaTestUtils.intercept(YarnException.class,
+"Missing submitReservation request or reservationId or reservation definition or queue.",
+() -> interceptor.submitReservation(
+ReservationSubmissionRequest.newInstance(null, null, null)));

Review Comment:
   I will fix it.
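
For readers following the quoted tests: each null-request case runs a call that must throw, then checks the exception type and that the message contains the expected text. Below is a minimal, self-contained sketch of that pattern. The `intercept` helper and the `getNewReservation` stub are hypothetical stand-ins written for this note; they are not Hadoop's `LambdaTestUtils` or the Router interceptor, and `IllegalArgumentException` stands in for `YarnException`.

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for an "intercept"-style test helper (not Hadoop code).
final class InterceptSketch {

  /** Runs the call and returns its exception if the type and message text match. */
  static <E extends Exception> E intercept(
      Class<E> type, String contained, Callable<?> call) {
    try {
      Object result = call.call();
      // Reaching this line means nothing was thrown, which is a test failure.
      throw new AssertionError(
          "Expected " + type.getSimpleName() + " but call returned " + result);
    } catch (Exception e) {
      if (!type.isInstance(e)) {
        throw new AssertionError("Unexpected exception type: " + e, e);
      }
      String message = String.valueOf(e.getMessage());
      if (!message.contains(contained)) {
        throw new AssertionError(
            "Message \"" + message + "\" does not contain \"" + contained + "\"", e);
      }
      return type.cast(e);
    }
  }

  // Hypothetical method under test: rejects a null request with a fixed message.
  static String getNewReservation(Object request) {
    if (request == null) {
      throw new IllegalArgumentException("Missing getNewReservation request.");
    }
    return "reservation_1";
  }
}
```

Returning the caught exception lets a test make further assertions on it, for example on its cause.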





> Support getNewReservation, submitReservation, updateReservation, 
> deleteReservation API's for Federation
> ---
>
> Key: YARN-11177
> URL: https://issues.apache.org/jira/browse/YARN-11177
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584625#comment-17584625
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954525110


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {

Review Comment:
   I will fix it.












[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584624#comment-17584624
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954524957


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -888,13 +890,127 @@ public MoveApplicationAcrossQueuesResponse moveApplicationAcrossQueues(
   @Override
   public GetNewReservationResponse getNewReservation(
   GetNewReservationRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null) {
+  routerMetrics.incrGetNewReservationFailedRetrieved();
+  String errMsg = "Missing getNewReservation request.";
+  RouterServerUtil.logAndThrowException(errMsg, null);
+}
+
+long startTime = clock.getTime();
+Map subClustersActive =
+federationFacade.getSubClusters(true);
+
+for (int i = 0; i < numSubmitRetries; ++i) {
+  SubClusterId subClusterId = getRandomActiveSubCluster(subClustersActive);
+  LOG.info("getNewReservation try #{} on SubCluster {}.", i, subClusterId);
+  ApplicationClientProtocol clientRMProxy = getClientRMProxyForSubCluster(subClusterId);
+  GetNewReservationResponse response = null;
+  try {
+response = clientRMProxy.getNewReservation(request);
+if (response != null) {
+  long stopTime = clock.getTime();
+  routerMetrics.succeededGetNewReservationRetrieved(stopTime - startTime);
+  return response;
+}
+  } catch (Exception e) {
+LOG.warn("Unable to create a new Reservation in SubCluster {}.", subClusterId.getId(), e);
+subClustersActive.remove(subClusterId);
+  }
+}
+
+routerMetrics.incrGetNewReservationFailedRetrieved();
+String errMsg = "Failed to create a new reservation.";
+throw new YarnException(errMsg);
   }
 
   @Override
   public ReservationSubmissionResponse submitReservation(
   ReservationSubmissionRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null || request.getQueue() == null) {
+  routerMetrics.incrSubmitReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing submitReservation request or reservationId " +
+   "or reservation definition or queue.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+
+long retryCount = 0;
+boolean firstRetry = true;
+
+while (retryCount < numSubmitRetries) {
+
+  SubClusterId subClusterId = policyFacade.getReservationHomeSubCluster(request);
+  LOG.info("submitReservation reservationId {} try #{} on SubCluster {}.",
+  reservationId, retryCount, subClusterId);
+
+  ReservationHomeSubCluster reservationHomeSubCluster =
+  ReservationHomeSubCluster.newInstance(reservationId, subClusterId);
+
+  // If it is the first attempt,use StateStore to add the
+  // mapping of reservationId and subClusterId.
+  // if the number of attempts is greater than 1, use StateStore to update the mapping.
+  if (firstRetry) {
+try {
+  // persist the mapping of reservationId and the subClusterId which has
+  // been selected as its home
+  subClusterId = federationFacade.addReservationHomeSubCluster(reservationHomeSubCluster);
+  firstRetry = false;
+} catch (YarnException e) {
+  routerMetrics.incrSubmitReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(e,
+  "Unable to insert the ReservationId %s into the FederationStateStore.",
+   reservationId);
+}
+  } else {
+try {
+  // update the mapping of reservationId and the home subClusterId to
+  // the new subClusterId we have selected
+  federationFacade.updateReservationHomeSubCluster(reservationHomeSubCluster);
+} catch (YarnException e) {
+  SubClusterId subClusterIdInStateStore =
+  federationFacade.getReservationHomeSubCluster(reservationId);
+  if (subClusterId == subClusterIdInStateStore) {
+LOG.info("Reservation {} already submitted on SubCluster {}.",
+reservationId, subClusterId);
+  } else {
+routerMetrics.incrSubmitReservationFailedRetrieved();
+RouterServerUtil.logAndThrowException(e,
+"Unable to update the ReservationId %s into 

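A note on the `subClusterId == subClusterIdInStateStore` check in the submitReservation hunk quoted in this thread: in Java, `==` on object references tests identity, not value equality. Whether the two `SubClusterId` values there can be distinct instances with the same id is not visible from this thread, so the following is only a sketch of the general concern. `ClusterIdSketch` and `sameCluster` are hypothetical names, not YARN classes.

```java
import java.util.Objects;

// Hypothetical value type standing in for an identifier such as SubClusterId.
final class ClusterIdSketch {
  private final String id;

  ClusterIdSketch(String id) {
    this.id = Objects.requireNonNull(id);
  }

  @Override
  public boolean equals(Object other) {
    return other instanceof ClusterIdSketch
        && id.equals(((ClusterIdSketch) other).id);
  }

  @Override
  public int hashCode() {
    return id.hashCode();
  }

  /** Null-safe value comparison; either side may come from a store lookup. */
  static boolean sameCluster(ClusterIdSketch selected, ClusterIdSketch stored) {
    return Objects.equals(selected, stored);
  }
}
```

Two separately constructed `ClusterIdSketch("SC-1")` objects are not `==` to each other, but `sameCluster` treats them as the same cluster.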
[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584623#comment-17584623
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954524494


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584622#comment-17584622
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954524092


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584620#comment-17584620
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954523877


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -888,13 +890,127 @@ public MoveApplicationAcrossQueuesResponse moveApplicationAcrossQueues(
   @Override
   public GetNewReservationResponse getNewReservation(
   GetNewReservationRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null) {
+  routerMetrics.incrGetNewReservationFailedRetrieved();
+  String errMsg = "Missing getNewReservation request.";
+  RouterServerUtil.logAndThrowException(errMsg, null);
+}
+
+long startTime = clock.getTime();
+Map subClustersActive =

Review Comment:
   I will fix it.
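
For context, the getNewReservation change quoted in this thread picks a random active sub-cluster, calls it, and on an exception drops that sub-cluster from the candidate map before trying again, up to `numSubmitRetries` times. A minimal sketch of that retry shape follows. All names are hypothetical, `Function` stands in for the RM client, and `IllegalStateException` stands in for `YarnException`. Copying the map and stopping early when no candidates remain are choices made for this sketch, not something taken from the PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

// Hypothetical sketch: try a random active target, drop it on failure, retry.
final class RetrySketch {

  static <R> R callWithRetries(
      Map<String, Function<String, R>> activeTargets, int maxRetries, Random random) {
    // Work on a copy so the caller's view of active targets is not mutated.
    Map<String, Function<String, R>> candidates = new HashMap<>(activeTargets);
    for (int i = 0; i < maxRetries && !candidates.isEmpty(); i++) {
      List<String> ids = new ArrayList<>(candidates.keySet());
      String id = ids.get(random.nextInt(ids.size()));
      try {
        R response = candidates.get(id).apply(id);
        if (response != null) {
          return response;
        }
      } catch (RuntimeException e) {
        // This target failed; exclude it from later attempts.
        candidates.remove(id);
      }
    }
    throw new IllegalStateException("Failed to get a response from any active target.");
  }
}
```

As in the quoted hunk, a null response does not remove the target, so the same target may be picked again on a later attempt.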












[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584606#comment-17584606
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954499120


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -1624,6 +1788,35 @@ protected SubClusterId getApplicationHomeSubCluster(
 throw new YarnException(errorMsg);
   }
 
+  protected SubClusterId getReservationHomeSubCluster(ReservationId reservationId)
+  throws YarnException {
+
+if (reservationId == null) {
+  LOG.error("ReservationId is Null, Can't find in SubCluster.");
+  return null;
+}
+
+SubClusterId resultSubClusterId = null;
+
+// try looking for applicationId in Home SubCluster
+try {
+  resultSubClusterId = federationFacade.getReservationHomeSubCluster(reservationId);
+} catch (YarnException ex) {
+  if(LOG.isDebugEnabled()){
+LOG.debug("Can't find reservationId = {} in home sub cluster, " +
+" try foreach sub clusters.", reservationId);

Review Comment:
   I will fix it.



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/clientrm/TestFederationClientInterceptor.java:
##
@@ -203,6 +218,12 @@ protected YarnConfiguration createConfiguration() {
 
 // Disable StateStoreFacade cache
 conf.setInt(YarnConfiguration.FEDERATION_CACHE_TIME_TO_LIVE_SECS, 0);
+
+conf.setInt("yarn.scheduler.minimum-allocation-mb", 512);
+conf.setInt("yarn.scheduler.minimum-allocation-vcores", 1);
+conf.setInt("yarn.scheduler.maximum-allocation-mb", 102400);

Review Comment:
   I will fix it.












[jira] [Commented] (YARN-6539) Create SecureLogin inside Router

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584590#comment-17584590
 ] 

ASF GitHub Bot commented on YARN-6539:
--

zhengchenyu closed pull request #4354: YARN-6539. Create SecureLogin inside Router.
URL: https://github.com/apache/hadoop/pull/4354




> Create SecureLogin inside Router
> 
>
> Key: YARN-6539
> URL: https://issues.apache.org/jira/browse/YARN-6539
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: Giovanni Matteo Fumarola
>Assignee: Xie YiFan
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: YARN-6359_1.patch, YARN-6359_2.patch, 
> YARN-6539-branch-3.1.0.004.patch, YARN-6539-branch-3.1.0.005.patch, 
> YARN-6539.006.patch, YARN-6539.007.patch, YARN-6539.008.patch, 
> YARN-6539_3.patch, YARN-6539_4.patch
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>







[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584582#comment-17584582
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954475014


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {
+  routerMetrics.incrUpdateReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing updateReservation request or reservationId or reservation definition.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+SubClusterId subClusterId = getReservationHomeSubCluster(reservationId);
+
+ApplicationClientProtocol client;

Review Comment:
   I will fix it.



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {
+  routerMetrics.incrUpdateReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing updateReservation request or reservationId or reservation definition.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+SubClusterId subClusterId = getReservationHomeSubCluster(reservationId);
+
+ApplicationClientProtocol client;
+ReservationUpdateResponse response = null;
+try {
+  client = getClientRMProxyForSubCluster(subClusterId);
+  response = client.updateReservation(request);
+} catch (Exception ex) {
+  routerMetrics.incrUpdateReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Unable to reservation update due to exception.", ex);
+}
+long stopTime = clock.getTime();
+routerMetrics.succeededUpdateReservationRetrieved(stopTime - startTime);
+return response;
   }
 
   @Override
   public ReservationDeleteResponse deleteReservation(
   ReservationDeleteRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+if (request == null || request.getReservationId() == null) {
+  routerMetrics.incrDeleteReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing deleteReservation request or reservationId.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+SubClusterId subClusterId = getReservationHomeSubCluster(reservationId);
+
+ApplicationClientProtocol client;
+ReservationDeleteResponse response = null;
+try {
+  client = getClientRMProxyForSubCluster(subClusterId);

Review Comment:
   I will fix it.
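
The updateReservation and deleteReservation hunks above share one shape: validate the request, resolve the reservation's home sub-cluster, delegate to that sub-cluster's client, and record a success or failure metric. A minimal sketch of that shape is below. Every name in it is hypothetical: plain maps stand in for the state store and client proxies, and two counters stand in for `routerMetrics`.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; these names are illustrative and are not YARN Router APIs.
final class DelegateSketch {
  int failed;     // stand-in for a "failed" metric
  int succeeded;  // stand-in for a "succeeded" metric

  private final Map<String, String> homeByReservation;
  private final Map<String, Function<String, String>> clientByCluster;

  DelegateSketch(Map<String, String> homeByReservation,
      Map<String, Function<String, String>> clientByCluster) {
    this.homeByReservation = homeByReservation;
    this.clientByCluster = clientByCluster;
  }

  String deleteReservation(String reservationId) {
    // 1. Validate the request before doing any remote work.
    if (reservationId == null) {
      failed++;
      throw new IllegalArgumentException(
          "Missing deleteReservation request or reservationId.");
    }
    // 2. Resolve the home sub-cluster and its client.
    String home = homeByReservation.get(reservationId);
    Function<String, String> client =
        (home == null) ? null : clientByCluster.get(home);
    if (client == null) {
      failed++;
      throw new IllegalStateException("No home sub-cluster found for " + reservationId);
    }
    // 3. Delegate, recording success or failure exactly once.
    try {
      String response = client.apply(reservationId);
      succeeded++;
      return response;
    } catch (RuntimeException e) {
      failed++;
      throw new IllegalStateException("Unable to delete reservation " + reservationId, e);
    }
  }
}
```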












[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584580#comment-17584580
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954473098


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -888,13 +890,127 @@ public MoveApplicationAcrossQueuesResponse moveApplicationAcrossQueues(
   @Override
   public GetNewReservationResponse getNewReservation(
   GetNewReservationRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null) {
+  routerMetrics.incrGetNewReservationFailedRetrieved();
+  String errMsg = "Missing getNewReservation request.";
+  RouterServerUtil.logAndThrowException(errMsg, null);
+}
+
+long startTime = clock.getTime();
+Map subClustersActive =
+federationFacade.getSubClusters(true);
+
+for (int i = 0; i < numSubmitRetries; ++i) {
+  SubClusterId subClusterId = getRandomActiveSubCluster(subClustersActive);
+  LOG.info("getNewReservation try #{} on SubCluster {}.", i, subClusterId);
+  ApplicationClientProtocol clientRMProxy = getClientRMProxyForSubCluster(subClusterId);
+  GetNewReservationResponse response = null;
+  try {
+response = clientRMProxy.getNewReservation(request);
+if (response != null) {
+  long stopTime = clock.getTime();
+  routerMetrics.succeededGetNewReservationRetrieved(stopTime - startTime);
+  return response;
+}
+  } catch (Exception e) {
+LOG.warn("Unable to create a new Reservation in SubCluster {}.", subClusterId.getId(), e);
+subClustersActive.remove(subClusterId);
+  }
+}
+
+routerMetrics.incrGetNewReservationFailedRetrieved();
+String errMsg = "Failed to create a new reservation.";
+throw new YarnException(errMsg);
   }
 
   @Override
   public ReservationSubmissionResponse submitReservation(
   ReservationSubmissionRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null || request.getQueue() == null) {

Review Comment:
   I will fix it.





> Support getNewReservation, submitReservation, updateReservation, 
> deleteReservation API's for Federation
> ---
>
> Key: YARN-11177
> URL: https://issues.apache.org/jira/browse/YARN-11177
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>







[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584549#comment-17584549
 ] 

ASF GitHub Bot commented on YARN-11277:
---

slfan1989 commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1226681435

   @leixm Fix CheckStyle.




> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. For logs 
> that are too large, we can consider deleting them directly instead of waiting 
> for yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500,000+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3days
>  # yarn.log-aggregation-enable=false
>  
>  
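
The idea in the description above (delete a container log directory once it grows past a size limit, rather than waiting for the retention timer) can be sketched roughly as follows. This is only an illustration and not the code in PR #4797: the class name, the method names, and the `maxBytes` threshold are hypothetical, and it uses plain `java.nio.file` calls instead of the NodeManager's own deletion service.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Hadoop.
class LogDirSizeSketch {

  // Sum of the sizes of all regular files under dir, in bytes.
  static long sizeOf(Path dir) throws IOException {
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths.filter(Files::isRegularFile)
          .mapToLong(p -> p.toFile().length())
          .sum();
    }
  }

  // Deletes dir recursively when it is larger than maxBytes (an assumed,
  // caller-supplied threshold). Returns true if a deletion was attempted.
  static boolean deleteIfOversized(Path dir, long maxBytes) throws IOException {
    if (sizeOf(dir) <= maxBytes) {
      return false;
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parent directories.
      paths.sorted(Comparator.reverseOrder()).forEach(p -> p.toFile().delete());
    }
    return true;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("container-logs");
    Files.write(dir.resolve("stdout"), new byte[2048]);
    System.out.println("deleted: " + deleteIfOversized(dir, 1024L));
  }
}
```

In a real NodeManager the check would presumably run when the application finishes and hand the directory to the existing deletion machinery; the sketch only shows the size test itself.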






[jira] [Commented] (YARN-11275) [Federation] Add batchFinishApplicationMaster in UAMPoolManager

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584547#comment-17584547
 ] 

ASF GitHub Bot commented on YARN-11275:
---

slfan1989 commented on code in PR #4792:
URL: https://github.com/apache/hadoop/pull/4792#discussion_r954433018


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedAMPoolManager.java:
##
@@ -450,4 +452,52 @@ public void drainUAMHeartbeats() {
   uam.drainHeartbeatThread();
 }
   }
+
+  /**
+   * Complete FinishApplicationMaster interface calls in batches.
+   *
+   * @param request FinishApplicationMasterRequest
+   * @param appId application Id
+   * @return Returns the Map map,
+   * the key is subClusterId, the value is FinishApplicationMasterResponse
+   */
+  public Map<String, FinishApplicationMasterResponse> batchFinishApplicationMaster(
+  FinishApplicationMasterRequest request, String appId) {
+
+Map<String, FinishApplicationMasterResponse> responseMap = new HashMap<>();
+Set<String> subClusterIds = this.unmanagedAppMasterMap.keySet();
+
+if (subClusterIds != null && !subClusterIds.isEmpty()) {
+  ExecutorCompletionService<Map<String, FinishApplicationMasterResponse>> finishAppService =
+  new ExecutorCompletionService<>(this.threadpool);
+  LOG.info("Sending finish application request to {} sub-cluster RMs", subClusterIds.size());
+
+  for (final String subClusterId : subClusterIds) {
+finishAppService.submit(() -> {
+  LOG.info("Sending finish application request to RM {}", subClusterId);
+  FinishApplicationMasterResponse uamResponse = null;
+  try {
+uamResponse = finishApplicationMaster(subClusterId, request);

Review Comment:
   Thanks for your suggestion. The code looks very good; I will modify it.



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestFederationInterceptor.java:
##
@@ -969,4 +969,58 @@ private PreemptionMessage createDummyPreemptionMessage(
 preemptionMessage.setContract(contract);
 return preemptionMessage;
   }
+
+  @Test
+  public void testBatchFinishApplicationMaster() throws IOException, InterruptedException {
+
+final RegisterApplicationMasterRequest registerReq =
+Records.newRecord(RegisterApplicationMasterRequest.class);
+registerReq.setHost(Integer.toString(testAppId));
+registerReq.setRpcPort(testAppId);
+registerReq.setTrackingUrl("");
+
+UserGroupInformation ugi = interceptor.getUGIWithToken(interceptor.getAttemptId());
+
+ugi.doAs((PrivilegedExceptionAction) () -> {
+
+  // Register the application
+  RegisterApplicationMasterRequest registerReq1 =
+  Records.newRecord(RegisterApplicationMasterRequest.class);
+  registerReq1.setHost(Integer.toString(testAppId));
+  registerReq1.setRpcPort(0);
+  registerReq1.setTrackingUrl("");
+
+  // Register ApplicationMaster
+  RegisterApplicationMasterResponse registerResponse =
+  interceptor.registerApplicationMaster(registerReq1);
+  Assert.assertNotNull(registerResponse);
+  lastResponseId = 0;
+
+  Assert.assertEquals(0, interceptor.getUnmanagedAMPoolSize());
+
+  // Allocate the first batch of containers, with sc1 and sc2 active
+  registerSubCluster(SubClusterId.newInstance("SC-1"));
+  registerSubCluster(SubClusterId.newInstance("SC-2"));
+
+  int numberOfContainers = 3;
+  List containers =
+  getContainersAndAssert(numberOfContainers, numberOfContainers * 2);
+  Assert.assertEquals(2, interceptor.getUnmanagedAMPoolSize());
+  Assert.assertEquals(numberOfContainers * 2, containers.size());
+
+  // Finish the application
+  FinishApplicationMasterRequest finishReq =
+  Records.newRecord(FinishApplicationMasterRequest.class);
+  finishReq.setDiagnostics("");
+  finishReq.setTrackingUrl("");
+  finishReq.setFinalApplicationStatus(FinalApplicationStatus.SUCCEEDED);
+
+  FinishApplicationMasterResponse finshResponse =

Review Comment:
   I will fix it.





> [Federation] Add batchFinishApplicationMaster in UAMPoolManager
> ---
>
> Key: YARN-11275
> URL: https://issues.apache.org/jira/browse/YARN-11275
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, nodemanager
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>





[jira] [Commented] (YARN-11275) [Federation] Add batchFinishApplicationMaster in UAMPoolManager

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584521#comment-17584521
 ] 

ASF GitHub Bot commented on YARN-11275:
---

goiri commented on code in PR #4792:
URL: https://github.com/apache/hadoop/pull/4792#discussion_r954407506


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestFederationInterceptor.java:
##
@@ -969,4 +969,58 @@ private PreemptionMessage createDummyPreemptionMessage(
 preemptionMessage.setContract(contract);
 return preemptionMessage;
   }
+
+  @Test
+  public void testBatchFinishApplicationMaster() throws IOException, InterruptedException {
+
+final RegisterApplicationMasterRequest registerReq =
+Records.newRecord(RegisterApplicationMasterRequest.class);
+registerReq.setHost(Integer.toString(testAppId));
+registerReq.setRpcPort(testAppId);
+registerReq.setTrackingUrl("");
+
+UserGroupInformation ugi = interceptor.getUGIWithToken(interceptor.getAttemptId());
+
+ugi.doAs((PrivilegedExceptionAction) () -> {
+
+  // Register the application
+  RegisterApplicationMasterRequest registerReq1 =
+  Records.newRecord(RegisterApplicationMasterRequest.class);
+  registerReq1.setHost(Integer.toString(testAppId));
+  registerReq1.setRpcPort(0);
+  registerReq1.setTrackingUrl("");
+
+  // Register ApplicationMaster
+  RegisterApplicationMasterResponse registerResponse =
+  interceptor.registerApplicationMaster(registerReq1);
+  Assert.assertNotNull(registerResponse);
+  lastResponseId = 0;
+
+  Assert.assertEquals(0, interceptor.getUnmanagedAMPoolSize());
+
+  // Allocate the first batch of containers, with sc1 and sc2 active
+  registerSubCluster(SubClusterId.newInstance("SC-1"));
+  registerSubCluster(SubClusterId.newInstance("SC-2"));
+
+  int numberOfContainers = 3;
+  List containers =
+  getContainersAndAssert(numberOfContainers, numberOfContainers * 2);
+  Assert.assertEquals(2, interceptor.getUnmanagedAMPoolSize());
+  Assert.assertEquals(numberOfContainers * 2, containers.size());
+
+  // Finish the application
+  FinishApplicationMasterRequest finishReq =
+  Records.newRecord(FinishApplicationMasterRequest.class);
+  finishReq.setDiagnostics("");
+  finishReq.setTrackingUrl("");
+  finishReq.setFinalApplicationStatus(FinalApplicationStatus.SUCCEEDED);
+
+  FinishApplicationMasterResponse finshResponse =

Review Comment:
   Single line?



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedAMPoolManager.java:
##
@@ -450,4 +452,52 @@ public void drainUAMHeartbeats() {
   uam.drainHeartbeatThread();
 }
   }
+
+  /**
+   * Complete FinishApplicationMaster interface calls in batches.
+   *
+   * @param request FinishApplicationMasterRequest
+   * @param appId application Id
+   * @return Returns the Map map,
+   * the key is subClusterId, the value is FinishApplicationMasterResponse
+   */
+  public Map<String, FinishApplicationMasterResponse> batchFinishApplicationMaster(
+  FinishApplicationMasterRequest request, String appId) {
+
+Map<String, FinishApplicationMasterResponse> responseMap = new HashMap<>();
+Set<String> subClusterIds = this.unmanagedAppMasterMap.keySet();
+
+if (subClusterIds != null && !subClusterIds.isEmpty()) {
+  ExecutorCompletionService<Map<String, FinishApplicationMasterResponse>> finishAppService =
+  new ExecutorCompletionService<>(this.threadpool);
+  LOG.info("Sending finish application request to {} sub-cluster RMs", subClusterIds.size());
+
+  for (final String subClusterId : subClusterIds) {
+finishAppService.submit(() -> {
+  LOG.info("Sending finish application request to RM {}", subClusterId);
+  FinishApplicationMasterResponse uamResponse = null;
+  try {
+uamResponse = finishApplicationMaster(subClusterId, request);

Review Comment:
   ```
   try {
     FinishApplicationMasterResponse uamResponse = finishApplicationMaster(subClusterId, request);
     return Collections.singletonMap(subClusterId, uamResponse);
   } catch (Throwable e) {
     LOG.warn("Failed to finish unmanaged application master: RM address: {} ApplicationId: {}",
         subClusterId, appId, e);
     return Collections.singletonMap(subClusterId, null);
   }
   ```
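
For readers following the thread, the pattern in the suggestion above (each task returns a single-entry map, and a failed sub-cluster maps to `null`) can be tried out in isolation. The sketch below is a simplified stand-in and not the Hadoop code: `BatchFinishSketch`, `batchFinish`, and the `Function<String, String>` that plays the role of `finishApplicationMaster` are hypothetical names, and responses are plain strings.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical stand-alone sketch, not part of Hadoop.
class BatchFinishSketch {

  // Calls finishCall once per sub-cluster in parallel and merges the results.
  static Map<String, String> batchFinish(List<String> subClusterIds,
      Function<String, String> finishCall, ExecutorService pool)
      throws InterruptedException {
    ExecutorCompletionService<Map<String, String>> service =
        new ExecutorCompletionService<>(pool);
    for (final String id : subClusterIds) {
      service.submit(() -> {
        try {
          return Collections.singletonMap(id, finishCall.apply(id));
        } catch (Throwable t) {
          // A failed sub-cluster is recorded with a null response.
          return Collections.singletonMap(id, null);
        }
      });
    }
    Map<String, String> responses = new HashMap<>();
    for (int i = 0; i < subClusterIds.size(); i++) {
      try {
        // take() blocks until the next task has completed.
        responses.putAll(service.take().get());
      } catch (ExecutionException e) {
        // Not expected here: every task catches Throwable itself.
      }
    }
    return responses;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    Map<String, String> result = batchFinish(Arrays.asList("SC-1", "SC-2"), id -> {
      if (id.equals("SC-2")) {
        throw new IllegalStateException("RM unreachable");
      }
      return "finished " + id;
    }, pool);
    pool.shutdown();
    System.out.println(result);
  }
}
```

Because every task catches its own failure, the collecting loop can call `take()` exactly once per sub-cluster; the caller then decides how to treat `null` entries.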





> [Federation] Add batchFinishApplicationMaster in UAMPoolManager
> ---
>
> Key: YARN-11275
> URL: https://issues.apache.org/jira/browse/YARN-11275
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, nodemanager
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>  

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584517#comment-17584517
 ] 

ASF GitHub Bot commented on YARN-11177:
---

goiri commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954396015


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {

Review Comment:
   Indentation



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -888,13 +890,127 @@ public MoveApplicationAcrossQueuesResponse moveApplicationAcrossQueues(
   @Override
   public GetNewReservationResponse getNewReservation(
   GetNewReservationRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null) {
+  routerMetrics.incrGetNewReservationFailedRetrieved();
+  String errMsg = "Missing getNewReservation request.";
+  RouterServerUtil.logAndThrowException(errMsg, null);
+}
+
+long startTime = clock.getTime();
+Map subClustersActive =
+federationFacade.getSubClusters(true);
+
+for (int i = 0; i < numSubmitRetries; ++i) {
+  SubClusterId subClusterId = getRandomActiveSubCluster(subClustersActive);
+  LOG.info("getNewReservation try #{} on SubCluster {}.", i, subClusterId);
+  ApplicationClientProtocol clientRMProxy = getClientRMProxyForSubCluster(subClusterId);
+  GetNewReservationResponse response = null;
+  try {
+response = clientRMProxy.getNewReservation(request);
+if (response != null) {
+  long stopTime = clock.getTime();
+  routerMetrics.succeededGetNewReservationRetrieved(stopTime - startTime);
+  return response;
+}
+  } catch (Exception e) {
+LOG.warn("Unable to create a new Reservation in SubCluster {}.", subClusterId.getId(), e);
+subClustersActive.remove(subClusterId);
+  }
+}
+
+routerMetrics.incrGetNewReservationFailedRetrieved();
+String errMsg = "Failed to create a new reservation.";
+throw new YarnException(errMsg);
   }
 
   @Override
   public ReservationSubmissionResponse submitReservation(
   ReservationSubmissionRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null || request.getQueue() == null) {
+  routerMetrics.incrSubmitReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing submitReservation request or reservationId " +
+   "or reservation definition or queue.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+
+long retryCount = 0;
+boolean firstRetry = true;
+
+while (retryCount < numSubmitRetries) {
+
+  SubClusterId subClusterId = policyFacade.getReservationHomeSubCluster(request);
+  LOG.info("submitReservation reservationId {} try #{} on SubCluster {}.",
+  reservationId, retryCount, subClusterId);
+
+  ReservationHomeSubCluster reservationHomeSubCluster =
+  ReservationHomeSubCluster.newInstance(reservationId, subClusterId);
+
+  // If it is the first attempt,use StateStore to add the
+  // mapping of reservationId and subClusterId.
+  // if the number of attempts is greater than 1, use StateStore to update the mapping.
+  if (firstRetry) {
+try {
+  // persist the mapping of reservationId and the subClusterId which has
+  // been selected as its home
+  subClusterId = federationFacade.addReservationHomeSubCluster(reservationHomeSubCluster);
+  firstRetry = false;
+} catch (YarnException e) {
+  routerMetrics.incrSubmitReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(e,
+  "Unable to insert the ReservationId %s into the FederationStateStore.",
+   reservationId);
+}
+  } else {
+try {
+  // update the mapping of reservationId and the home subClusterId to
+  // the new subClusterId we have selected
+  

[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1758#comment-1758
 ] 

ASF GitHub Bot commented on YARN-11277:
---

hadoop-yetus commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1226185659

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 38s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 45s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  26m 59s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  10m 34s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   9m 13s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  3s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m 53s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 37s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 29s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 43s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |   3m 55s |  |  branch has errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 11s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m  6s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m  5s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 30s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 57s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/2/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 187 unchanged - 34 fixed = 189 total (was 221)  |
   | +1 :green_heart: |  mvnsite  |   3m 16s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 12s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 15s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 31s |  |  the patch passed  |
   | -1 :x: |  shadedclient  |   3m 35s |  |  patch has errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 20s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   4m 54s |  |  hadoop-yarn-common in the patch passed.  |
   | +1 :green_heart: |  unit  |  25m  2s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  4s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 163m 22s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4797 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 6252cc763226 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9d16fc2482b0d057d5ba635c67c9c1c081864535 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/ja

[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584421#comment-17584421
 ] 

ASF GitHub Bot commented on YARN-9708:
--

hadoop-yetus commented on PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#issuecomment-1226102617

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  1s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +0 :ok: |  buf  |   0m  1s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 5 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m  5s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 42s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  10m 42s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   9m 10s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  6s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 30s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 10s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 53s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 23s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |   2m 50s |  |  branch has errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |   3m 15s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 56s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  cc  |   9m 56s |  |  the patch passed  |
   | -1 :x: |  javac  |   9m 56s | [/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/9/artifact/out/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 1 new + 740 unchanged - 0 fixed = 741 total (was 740)  |
   | +1 :green_heart: |  compile  |   9m  1s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   9m  1s |  |  the patch passed  |
   | -1 :x: |  javac  |   9m  1s | [/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/9/artifact/out/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 3 new + 649 unchanged - 2 fixed = 652 total (was 651)  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 50s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/9/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 3 new + 26 unchanged - 2 fixed = 29 total (was 28)  |
   | +1 :green_heart: |  mvnsite  |   4m  3s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 42s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 36s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs

[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584391#comment-17584391
 ] 

ASF GitHub Bot commented on YARN-11276:
---

hadoop-yetus commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1226033589

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for branch  |
   | -1 :x: |  mvninstall  |   0m 30s | [/branch-mvninstall-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-mvninstall-root.txt) |  root in trunk failed.  |
   | -1 :x: |  compile  |   0m 29s | [/branch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-yarn in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  compile  |   0m 29s | [/branch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-compile-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-yarn in trunk failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.  |
   | -0 :warning: |  checkstyle  |   0m 30s | [/buildtool-branch-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/buildtool-branch-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  The patch fails to run checkstyle in hadoop-yarn  |
   | -1 :x: |  mvnsite  |   0m 29s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt) |  hadoop-yarn-api in trunk failed.  |
   | -1 :x: |  mvnsite  |   0m 29s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common.txt) |  hadoop-yarn-common in trunk failed.  |
   | -1 :x: |  mvnsite  |   0m 29s | [/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-mvnsite-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt) |  hadoop-yarn-server-resourcemanager in trunk failed.  |
   | -1 :x: |  javadoc  |   0m 29s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-yarn-api in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javadoc  |   0m 28s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-common-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-yarn-common in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.  |
   | -1 :x: |  javadoc  |   0m 29s | [/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/9/artifact/out/branch-javadoc-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hado

[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584379#comment-17584379
 ] 

ASF GitHub Bot commented on YARN-11177:
---

hadoop-yetus commented on PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#issuecomment-1226005145

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m 13s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  2s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  2s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  2s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 6 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 18s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 34s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 15s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   4m 14s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 31s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   3m  2s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 53s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 25s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   5m  3s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |   2m  7s |  |  branch has errors when building 
and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m  3s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   4m 26s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   4m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   3m 19s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   3m 19s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 14s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   2m 14s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 49s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 38s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 37s |  |  the patch passed  |
   | -1 :x: |  shadedclient  |   1m 46s |  |  patch has errors when building 
and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 59s |  |  hadoop-yarn-server-common in 
the patch passed.  |
   | +1 :green_heart: |  unit  | 103m 21s |  |  
hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  unit  |   3m 32s |  |  hadoop-yarn-server-router in 
the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 38s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 208m 22s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4764/9/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4764 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 5b5897d42341 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 2c702f84ff006dfa5ce4765d561e02ca4bd488ad |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4764/9/testReport/ 

[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584355#comment-17584355
 ] 

ASF GitHub Bot commented on YARN-9708:
--

goiri commented on code in PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#discussion_r954007246


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/security/client/impl/pb/package-info.java:
##
@@ -0,0 +1,20 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+@InterfaceAudience.Public
+package org.apache.hadoop.yarn.security.client.impl.pb;
+import org.apache.hadoop.classification.InterfaceAudience;

Review Comment:
   Import Public



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/store/impl/MemoryFederationStateStore.java:
##
@@ -395,4 +535,17 @@ public DeleteReservationHomeSubClusterResponse 
deleteReservationHomeSubCluster(
 reservations.remove(reservationId);
 return DeleteReservationHomeSubClusterResponse.newInstance();
   }
-}
+
+  /**
+   * Get DelegationKey By based on MasterKey.
+   *
+   * @param masterKey masterKey
+   * @return DelegationKey
+   */
+  private DelegationKey getDelegationKeyByMasterKey(RouterMasterKey masterKey) 
{

Review Comment:
   static?





> Yarn Router Support DelegationToken
> ---
>
> Key: YARN-9708
> URL: https://issues.apache.org/jira/browse/YARN-9708
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: router
>Affects Versions: 3.1.1
>Reporter: Xie YiFan
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Attachments: Add_getDelegationToken_and_SecureLogin_in_router.patch, 
> RMDelegationTokenSecretManager_storeNewMasterKey.svg, 
> RouterDelegationTokenSecretManager_storeNewMasterKey.svg
>
>
> 1. We use the router as a proxy to manage multiple clusters that are 
> independent of each other, in order to offer a unified client. Thus, we 
> implemented our customized AMRMProxyPolicy, which doesn't broadcast 
> ResourceRequest to other clusters.
> 2. Our production environment needs Kerberos, but the router doesn't support 
> SecureLogin for now.
> https://issues.apache.org/jira/browse/YARN-6539 doesn't work, so we 
> improved it.
> 3. Some frameworks, like Oozie, get a token via yarnclient#getDelegationToken, 
> which the router doesn't support. Our solution is to add homeCluster to 
> ApplicationSubmissionContextProto & GetDelegationTokenRequestProto. A job is 
> submitted with a specified clusterid so that the router knows which cluster to 
> submit the job to. The router gets a token from one RM according to the 
> specified clusterid when the client calls getDelegation, and applies some 
> mechanism to save this token in memory.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584345#comment-17584345
 ] 

ASF GitHub Bot commented on YARN-11177:
---

goiri commented on code in PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#discussion_r954003678


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {
+  routerMetrics.incrUpdateReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing updateReservation request or reservationId or reservation 
definition.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+SubClusterId subClusterId = getReservationHomeSubCluster(reservationId);
+
+ApplicationClientProtocol client;

Review Comment:
   Declare in the try
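
To illustrate the suggestion above, here is a minimal sketch of declaring the client inside the `try` block instead of before it. `ScopeSketch`, `Client`, and `clientFor` are hypothetical stand-ins invented for this example; they are not names from the PR, and the error handling is simplified.

```java
final class ScopeSketch {

  /** Hypothetical stand-in for a per-sub-cluster client. */
  interface Client {
    String update(String request) throws Exception;
  }

  /** Hypothetical lookup; the returned client just echoes its input. */
  static Client clientFor(String subClusterId) {
    return request -> subClusterId + ":" + request;
  }

  static String updateReservation(String subClusterId, String request) {
    try {
      // Declared at first assignment, so the variable is not visible
      // (and cannot be used unassigned) outside the try block.
      Client client = clientFor(subClusterId);
      return client.update(request);
    } catch (Exception ex) {
      throw new IllegalStateException("Unable to update reservation", ex);
    }
  }
}
```

Narrowing the scope this way also removes the need for a separate uninitialized declaration before the `try`.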



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -888,13 +890,127 @@ public MoveApplicationAcrossQueuesResponse 
moveApplicationAcrossQueues(
   @Override
   public GetNewReservationResponse getNewReservation(
   GetNewReservationRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null) {
+  routerMetrics.incrGetNewReservationFailedRetrieved();
+  String errMsg = "Missing getNewReservation request.";
+  RouterServerUtil.logAndThrowException(errMsg, null);
+}
+
+long startTime = clock.getTime();
+Map subClustersActive =
+federationFacade.getSubClusters(true);
+
+for (int i = 0; i < numSubmitRetries; ++i) {
+  SubClusterId subClusterId = getRandomActiveSubCluster(subClustersActive);
+  LOG.info("getNewReservation try #{} on SubCluster {}.", i, subClusterId);
+  ApplicationClientProtocol clientRMProxy = 
getClientRMProxyForSubCluster(subClusterId);
+  GetNewReservationResponse response = null;
+  try {
+response = clientRMProxy.getNewReservation(request);
+if (response != null) {
+  long stopTime = clock.getTime();
+  routerMetrics.succeededGetNewReservationRetrieved(stopTime - 
startTime);
+  return response;
+}
+  } catch (Exception e) {
+LOG.warn("Unable to create a new Reservation in SubCluster {}.", 
subClusterId.getId(), e);
+subClustersActive.remove(subClusterId);
+  }
+}
+
+routerMetrics.incrGetNewReservationFailedRetrieved();
+String errMsg = "Failed to create a new reservation.";
+throw new YarnException(errMsg);
   }
 
   @Override
   public ReservationSubmissionResponse submitReservation(
   ReservationSubmissionRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null || 
request.getQueue() == null) {

Review Comment:
   indentation
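
The retry loop in the `getNewReservation` hunk above follows a common pattern: pick a random active sub-cluster, try the call, and drop the sub-cluster from the candidate set if it throws. A self-contained sketch of that pattern follows, under stated assumptions: `RetrySketch` and `callWithRetries` are hypothetical names, sub-clusters are modeled as plain functions keyed by a string id, and the extra `isEmpty()` guard is an addition of this sketch rather than something taken from the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.function.Function;

final class RetrySketch {

  /**
   * Tries up to {@code retries} randomly chosen candidates and returns the
   * first non-null response. A candidate that throws is removed so that it
   * is not picked again.
   */
  static String callWithRetries(Map<String, Function<String, String>> active,
      String request, int retries, Random random) {
    // Work on a copy so the caller's map is left untouched.
    Map<String, Function<String, String>> candidates = new HashMap<>(active);
    for (int i = 0; i < retries && !candidates.isEmpty(); i++) {
      List<String> ids = new ArrayList<>(candidates.keySet());
      String id = ids.get(random.nextInt(ids.size()));
      try {
        String response = candidates.get(id).apply(request);
        if (response != null) {
          return response;
        }
      } catch (RuntimeException e) {
        // This candidate failed; exclude it from later attempts.
        candidates.remove(id);
      }
    }
    throw new IllegalStateException("Failed to get a response from any candidate.");
  }
}
```

With one failing and one healthy candidate and two attempts, the call succeeds whichever candidate is picked first, because the failing one is removed before the second attempt.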



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/main/java/org/apache/hadoop/yarn/server/router/clientrm/FederationClientInterceptor.java:
##
@@ -925,13 +1041,61 @@ public ReservationListResponse listReservations(
   @Override
   public ReservationUpdateResponse updateReservation(
   ReservationUpdateRequest request) throws YarnException, IOException {
-throw new NotImplementedException("Code is not implemented");
+
+if (request == null || request.getReservationId() == null
+|| request.getReservationDefinition() == null) {
+  routerMetrics.incrUpdateReservationFailedRetrieved();
+  RouterServerUtil.logAndThrowException(
+  "Missing updateReservation request or reservationId or reservation 
definition.", null);
+}
+
+long startTime = clock.getTime();
+ReservationId reservationId = request.getReservationId();
+SubClusterId subClusterId = getReservationHomeSubCluster(reservationId);
+
+ApplicationClientProtocol client;
+ReservationUpdateResponse response = null;
+try {
+  client = getClientRMProxyForSubCluster(subClusterId);
+  response = client.updateReservation(request);
+} catch (Exception ex) {
+  routerMetrics.incrUpdateReservationFailedRetrieved

[jira] [Commented] (YARN-11253) Add Configuration to delegationToken RemoverScanInterval

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584341#comment-17584341
 ] 

ASF GitHub Bot commented on YARN-11253:
---

goiri commented on code in PR #4751:
URL: https://github.com/apache/hadoop/pull/4751#discussion_r953999366


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/RMSecretManagerService.java:
##
@@ -135,9 +135,11 @@ protected RMDelegationTokenSecretManager 
createRMDelegationTokenSecretManager(
 long tokenRenewInterval =
 conf.getLong(YarnConfiguration.RM_DELEGATION_TOKEN_RENEW_INTERVAL_KEY,
 YarnConfiguration.RM_DELEGATION_TOKEN_RENEW_INTERVAL_DEFAULT);
-
+long removeScanInterval =
+
conf.getLong(YarnConfiguration.RM_DELEGATION_TOKEN_REMOVE_SCAN_INTERVAL_KEY,

Review Comment:
   We should do getTimeDuration



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/resources/yarn-default.xml:
##
@@ -1077,6 +1077,14 @@
 8640
   
 
+  
+
+  RM delegation token remove-scan interval in ms
+
+yarn.resourcemanager.delegation.token.remove-scan-interval
+360

Review Comment:
   10h and use time duration
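
Both comments above point at Hadoop's `Configuration.getTimeDuration`, which lets a property carry a unit suffix (for example `10h`) instead of a raw millisecond count. The sketch below only illustrates the idea of suffix-based parsing using the JDK alone; `DurationSketch` and `parseToMillis` are hypothetical names, and this is not Hadoop's implementation.

```java
import java.util.concurrent.TimeUnit;

final class DurationSketch {

  /**
   * Hypothetical helper: converts values such as "250ms", "30s", "5m",
   * "10h" or "1d" to milliseconds. A bare number is read as milliseconds.
   */
  static long parseToMillis(String raw) {
    String value = raw.trim().toLowerCase();
    // "ms" is tested first; otherwise "250ms" would match the "s" branch
    // and fail to parse.
    if (value.endsWith("ms")) {
      return number(value, 2);
    }
    if (value.endsWith("s")) {
      return TimeUnit.SECONDS.toMillis(number(value, 1));
    }
    if (value.endsWith("m")) {
      return TimeUnit.MINUTES.toMillis(number(value, 1));
    }
    if (value.endsWith("h")) {
      return TimeUnit.HOURS.toMillis(number(value, 1));
    }
    if (value.endsWith("d")) {
      return TimeUnit.DAYS.toMillis(number(value, 1));
    }
    return Long.parseLong(value);
  }

  /** Parses the numeric part that precedes a suffix of the given length. */
  private static long number(String value, int suffixLength) {
    return Long.parseLong(value.substring(0, value.length() - suffixLength).trim());
  }
}
```

A property written as `10h` is easier to audit than the equivalent count of milliseconds, which is presumably the motivation for the review comments.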





> Add Configuration to delegationToken RemoverScanInterval
> 
>
> Key: YARN-11253
> URL: https://issues.apache.org/jira/browse/YARN-11253
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0, 3.3.4
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>
> When reading the code, I found a hard-coded value; I think the parameter 
> should be moved into the configuration.
> org.apache.hadoop.yarn.server.resourcemanager.RMSecretManagerService#
> createRMDelegationTokenSecretManager
> {code:java}
> protected RMDelegationTokenSecretManager 
> createRMDelegationTokenSecretManager(Configuration conf, RMContext rmContext) 
> {  
>// . 360 This hard code should be extracted    
>return new RMDelegationTokenSecretManager(secretKeyInterval, 
> tokenMaxLifetime, tokenRenewInterval, 360, rmContext); 
> } 
> {code}






[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584324#comment-17584324
 ] 

ASF GitHub Bot commented on YARN-11276:
---

hadoop-yetus commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1225904734

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 47s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 2 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m  2s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  27m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  10m 12s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   9m  0s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 59s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 17s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m  1s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 15s |  |  trunk passed  |
   | -1 :x: |  shadedclient  |   4m 18s |  |  branch has errors when building 
and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 29s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 27s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 49s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m 49s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 51s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 51s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 58s | 
[/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/8/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt)
 |  hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 174 unchanged 
- 0 fixed = 179 total (was 174)  |
   | +1 :green_heart: |  mvnsite  |   3m 45s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 24s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 16s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 44s |  |  the patch passed  |
   | -1 :x: |  shadedclient  |   4m  0s |  |  patch has errors when building 
and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 24s |  |  hadoop-yarn-api in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   5m  5s |  |  hadoop-yarn-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  | 100m 56s |  |  
hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 23s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 245m 25s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/8/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4793 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 8f1f431fd5a6 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / fc010ab080bde9b23887a0a3a757eda9afd657ff |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm

[jira] [Commented] (YARN-11247) Remove unused classes introduced by YARN-9615

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584316#comment-17584316
 ] 

ASF GitHub Bot commented on YARN-11247:
---

slfan1989 commented on PR #4720:
URL: https://github.com/apache/hadoop/pull/4720#issuecomment-1225870451

   @ayushtkn Can you help review this pr? I want to delete 
DisableEventTypeMetrics.java because this class is not used.
   
   Thank you very much!




> Remove unused classes introduced by YARN-9615
> -
>
> Key: YARN-11247
> URL: https://issues.apache.org/jira/browse/YARN-11247
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: resourcemanager
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Attachments: DisableEventTypeMetrics-Not used.png
>
>
> YARN-9615 adds metrics to the RM's dispatcher, but the patch introduces a 
> class that is never used:
> org.apache.hadoop.yarn.metrics#DisableEventTypeMetrics
> 1. It has no code references.
> 2. It has no test code references.
> 3. After deleting this class, the project still compiles locally.
> I think this class can be removed.






[jira] [Commented] (YARN-11240) Fix incorrect placeholder in yarn-module

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584287#comment-17584287
 ] 

ASF GitHub Bot commented on YARN-11240:
---

ayushtkn merged PR #4678:
URL: https://github.com/apache/hadoop/pull/4678




> Fix incorrect placeholder in yarn-module
> 
>
> Key: YARN-11240
> URL: https://issues.apache.org/jira/browse/YARN-11240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Try to deal with the module problem at a time.






[jira] [Commented] (YARN-11240) Fix incorrect placeholder in yarn-module

2022-08-24 Thread Ayush Saxena (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584288#comment-17584288
 ] 

Ayush Saxena commented on YARN-11240:
-

Committed to trunk.

Thanx [~slfan1989] for the contribution!!!

> Fix incorrect placeholder in yarn-module
> 
>
> Key: YARN-11240
> URL: https://issues.apache.org/jira/browse/YARN-11240
> Project: Hadoop YARN
>  Issue Type: Improvement
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Try to deal with the module problem at a time.






[jira] [Commented] (YARN-11219) [Federation] Add getAppActivities, getAppStatistics REST APIs for Router

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584279#comment-17584279
 ] 

ASF GitHub Bot commented on YARN-11219:
---

slfan1989 commented on PR #4757:
URL: https://github.com/apache/hadoop/pull/4757#issuecomment-1225806212

   @goiri Please help to review the code again, Thank you very much!




> [Federation] Add getAppActivities, getAppStatistics REST APIs for Router
> 
>
> Key: YARN-11219
> URL: https://issues.apache.org/jira/browse/YARN-11219
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (YARN-11177) Support getNewReservation, submitReservation, updateReservation, deleteReservation API's for Federation

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584278#comment-17584278
 ] 

ASF GitHub Bot commented on YARN-11177:
---

slfan1989 commented on PR #4764:
URL: https://github.com/apache/hadoop/pull/4764#issuecomment-122580

   @goiri Please help to review the code again, Thank you very much!




> Support getNewReservation, submitReservation, updateReservation, 
> deleteReservation API's for Federation
> ---
>
> Key: YARN-11177
> URL: https://issues.apache.org/jira/browse/YARN-11177
> Project: Hadoop YARN
>  Issue Type: Sub-task
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
>







[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584277#comment-17584277
 ] 

ASF GitHub Bot commented on YARN-9708:
--

slfan1989 commented on PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#issuecomment-1225805056

   @goiri Please help to review the code again, Thank you very much!




> Yarn Router Support DelegationToken
> ---
>
> Key: YARN-9708
> URL: https://issues.apache.org/jira/browse/YARN-9708
> Project: Hadoop YARN
>  Issue Type: New Feature
>  Components: router
>Affects Versions: 3.1.1
>Reporter: Xie YiFan
>Assignee: fanshilun
>Priority: Minor
>  Labels: pull-request-available
> Attachments: Add_getDelegationToken_and_SecureLogin_in_router.patch, 
> RMDelegationTokenSecretManager_storeNewMasterKey.svg, 
> RouterDelegationTokenSecretManager_storeNewMasterKey.svg
>
>
> 1. We use the router as a proxy to manage multiple clusters that are 
> independent of each other, in order to offer a unified client. Thus, we 
> implemented our customized AMRMProxyPolicy, which doesn't broadcast 
> ResourceRequest to other clusters.
> 2. Our production environment needs Kerberos, but the router doesn't support 
> SecureLogin for now.
> https://issues.apache.org/jira/browse/YARN-6539 doesn't work, so we 
> improved it.
> 3. Some frameworks, like Oozie, get a token via yarnclient#getDelegationToken, 
> which the router doesn't support. Our solution is to add homeCluster to 
> ApplicationSubmissionContextProto & GetDelegationTokenRequestProto. A job is 
> submitted with a specified clusterid so that the router knows which cluster to 
> submit the job to. The router gets a token from one RM according to the 
> specified clusterid when the client calls getDelegation, and applies some 
> mechanism to save this token in memory.
>  






[jira] [Commented] (YARN-11253) Add Configuration to delegationToken RemoverScanInterval

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584265#comment-17584265
 ] 

ASF GitHub Bot commented on YARN-11253:
---

hadoop-yetus commented on PR #4751:
URL: https://github.com/apache/hadoop/pull/4751#issuecomment-1225776565

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 39s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +0 :ok: |  xmllint  |   0m  1s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to 
include 1 new or modified test files.  |
    _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 22s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 33s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 46s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 36s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 58s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m  9s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 54s |  |  trunk passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 37s |  |  trunk passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m  7s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 42s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  10m 29s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |  10m 29s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  11m 43s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |  11m 43s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   2m 32s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   4m  7s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 38s |  |  the patch passed with JDK 
Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 47s |  |  the patch passed with JDK 
Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   9m 13s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  23m  5s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 26s |  |  hadoop-yarn-api in the patch 
passed.  |
   | +1 :green_heart: |  unit  |   5m 14s |  |  hadoop-yarn-common in the patch 
passed.  |
   | +1 :green_heart: |  unit  |  98m 49s |  |  
hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  8s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 282m 58s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4751/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4751 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux 276b62cfb077 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 
01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / d1a76acfd00f335894b079f277b46d692b19fea3 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private 
Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4751/3/te

[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584256#comment-17584256
 ] 

ASF GitHub Bot commented on YARN-11276:
---

hadoop-yetus commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1225758308

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 13s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 30s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 49s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 46s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m 14s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m  5s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   3m 54s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 24s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   6m 52s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  0s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 20s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 18s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 25s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   9m 25s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   2m 11s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/7/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 5 new + 174 unchanged - 0 fixed = 179 total (was 174)  |
   | +1 :green_heart: |  mvnsite  |   4m 31s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m  9s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   4m  2s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 18s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 49s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 26s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 21s |  |  hadoop-yarn-common in the patch passed.  |
   | +1 :green_heart: |  unit  |  98m 37s |  |  hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 13s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 277m 58s |  |  |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/7/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4793 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux cf5dc21d78c6 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / e91bc33f4aef663c62778b5ba4c99972b96cb0a7 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-

[jira] [Commented] (YARN-11275) [Federation] Add batchFinishApplicationMaster in UAMPoolManager

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584249#comment-17584249
 ] 

ASF GitHub Bot commented on YARN-11275:
---

slfan1989 commented on PR #4792:
URL: https://github.com/apache/hadoop/pull/4792#issuecomment-1225727281

   @goiri Please help review the code again. Thank you very much!




> [Federation] Add batchFinishApplicationMaster in UAMPoolManager
> ---
>
> Key: YARN-11275
> URL: https://issues.apache.org/jira/browse/YARN-11275
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation, nodemanager
>Affects Versions: 3.4.0
>Reporter: fanshilun
>Assignee: fanshilun
>Priority: Major
>  Labels: pull-request-available
>







[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584225#comment-17584225
 ] 

ASF GitHub Bot commented on YARN-11277:
---

slfan1989 commented on code in PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#discussion_r953721092


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/loghandler/NonAggregatingLogHandler.java:
##
@@ -190,13 +196,24 @@ public void handle(LogHandlerEvent event) {
 } catch (IOException e) {
   LOG.error("Unable to record log deleter state", e);
 }
-try {
-  sched.schedule(logDeleter, this.deleteDelaySeconds,
-  TimeUnit.SECONDS);
-} catch (RejectedExecutionException e) {
-  // Handling this event in local thread before starting threads
-  // or after calling sched.shutdownNow().
+//delete no delay if log size exceed deleteThresholdMb
+if (enableTriggerDeleteBySize && appLogSize >= deleteThresholdMb * BYTES_PER_MB) {
+  LOG.info("Log Deletion for application: "
+   + appId + ", with no delay, size=" + appLogSize);

Review Comment:
   indentation?



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/loghandler/NonAggregatingLogHandler.java:
##
@@ -190,13 +196,24 @@ public void handle(LogHandlerEvent event) {
 } catch (IOException e) {
   LOG.error("Unable to record log deleter state", e);
 }
-try {
-  sched.schedule(logDeleter, this.deleteDelaySeconds,
-  TimeUnit.SECONDS);
-} catch (RejectedExecutionException e) {
-  // Handling this event in local thread before starting threads
-  // or after calling sched.shutdownNow().
+//delete no delay if log size exceed deleteThresholdMb
+if (enableTriggerDeleteBySize && appLogSize >= deleteThresholdMb * BYTES_PER_MB) {
+  LOG.info("Log Deletion for application: "
+   + appId + ", with no delay, size=" + appLogSize);
   logDeleter.run();
+} else {
+  // Schedule - so that logs are available on the UI till they're deleted.
+  LOG.info("Scheduling Log Deletion for application: "
+   + appId + ", with delay of "

Review Comment:
   indentation?





> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. For logs 
> that are too large, we can consider deleting them directly without waiting 
> for yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500k+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3days
>  # yarn.log-aggregation-enable=false
>  
>  






[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584224#comment-17584224
 ] 

ASF GitHub Bot commented on YARN-11277:
---

slfan1989 commented on code in PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#discussion_r953720422


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/loghandler/NonAggregatingLogHandler.java:
##
@@ -71,7 +71,10 @@ public class NonAggregatingLogHandler extends AbstractService implements
   private final LocalDirsHandlerService dirsHandler;
   private final NMStateStoreService stateStore;
   private long deleteDelaySeconds;
+  private boolean enableTriggerDeleteBySize;
+  private long deleteThresholdMb;
   private ScheduledThreadPoolExecutor sched;
+  public static final int BYTES_PER_MB = 1024 * 1024;

Review Comment:
   why static ?











[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584205#comment-17584205
 ] 

ASF GitHub Bot commented on YARN-11277:
---

hadoop-yetus commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1225605681

   :broken_heart: **-1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 36s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 16s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 15s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 45s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 48s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  2s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 31s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   2m 39s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   2m 13s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 29s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 47s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m  7s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 27s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 27s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m 58s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/1/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 15 new + 215 unchanged - 6 fixed = 230 total (was 221)  |
   | +1 :green_heart: |  mvnsite  |   2m 24s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 55s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 53s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   4m 24s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 56s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |   1m 18s | [/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/1/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-api.txt) |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |  25m 44s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 22s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 181m 24s |  |  |


   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.yarn.conf.TestYarnConfigurationFields |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4797/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4797 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux c90681ec29d2 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git 

[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584201#comment-17584201
 ] 

ASF GitHub Bot commented on YARN-11277:
---

leixm commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1225598272

   If we set yarn.nodemanager.log.retain-seconds to a small value, normal logs 
will be deleted too quickly, and the problem will still not be solved, because 
it is caused by a few huge log files. After this PR is merged, we can keep 
normal logs for a long time and also remove the impact of these huge logs on 
the NodeManager. @ashutoshcipher 
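
The trade-off described above can be sketched in a few lines of Java. This is only an illustrative sketch, not the code from PR #4797: the class `SizeTriggeredLogDeletion` and the methods `shouldDeleteNow` and `deleteOrSchedule` are hypothetical names, and the feature flag, threshold, and delay are passed as plain parameters rather than read from any real configuration key.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not the NonAggregatingLogHandler code.
final class SizeTriggeredLogDeletion {
  // long arithmetic, so a large threshold in MB cannot overflow an int.
  static final long BYTES_PER_MB = 1024L * 1024L;

  private SizeTriggeredLogDeletion() {
  }

  /** True when the feature is enabled and the logs reach the size threshold. */
  static boolean shouldDeleteNow(boolean enabled, long appLogSizeBytes,
      long thresholdMb) {
    return enabled && appLogSizeBytes >= thresholdMb * BYTES_PER_MB;
  }

  /**
   * Runs the deleter inline for oversized logs; otherwise schedules it after
   * the normal retention delay. Returns true if deletion ran immediately.
   */
  static boolean deleteOrSchedule(Runnable logDeleter,
      ScheduledExecutorService sched, boolean enabled, long appLogSizeBytes,
      long thresholdMb, long delaySeconds) {
    if (shouldDeleteNow(enabled, appLogSizeBytes, thresholdMb)) {
      logDeleter.run();
      return true;
    }
    sched.schedule(logDeleter, delaySeconds, TimeUnit.SECONDS);
    return false;
  }

  public static void main(String[] args) {
    ScheduledExecutorService sched = new ScheduledThreadPoolExecutor(1);
    Runnable deleter = () -> System.out.println("deleting application log dir");
    // 5 GB of logs against a 1024 MB threshold: deleted without delay.
    deleteOrSchedule(deleter, sched, true, 5L * 1024 * BYTES_PER_MB, 1024, 1);
    // 10 MB of logs: kept for the (here, 1 second) retention delay.
    deleteOrSchedule(deleter, sched, true, 10 * BYTES_PER_MB, 1024, 1);
    sched.shutdown();
  }
}
```

With this shape, the retention period can stay long for ordinary applications while only the oversized log directories skip the delay.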










[jira] [Commented] (YARN-11196) NUMA Awareness support in DefaultContainerExecutor

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584197#comment-17584197
 ] 

ASF GitHub Bot commented on YARN-11196:
---

hadoop-yetus commented on PR #4742:
URL: https://github.com/apache/hadoop/pull/4742#issuecomment-1225589943

   :confetti_ball: **+1 overall**

   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 50s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  38m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 33s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   0m 47s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 55s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  5s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 52s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 14s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 47s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   0m 34s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4742/18/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt) |  hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager: The patch generated 2 new + 42 unchanged - 0 fixed = 44 total (was 42)  |
   | +1 :green_heart: |  mvnsite  |   0m 43s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  20m 40s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  24m 11s |  |  hadoop-yarn-server-nodemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 122m 47s |  |  |


   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4742/18/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4742 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 6e0912771f5e 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 6172d5fedad395f2c2465e9c073d7082c7706720 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4742

[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584194#comment-17584194
 ] 

ASF GitHub Bot commented on YARN-11276:
---

leixm commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1225587335

   All fixed, thanks for your review, @slfan1989.




> Add lru cache for RMWebServices.getApps
> ---
>
> Key: YARN-11276
> URL: https://issues.apache.org/jira/browse/YARN-11276
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, thousands of apps run at the same time and the result 
> returned by getApps reaches about 10M. Many requests use the same input 
> parameters, so we can add a cache for RMWebServices.getApps to reduce 
> processing delay.
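
The caching idea in the description can be illustrated with a small, self-contained Java sketch. It is an assumption-laden illustration rather than the `LRUCache` from PR #4793: the class `ExpiringLruCache` and its inner `Node` are hypothetical names, and it relies on the standard `java.util.LinkedHashMap` access-order mode with `removeEldestEntry` for eviction, plus an optional time-based expiry check on read.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of an LRU cache with optional expiry; not the PR's class.
final class ExpiringLruCache<K, V> {

  /** A cached value together with the time it was stored. */
  private static final class Node<T> {
    private final T value;
    private final long cachedAtMs;

    Node(T value, long cachedAtMs) {
      this.value = value;
      this.cachedAtMs = cachedAtMs;
    }
  }

  private final long expireTimeMs;
  private final Map<K, Node<V>> cache;

  /** expireTimeMs <= 0 disables time-based expiry. */
  ExpiringLruCache(final int capacity, long expireTimeMs) {
    this.expireTimeMs = expireTimeMs;
    // accessOrder=true moves an entry to the end of the map on every get/put,
    // so the eldest entry is always the least recently used one.
    this.cache = new LinkedHashMap<K, Node<V>>(capacity, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, Node<V>> eldest) {
        return size() > capacity;
      }
    };
  }

  synchronized V get(K key) {
    Node<V> node = cache.get(key);
    if (node == null) {
      return null;
    }
    if (expireTimeMs > 0
        && System.currentTimeMillis() > node.cachedAtMs + expireTimeMs) {
      cache.remove(key); // stale entry: drop it and report a miss
      return null;
    }
    return node.value;
  }

  synchronized void put(K key, V value) {
    cache.put(key, new Node<>(value, System.currentTimeMillis()));
  }

  synchronized int size() {
    return cache.size();
  }
}
```

A caller would key the cache on the request parameters, return the cached response on a hit, and fall back to the normal getApps path on a miss. The expiry bounds how stale a cached response can be.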






[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584182#comment-17584182
 ] 

ASF GitHub Bot commented on YARN-11276:
---

slfan1989 commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1225566277

   @ayushtkn Can you help review this PR? I feel that LRU Cache can help 
improve the access performance of getApps.










[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584180#comment-17584180
 ] 

ASF GitHub Bot commented on YARN-11276:
---

slfan1989 commented on code in PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#discussion_r953651456


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/LRUCache.java:
##
@@ -0,0 +1,64 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.yarn.util;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+import java.util.Map;
+
+public class LRUCache<K, V> {
+
+  private final long expireTimeMs;
+  private final Map<K, CacheNode<V>> cache;
+
+  public LRUCache(int capacity) {
+this(capacity, -1);
+  }
+
+  public LRUCache(int capacity, long expireTimeMs) {
+cache = new LRUCacheHashMap<>(capacity, true);
+this.expireTimeMs = expireTimeMs;
+  }
+
+  public synchronized V get(K key) {
+CacheNode<V> cacheNode = cache.get(key);
+if (cacheNode != null) {
+  if (expireTimeMs > 0 &&
+  System.currentTimeMillis() > cacheNode.getCacheTime() + expireTimeMs) {

Review Comment:
   indentation ?





> Add lru cache for RMWebServices.getApps
> ---
>
> Key: YARN-11276
> URL: https://issues.apache.org/jira/browse/YARN-11276
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, thousands of apps run at the same time and the result 
> returned by getApps reaches about 10M. Many requests use the same input 
> parameters, so we can add a cache for RMWebServices.getApps to reduce 
> processing delay.
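
The cache discussed in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: it assumes an access-ordered `java.util.LinkedHashMap` for LRU eviction, and the names `ExpiringLruCache`, `CacheNode` and the overloads taking an explicit `nowMs` timestamp are hypothetical (the timestamp overloads exist only to make the expiry check easy to exercise).

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an LRU cache with optional time-based expiry. */
class ExpiringLruCache<K, V> {

  /** Cached value together with the time it was stored. */
  private static final class CacheNode<T> {
    private final T value;
    private final long cacheTimeMs;

    CacheNode(T value, long cacheTimeMs) {
      this.value = value;
      this.cacheTimeMs = cacheTimeMs;
    }
  }

  private final long expireTimeMs;
  private final Map<K, CacheNode<V>> cache;

  /** A non-positive expireTimeMs disables expiry. */
  ExpiringLruCache(final int capacity, long expireTimeMs) {
    this.expireTimeMs = expireTimeMs;
    // accessOrder = true moves an entry to the tail on every get/put,
    // so the head of the map is always the least recently used entry.
    this.cache = new LinkedHashMap<K, CacheNode<V>>(capacity, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<K, CacheNode<V>> eldest) {
        return size() > capacity;
      }
    };
  }

  synchronized V get(K key) {
    return get(key, System.currentTimeMillis());
  }

  /** Returns the cached value, or null if it is absent or has expired. */
  synchronized V get(K key, long nowMs) {
    CacheNode<V> node = cache.get(key);
    if (node == null) {
      return null;
    }
    if (expireTimeMs > 0 && nowMs > node.cacheTimeMs + expireTimeMs) {
      cache.remove(key); // drop the stale entry
      return null;
    }
    return node.value;
  }

  synchronized void put(K key, V value) {
    put(key, value, System.currentTimeMillis());
  }

  synchronized void put(K key, V value, long nowMs) {
    cache.put(key, new CacheNode<>(value, nowMs));
  }

  synchronized int size() {
    return cache.size();
  }
}
```

With a capacity of 2, putting a third key evicts whichever of the first two was read least recently, and an entry older than `expireTimeMs` is treated as a miss on the next `get`.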



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Resolved] (YARN-10993) Move domain specific logic out of CapacitySchedulerConfig

2022-08-24 Thread Jira


 [ 
https://issues.apache.org/jira/browse/YARN-10993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

András Győri resolved YARN-10993.
-
Resolution: Won't Fix

> Move domain specific logic out of CapacitySchedulerConfig
> -
>
> Key: YARN-10993
> URL: https://issues.apache.org/jira/browse/YARN-10993
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Andras Gyori
>Priority: Major
>
> CapacitySchedulerConfig should contain only getters/setters and parsing 
> logic. Everything else should be moved outside of the class to its 
> appropriate location.






[jira] [Created] (YARN-11278) Ambiguous error message in mutation API

2022-08-24 Thread Jira
András Győri created YARN-11278:
---

 Summary: Ambiguous error message in mutation API
 Key: YARN-11278
 URL: https://issues.apache.org/jira/browse/YARN-11278
 Project: Hadoop YARN
  Issue Type: Improvement
  Components: capacity scheduler
Reporter: András Győri


In RMWebServices#updateSchedulerConfiguration, we are checking two 
prerequisites:
{code:java}
if (scheduler instanceof MutableConfScheduler && ((MutableConfScheduler)
scheduler).isConfigurationMutable()) { {code}
However, the error message is misleading in the second case (namely, when the 
configuration is not mutable, e.g. with a FILE_CONFIGURATION_STORE):
{code:java}
} else {
  return Response.status(Status.BAD_REQUEST)
  .entity("Configuration change only supported by " +
  "MutableConfScheduler.")
  .build(); {code}
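
One possible way to remove the ambiguity is to test the two prerequisites separately and report a distinct message for each. The sketch below is only an illustration of that idea, not the fix adopted for this issue: `SchedulerSketch`, `MutableSchedulerSketch` and `MutationPrecheck` are hypothetical stand-ins for the real YARN types, and the wording of the second message is an assumption.

```java
/** Hypothetical stand-in for a scheduler; not the real YARN interface. */
interface SchedulerSketch {
}

/** Hypothetical stand-in for MutableConfScheduler. */
interface MutableSchedulerSketch extends SchedulerSketch {
  boolean isConfigurationMutable();
}

final class MutationPrecheck {

  private MutationPrecheck() {
  }

  /**
   * Returns null when the configuration may be changed, otherwise an
   * error message that names the prerequisite that failed.
   */
  static String check(SchedulerSketch scheduler) {
    // First prerequisite: the scheduler type must support mutation at all.
    if (!(scheduler instanceof MutableSchedulerSketch)) {
      return "Configuration change only supported by MutableConfScheduler.";
    }
    // Second prerequisite: the configured store must allow mutation.
    if (!((MutableSchedulerSketch) scheduler).isConfigurationMutable()) {
      return "Configuration change is not supported because the scheduler "
          + "configuration is not mutable (for example, a file-based "
          + "configuration store is in use).";
    }
    return null;
  }
}
```

A caller could map a non-null result to a BAD_REQUEST response carrying that message, so the user can tell a wrong scheduler type apart from an immutable store.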
 






[jira] [Commented] (YARN-9708) Yarn Router Support DelegationToken

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584142#comment-17584142
 ] 

ASF GitHub Bot commented on YARN-9708:
--

hadoop-yetus commented on PR #4746:
URL: https://github.com/apache/hadoop/pull/4746#issuecomment-1225477736

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 52s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +0 :ok: |  buf  |   0m  0s |  |  buf was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 5 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  14m 58s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  28m 23s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  10m 43s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   9m  5s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m  3s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 26s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 11s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 56s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 25s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  24m 10s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  24m 33s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 39s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 46s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  cc  |   9m 46s |  |  the patch passed  |
   | -1 :x: |  javac  |   9m 46s | [/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/8/artifact/out/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) |  hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 generated 1 new + 740 unchanged - 0 fixed = 741 total (was 740)  |
   | +1 :green_heart: |  compile  |   9m  2s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  cc  |   9m  2s |  |  the patch passed  |
   | -1 :x: |  javac  |   9m  2s | [/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/8/artifact/out/results-compile-javac-hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) |  hadoop-yarn-project_hadoop-yarn-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 3 new + 649 unchanged - 2 fixed = 652 total (was 651)  |
   | -1 :x: |  blanks  |   0m  0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/8/artifact/out/blanks-eol.txt) |  The patch has 10 line(s) that end in blanks. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply  |
   | -0 :warning: |  checkstyle  |   1m 49s | [/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4746/8/artifact/out/results-checkstyle-hadoop-yarn-project_hadoop-yarn.txt) |  hadoop-yarn-project/hadoop-yarn: The patch generated 8 new + 26 unchanged - 2 fixed = 34 total (was 28)  |
   | +1 :green_heart: |  mvnsite  |   4m  1s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 38s |  |  the patch passed with JDK Private Build

[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584139#comment-17584139
 ] 

ASF GitHub Bot commented on YARN-11277:
---

ashutoshcipher commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1225452414

   @leixm - Did you still face the same issue after setting 
`yarn.nodemanager.log.retain-seconds` to a very small value?




> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. Logs 
> that are too large could be deleted directly, without waiting for 
> yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500k+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3 days
>  # yarn.log-aggregation-enable=false
>  
>  
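
The proposal in this issue amounts to a size check that can bypass the retention delay. A rough sketch of such a check is shown below; it is not taken from the pull request. The class `LogDirSizeCheck`, its method names and the `maxBytes` threshold are hypothetical, and a non-positive threshold is assumed to mean that the size-based trigger is disabled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Hypothetical helper deciding whether a container log dir is oversized. */
final class LogDirSizeCheck {

  private LogDirSizeCheck() {
  }

  /** Sums the sizes of all regular files below the given directory. */
  static long sizeOf(Path dir) throws IOException {
    try (Stream<Path> paths = Files.walk(dir)) {
      return paths
          .filter(Files::isRegularFile)
          .mapToLong(p -> {
            try {
              return Files.size(p);
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          })
          .sum();
    }
  }

  /**
   * True when the directory should be deleted immediately instead of
   * waiting for the retention period. maxBytes <= 0 disables the check.
   */
  static boolean shouldDeleteNow(long dirSizeBytes, long maxBytes) {
    return maxBytes > 0 && dirSizeBytes > maxBytes;
  }
}
```

A log handler could call `sizeOf` when an application finishes and schedule the deletion with zero delay if `shouldDeleteNow` returns true, falling back to the configured retention period otherwise.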






[jira] [Commented] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584129#comment-17584129
 ] 

ASF GitHub Bot commented on YARN-11277:
---

leixm commented on PR #4797:
URL: https://github.com/apache/hadoop/pull/4797#issuecomment-1225418277

   @slfan1989 could you please review this PR?




> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. Logs 
> that are too large could be deleted directly, without waiting for 
> yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500k+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3 days
>  # yarn.log-aggregation-enable=false
>  
>  






[jira] [Updated] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated YARN-11277:
--
Labels: pull-request-available  (was: )

> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. Logs 
> that are too large could be deleted directly, without waiting for 
> yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500k+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3 days
>  # yarn.log-aggregation-enable=false
>  
>  






[jira] [Updated] (YARN-11277) trigger deletion of log-dir by size for NonAggregatingLogHandler

2022-08-24 Thread Xianming Lei (Jira)


 [ 
https://issues.apache.org/jira/browse/YARN-11277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xianming Lei updated YARN-11277:

Summary: trigger deletion of log-dir by size for NonAggregatingLogHandler  
(was: Add trigger log-dir deletion by size for NonAggregatingLogHandler)

> trigger deletion of log-dir by size for NonAggregatingLogHandler
> 
>
> Key: YARN-11277
> URL: https://issues.apache.org/jira/browse/YARN-11277
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: nodemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>
> In our YARN cluster, the log files of some containers are too large, which 
> causes the NodeManager to frequently switch to the unhealthy state. Logs 
> that are too large could be deleted directly, without waiting for 
> yarn.nodemanager.log.retain-seconds to elapse.
> Cluster environment:
>  # 8k+ nodes
>  # 500k+ apps / day
> Configuration:
>  # yarn.nodemanager.log.retain-seconds=3 days
>  # yarn.log-aggregation-enable=false
>  
>  






[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584091#comment-17584091
 ] 

ASF GitHub Bot commented on YARN-11276:
---

hadoop-yetus commented on PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#issuecomment-1225323072

   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   1m  2s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 2 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  15m 28s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  25m 35s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   9m 57s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   8m 56s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   2m 20s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   4m 50s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 32s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   4m 19s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m 19s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m 27s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   2m 17s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m  7s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   9m  7s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 33s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   8m 33s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m 50s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   3m 39s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   3m 36s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   3m 25s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   7m  2s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 30s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 38s |  |  hadoop-yarn-api in the patch passed.  |
   | +1 :green_heart: |  unit  |   5m 13s |  |  hadoop-yarn-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 102m 25s |  |  hadoop-yarn-server-resourcemanager in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m 17s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 281m 28s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/6/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4793 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint |
   | uname | Linux ebeea3a84cd8 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 3d0ab7d0e0d7406a3afb7b27f70f3e4e87c09a08 |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4793/6/te

[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584041#comment-17584041
 ] 

ASF GitHub Bot commented on YARN-11276:
---

slfan1989 commented on code in PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#discussion_r953421456


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/conf/YarnConfiguration.java:
##
@@ -4681,6 +4681,18 @@ public static boolean areNodeLabelsEnabled(
   public static final String DEFAULT_YARN_WORKFLOW_ID_TAG_PREFIX =
   "workflowid:";
 
+  public static final String APPS_CACHE_ENABLE = YARN_PREFIX + "apps.cache.enable";

Review Comment:
   Can we add some parameter descriptions? 
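
As an illustration of what the reviewer is asking for, a documented configuration key might look like the sketch below. Only `APPS_CACHE_ENABLE` and its key string come from the quoted diff; the holder class `AppsCacheConfigSketch`, the Javadoc wording and the `DEFAULT_APPS_CACHE_ENABLE` constant with its value are assumptions made for this example.

```java
/** Hypothetical holder class; in the diff the key lives in YarnConfiguration. */
final class AppsCacheConfigSketch {

  /** Common prefix of YARN configuration keys. */
  static final String YARN_PREFIX = "yarn.";

  /**
   * Whether the results of RMWebServices.getApps are cached so that
   * repeated requests with the same input parameters can be served
   * without recomputing the response.
   */
  static final String APPS_CACHE_ENABLE = YARN_PREFIX + "apps.cache.enable";

  /**
   * Assumed default (caching disabled). The default chosen in the
   * pull request is not shown in the quoted diff.
   */
  static final boolean DEFAULT_APPS_CACHE_ENABLE = false;

  private AppsCacheConfigSketch() {
  }
}
```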





> Add lru cache for RMWebServices.getApps
> ---
>
> Key: YARN-11276
> URL: https://issues.apache.org/jira/browse/YARN-11276
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, thousands of apps run at the same time and the result 
> returned by getApps reaches about 10M. Many requests use the same input 
> parameters, so we can add a cache for RMWebServices.getApps to reduce 
> processing delay.






[jira] [Commented] (YARN-11276) Add lru cache for RMWebServices.getApps

2022-08-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17584042#comment-17584042
 ] 

ASF GitHub Bot commented on YARN-11276:
---

slfan1989 commented on code in PR #4793:
URL: https://github.com/apache/hadoop/pull/4793#discussion_r953421783


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/util/AppsCacheKey.java:
##
@@ -0,0 +1,142 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.yarn.util;
+
+import org.apache.commons.lang3.builder.EqualsBuilder;
+import org.apache.commons.lang3.builder.HashCodeBuilder;
+import org.apache.hadoop.security.UserGroupInformation;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Set;
+
+public class AppsCacheKey {
+  private static final Logger LOG =
+  LoggerFactory.getLogger(AppsCacheKey.class.getName());

Review Comment:
   single line
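
For context, a cache key only works in a map-backed cache if requests with the same parameters compare equal and hash alike. The sketch below shows that contract using `java.util.Objects` in place of the commons-lang3 builders imported in the diff. The class name `AppsCacheKeySketch` and the fields `user` and `states` are hypothetical; the fields of the real `AppsCacheKey` are not visible in the quoted excerpt.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical cache key: equal request parameters give equal keys. */
final class AppsCacheKeySketch {

  private final String user;        // hypothetical field
  private final Set<String> states; // hypothetical field

  AppsCacheKeySketch(String user, Set<String> states) {
    this.user = user;
    // Defensive copy so later changes to the caller's set cannot alter the key.
    this.states = Collections.unmodifiableSet(new HashSet<>(states));
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) {
      return true;
    }
    if (!(o instanceof AppsCacheKeySketch)) {
      return false;
    }
    AppsCacheKeySketch other = (AppsCacheKeySketch) o;
    return Objects.equals(user, other.user) && states.equals(other.states);
  }

  @Override
  public int hashCode() {
    return Objects.hash(user, states);
  }
}
```

Because the states are held in a set, two requests that list the same states in a different order map to the same key.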





> Add lru cache for RMWebServices.getApps
> ---
>
> Key: YARN-11276
> URL: https://issues.apache.org/jira/browse/YARN-11276
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: resourcemanager
>Affects Versions: 3.4.0
>Reporter: Xianming Lei
>Priority: Minor
>  Labels: pull-request-available
>
> In our YARN cluster, thousands of apps run at the same time and the result 
> returned by getApps reaches about 10M. Many requests use the same input 
> parameters, so we can add a cache for RMWebServices.getApps to reduce 
> processing delay.


