[jira] [Commented] (YARN-11614) [Federation] Add Federation PolicyManager Validation Rules

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786465#comment-17786465
 ] 

ASF GitHub Bot commented on YARN-11614:
---

goiri commented on code in PR #6271:
URL: https://github.com/apache/hadoop/pull/6271#discussion_r1394634988


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/FederationQueueWeight.java:
##
@@ -170,6 +170,26 @@ public static void checkHeadRoomAlphaValid(String headRoomAlpha) throws YarnExce
     }
   }
 
+  public static void checkPolicyManagerValid(String policyManager) throws YarnException {
+    switch (policyManager) {
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager":

Review Comment:
   Can we use getClass().getSimpleName() or a similar method instead of the hard-coded class name?
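
A minimal sketch of one way to read this suggestion, under stated assumptions: the name is derived from the class literal rather than typed out as a string. `PolicyNames` and the empty stand-in class are hypothetical and exist only so the snippet compiles on its own; the thread does not say whether the real PolicyManager classes are visible from the module that holds FederationQueueWeight.

```java
// Hypothetical stand-in; the real class lives in
// org.apache.hadoop.yarn.server.federation.policies.manager.
class HashBroadcastPolicyManager { }

class PolicyNames {
  // Class#getName() includes the package; getSimpleName() would drop it.
  static final String HASH_BROADCAST = HashBroadcastPolicyManager.class.getName();

  // A method call is not a compile-time constant, so HASH_BROADCAST cannot
  // be used as a `case` label; compare with equals() instead.
  static boolean isHashBroadcast(String policyManager) {
    return HASH_BROADCAST.equals(policyManager);
  }
}
```

Because the derived name is not a constant expression, this approach implies replacing the switch with equals() checks or a set lookup.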



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/FederationQueueWeight.java:
##
@@ -170,6 +170,26 @@ public static void checkHeadRoomAlphaValid(String headRoomAlpha) throws YarnExce
     }
   }
 
+  public static void checkPolicyManagerValid(String policyManager) throws YarnException {
+    switch (policyManager) {
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager":
+        throw new YarnException("HashBroadcastPolicyManager does not support the use of queue weights.");
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HomePolicyManager":
+        throw new YarnException("HomePolicyManager does not support the use of queue weights.");

Review Comment:
   I think we can keep a list of the policy managers that support weights and,
   if the given one is not in it, throw the exception.
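
The allow-list idea in this comment could look roughly like the sketch below. It is not the code from the PR: `SketchYarnException` stands in for YarnException so the snippet is self-contained, `PolicyManagerValidator` is a hypothetical holder class, and the two entries in the set are an assumption about which managers accept weights, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Stand-in for org.apache.hadoop.yarn.exceptions.YarnException.
class SketchYarnException extends Exception {
  SketchYarnException(String message) {
    super(message);
  }
}

class PolicyManagerValidator {
  private static final String PKG =
      "org.apache.hadoop.yarn.server.federation.policies.manager.";

  // Assumed allow-list: the entries are illustrative, not taken from the PR.
  private static final Set<String> WEIGHT_AWARE_MANAGERS =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
          PKG + "WeightedLocalityPolicyManager",
          PKG + "WeightedHomePolicyManager")));

  // Rejects any policy manager that is not on the allow-list.
  static void checkPolicyManagerValid(String policyManager)
      throws SketchYarnException {
    if (!WEIGHT_AWARE_MANAGERS.contains(policyManager)) {
      throw new SketchYarnException(
          policyManager + " does not support the use of queue weights.");
    }
  }
}
```

With this shape, a newly added manager is rejected by default until it is explicitly listed, which is the behaviour the comment asks for.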





> [Federation] Add Federation PolicyManager Validation Rules
> --
>
> Key: YARN-11614
> URL: https://issues.apache.org/jira/browse/YARN-11614
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> When entering queue weights in Federation, it is essential to enhance the 
> validation rules. If a policy manager does not support weights, a prompt 
> should be provided to the user.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-11485) [Federation] Router Supports Yarn Admin CLI Cmds.

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786466#comment-17786466
 ] 

ASF GitHub Bot commented on YARN-11485:
---

goiri commented on code in PR #6265:
URL: https://github.com/apache/hadoop/pull/6265#discussion_r1394640065


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java:
##
@@ -1039,6 +1114,54 @@ protected String getUsageString() {
     return "Usage: rmadmin";
   }
 
+  /**
+   * Parse subClusterId.
+   * This method will only parse subClusterId when Yarn Federation mode is enabled.
+   *
+   * @param cliParser CommandLine.
+   * @return subClusterId.
+   */
+  protected String parseSubClusterId(CommandLine cliParser) {
+    // If YARN Federation mode is not enabled, return empty.
+    if (!isYarnFederationEnabled(getConf())) {
+      return StringUtils.EMPTY;
+    }
+
+    String subClusterId = cliParser.getOptionValue(OPTION_SUBCLUSTERID);
+    if(StringUtils.isBlank(subClusterId)) {

Review Comment:
   Space missing.





> [Federation] Router Supports Yarn Admin CLI Cmds.
> -
>
> Key: YARN-11485
> URL: https://issues.apache.org/jira/browse/YARN-11485
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> This Jira ticket aims to enhance the Router command by adding support for all 
> Yarn Admin CLI options.






[jira] [Commented] (YARN-11610) [Federation] Add WeightedHomePolicyManager

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786467#comment-17786467
 ] 

ASF GitHub Bot commented on YARN-11610:
---

goiri commented on code in PR #6256:
URL: https://github.com/apache/hadoop/pull/6256#discussion_r1394642086


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/manager/TestWeightedHomePolicyManager.java:
##
@@ -0,0 +1,62 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.yarn.server.federation.policies.manager;
+
+import org.apache.hadoop.yarn.server.federation.policies.amrmproxy.HomeAMRMProxyPolicy;
+import org.apache.hadoop.yarn.server.federation.policies.dao.WeightedPolicyInfo;
+import org.apache.hadoop.yarn.server.federation.policies.router.WeightedRandomRouterPolicy;
+import org.apache.hadoop.yarn.server.federation.store.records.SubClusterId;
+import org.apache.hadoop.yarn.server.federation.store.records.SubClusterIdInfo;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+public class TestWeightedHomePolicyManager extends BasePolicyManagerTest {
+  private WeightedPolicyInfo policyInfo;
+
+  @Before
+  public void setup() {
+    // configure a policy
+    wfp = new WeightedHomePolicyManager();
+    wfp.setQueue("queue1");
+    SubClusterId sc1 = SubClusterId.newInstance("sc1");
+    policyInfo = new WeightedPolicyInfo();
+
+    Map<SubClusterIdInfo, Float> routerWeights = new HashMap<>();
+    routerWeights.put(new SubClusterIdInfo(sc1), 0.2f);
+    policyInfo.setRouterPolicyWeights(routerWeights);
+
+    ((WeightedHomePolicyManager) wfp).setWeightedPolicyInfo(policyInfo);

Review Comment:
   Can we declare wfp as WeightedHomePolicyManager directly to avoid the cast?
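
One possible shape for this, sketched under assumptions: the quoted setup assigns `wfp` without declaring it, so the sketch assumes it is a field inherited from the base test class with an interface type. All classes below are simplified, hypothetical stand-ins for the Hadoop types named in the diff, reduced to the members the snippet needs.

```java
// Hypothetical stand-ins for the Hadoop types named in the diff.
class WeightedPolicyInfo { }

interface FederationPolicyManager {
  void setQueue(String queue);
}

class WeightedHomePolicyManager implements FederationPolicyManager {
  private String queue;
  private WeightedPolicyInfo weightedPolicyInfo;

  @Override
  public void setQueue(String queue) {
    this.queue = queue;
  }

  void setWeightedPolicyInfo(WeightedPolicyInfo info) {
    this.weightedPolicyInfo = info;
  }

  WeightedPolicyInfo getWeightedPolicyInfo() {
    return weightedPolicyInfo;
  }
}

class BasePolicyManagerTest {
  // Assumed: the shared field has the interface type.
  protected FederationPolicyManager wfp;
}

class TestWeightedHomePolicyManagerSketch extends BasePolicyManagerTest {
  void setup() {
    // Configure through a variable of the concrete type, so no cast is needed.
    WeightedHomePolicyManager manager = new WeightedHomePolicyManager();
    manager.setQueue("queue1");
    manager.setWeightedPolicyInfo(new WeightedPolicyInfo());
    // Assign to the inherited field once configuration is complete.
    wfp = manager;
  }
}
```

The concrete-typed local keeps the inherited field usable by shared base-class checks while removing the cast from the setup code.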





> [Federation] Add WeightedHomePolicyManager
> --
>
> Key: YARN-11610
> URL: https://issues.apache.org/jira/browse/YARN-11610
> Project: Hadoop YARN
>  Issue Type: Improvement
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>
> We will add a new PolicyManager. This PolicyManager can select a router based 
> on the weight configured by the user, and then all container requests will be 
> in the home-subcluster.






[jira] [Commented] (YARN-11577) Improve FederationInterceptorREST Method Result

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786470#comment-17786470
 ] 

ASF GitHub Bot commented on YARN-11577:
---

goiri commented on code in PR #6190:
URL: https://github.com/apache/hadoop/pull/6190#discussion_r1394645699


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/subcluster/capacity/TestYarnFederationWithCapacityScheduler.java:
##
@@ -73,4 +147,469 @@ public void testGetClusterInfo() throws InterruptedException, IOException {
       assertTrue(subClusters.contains(clusterInfo.getSubClusterId()));
     }
   }
+
+  @Test
+  public void testInfo() throws InterruptedException, IOException {
+    FederationClusterInfo federationClusterInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS, RM_WEB_SERVICE_PATH + INFO,
+        FederationClusterInfo.class, null, null);
+    List<ClusterInfo> clusterInfos = federationClusterInfo.getList();
+    assertNotNull(clusterInfos);
+    assertEquals(2, clusterInfos.size());
+    for (ClusterInfo clusterInfo : clusterInfos) {
+      assertNotNull(clusterInfo);
+      assertTrue(subClusters.contains(clusterInfo.getSubClusterId()));
+    }
+  }
+
+  @Test
+  public void testClusterUserInfo() throws Exception {
+    FederationClusterUserInfo federationClusterUserInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+        RM_WEB_SERVICE_PATH + CLUSTER_USER_INFO,
+        FederationClusterUserInfo.class, null, null);
+    List<ClusterUserInfo> clusterUserInfos = federationClusterUserInfo.getList();
+    assertNotNull(clusterUserInfos);
+    assertEquals(2, clusterUserInfos.size());
+    for (ClusterUserInfo clusterUserInfo : clusterUserInfos) {
+      assertNotNull(clusterUserInfo);
+      assertTrue(subClusters.contains(clusterUserInfo.getSubClusterId()));
+    }
+  }
+
+  @Test
+  public void testMetricsInfo() throws Exception {
+    // It takes time to start the sub-cluster.
+    // We need to wait for the sub-cluster to be completely started,
+    // so we need to set the waiting time.
+    // The resources of the two sub-clusters we registered are 24C and 12G,
+    // so the resources that the Router should collect are 48C and 24G.
+    GenericTestUtils.waitFor(() -> {
+      try {
+        ClusterMetricsInfo clusterMetricsInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + METRICS, ClusterMetricsInfo.class, null, null);
+        assertNotNull(clusterMetricsInfo);
+        return (48 == clusterMetricsInfo.getTotalVirtualCores() &&
+            24576 == clusterMetricsInfo.getTotalMB());
+      } catch (Exception e) {
+        return false;
+      }
+    }, 5000, 50 * 5000);
+  }
+
+  @Test
+  public void testSchedulerInfo() throws Exception {
+    FederationSchedulerTypeInfo schedulerTypeInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+        RM_WEB_SERVICE_PATH + SCHEDULER, FederationSchedulerTypeInfo.class, null, null);
+    assertNotNull(schedulerTypeInfo);
+    List<SchedulerTypeInfo> schedulerTypeInfos = schedulerTypeInfo.getList();
+    assertNotNull(schedulerTypeInfos);
+    assertEquals(2, schedulerTypeInfos.size());
+    for (SchedulerTypeInfo schedulerTypeInfoItem : schedulerTypeInfos) {
+      assertNotNull(schedulerTypeInfoItem);
+      assertTrue(subClusters.contains(schedulerTypeInfoItem.getSubClusterId()));
+      CapacitySchedulerInfo schedulerInfo =
+          (CapacitySchedulerInfo) schedulerTypeInfoItem.getSchedulerInfo();
+      assertNotNull(schedulerInfo);
+      assertEquals(3, schedulerInfo.getQueues().getQueueInfoList().size());
+    }
+  }
+
+  @Test
+  public void testNodesEmpty() throws Exception {
+    // We are in 2 sub-clusters, each with 3 nodes, so our Router should correspond to 6 nodes.
+    GenericTestUtils.waitFor(() -> {
+      try {
+        NodesInfo nodesInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + NODES, NodesInfo.class, null, null);
+        assertNotNull(nodesInfo);
+        ArrayList<NodeInfo> nodes = nodesInfo.getNodes();
+        assertNotNull(nodes);
+        return (6 == nodes.size());
+      } catch (Exception e) {
+        return false;
+      }
+    }, 5000, 50 * 5000);
+  }
+
+  @Test
+  public void testNodesLost() throws Exception {
+    GenericTestUtils.waitFor(() -> {
+      try {
+        NodesInfo nodesInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + NODES, NodesInfo.class, STATES, "LOST");
+        assertNotNull(nodesInfo);
+        ArrayList<NodeInfo> nodes = nodesInfo.getNodes();
+        assertNotNull(nodes);
+        return (0 == nodes.size());

Review Comment:
   isEmpty?
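
For illustration only, the suggested form could look like the sketch below. `LostNodesCheck` and the `List<String>` element type are hypothetical simplifications; the quoted test works on the list returned by `nodesInfo.getNodes()`.

```java
import java.util.List;

class LostNodesCheck {
  // Same result as 0 == nodes.size(), but states the intent directly.
  static boolean hasNoNodes(List<String> nodes) {
    return nodes.isEmpty();
  }
}
```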



##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server

[jira] [Commented] (YARN-11483) [Federation] Router AdminCLI Supports Clean Finish Apps.

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786559#comment-17786559
 ] 

ASF GitHub Bot commented on YARN-11483:
---

goiri commented on code in PR #6251:
URL: https://github.com/apache/hadoop/pull/6251#discussion_r1395028788


##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/MockResourceManagerFacade.java:
##
@@ -141,46 +141,11 @@
 import org.apache.hadoop.yarn.exceptions.YarnException;
 import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;
 import org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocol;
-import org.apache.hadoop.yarn.server.api.protocolrecords.AddToClusterNodeLabelsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.AddToClusterNodeLabelsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.CheckForDecommissioningNodesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.CheckForDecommissioningNodesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshAdminAclsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshAdminAclsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshClusterMaxPriorityRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshClusterMaxPriorityResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResourcesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResourcesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshQueuesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshQueuesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshServiceAclsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshServiceAclsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshSuperUserGroupsConfigurationRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshSuperUserGroupsConfigurationResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshUserToGroupsMappingsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshUserToGroupsMappingsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RemoveFromClusterNodeLabelsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RemoveFromClusterNodeLabelsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.ReplaceLabelsOnNodeRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.ReplaceLabelsOnNodeResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.UpdateNodeResourceRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.UpdateNodeResourceResponse;
+import org.apache.hadoop.yarn.server.api.protocolrecords.*;

Review Comment:
   Avoid wildcard imports; keep the explicit imports.





> [Federation] Router AdminCLI Supports Clean Finish Apps.
> 
>
> Key: YARN-11483
> URL: https://issues.apache.org/jira/browse/YARN-11483
> Project: Hadoop YARN
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 3.4.0
>Reporter: Shilun Fan
>Assignee: Shilun Fan
>Priority: Major
>  Labels: pull-request-available
>



