[jira] [Commented] (YARN-11614) [Federation] Add Federation PolicyManager Validation Rules
[ https://issues.apache.org/jira/browse/YARN-11614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786465#comment-17786465 ]

ASF GitHub Bot commented on YARN-11614:
---

goiri commented on code in PR #6271:
URL: https://github.com/apache/hadoop/pull/6271#discussion_r1394634988

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/FederationQueueWeight.java:
##
@@ -170,6 +170,26 @@ public static void checkHeadRoomAlphaValid(String headRoomAlpha) throws YarnExce
     }
   }
 
+  public static void checkPolicyManagerValid(String policyManager) throws YarnException {
+    switch (policyManager) {
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager":

Review Comment:
   Can we use some of the getClass().getSimpleName() or similar methods?

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-api/src/main/java/org/apache/hadoop/yarn/server/api/protocolrecords/FederationQueueWeight.java:
##
@@ -170,6 +170,26 @@ public static void checkHeadRoomAlphaValid(String headRoomAlpha) throws YarnExce
     }
   }
 
+  public static void checkPolicyManagerValid(String policyManager) throws YarnException {
+    switch (policyManager) {
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HashBroadcastPolicyManager":
+        throw new YarnException("HashBroadcastPolicyManager does not support the use of queue weights.");
+      case "org.apache.hadoop.yarn.server.federation.policies.manager.HomePolicyManager":
+        throw new YarnException("HomePolicyManager does not support the use of queue weights.");

Review Comment:
   I think we can do a list for the ones that support weights, and if it's not there, return the exception.

> [Federation] Add Federation PolicyManager Validation Rules
> --
>
>                 Key: YARN-11614
>                 URL: https://issues.apache.org/jira/browse/YARN-11614
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation
>    Affects Versions: 3.4.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Major
>              Labels: pull-request-available
>
> When entering queue weights in Federation, it is essential to enhance the
> validation rules. If a policy manager does not support weights, a prompt
> should be provided to the user.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
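
For illustration only, the allow-list idea from the second review comment on PR #6271 could look roughly like the sketch below. This is not the code that was merged. The class name PolicyManagerWeightCheck, the helper simpleName, and the contents of the allow-list are assumptions made for this example, and IllegalArgumentException stands in for YarnException so that the snippet compiles without Hadoop on the classpath. Class names are kept as strings, as in the diff under review, and the short name for the message is derived from the string rather than from getClass().getSimpleName().

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Hadoop: validates a policy manager class
// name against an allow-list of managers that accept queue weights.
final class PolicyManagerWeightCheck {

  private static final String PKG =
      "org.apache.hadoop.yarn.server.federation.policies.manager.";

  // Assumption: the entries below are an illustrative allow-list only.
  // The real set of weight-aware managers must be taken from the project.
  private static final Set<String> WEIGHT_AWARE_MANAGERS =
      Collections.unmodifiableSet(new HashSet<>(Arrays.asList(
          PKG + "WeightedLocalityPolicyManager",
          PKG + "PriorityBroadcastPolicyManager")));

  private PolicyManagerWeightCheck() {
  }

  // Derives the short class name from a fully qualified name string.
  static String simpleName(String className) {
    int idx = className.lastIndexOf('.');
    return idx < 0 ? className : className.substring(idx + 1);
  }

  static boolean supportsWeights(String policyManager) {
    return policyManager != null && WEIGHT_AWARE_MANAGERS.contains(policyManager);
  }

  // Rejects every manager that is not on the allow-list, so a newly added
  // manager is refused by default until it is listed explicitly.
  static void checkPolicyManagerValid(String policyManager) {
    if (!supportsWeights(policyManager)) {
      String name = policyManager == null ? "null" : simpleName(policyManager);
      throw new IllegalArgumentException(
          name + " does not support the use of queue weights.");
    }
  }
}
```

With this shape, the two managers named in the diff (HashBroadcastPolicyManager and HomePolicyManager) are rejected without needing a case of their own, and the error text matches the wording used in the PR.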
[jira] [Commented] (YARN-11485) [Federation] Router Supports Yarn Admin CLI Cmds.
[ https://issues.apache.org/jira/browse/YARN-11485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786466#comment-17786466 ]

ASF GitHub Bot commented on YARN-11485:
---

goiri commented on code in PR #6265:
URL: https://github.com/apache/hadoop/pull/6265#discussion_r1394640065

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-client/src/main/java/org/apache/hadoop/yarn/client/cli/RMAdminCLI.java:
##
@@ -1039,6 +1114,54 @@ protected String getUsageString() {
     return "Usage: rmadmin";
   }
 
+  /**
+   * Parse subClusterId.
+   * This method will only parse subClusterId when Yarn Federation mode is enabled.
+   *
+   * @param cliParser CommandLine.
+   * @return subClusterId.
+   */
+  protected String parseSubClusterId(CommandLine cliParser) {
+    // If YARN Federation mode is not enabled, return empty.
+    if (!isYarnFederationEnabled(getConf())) {
+      return StringUtils.EMPTY;
+    }
+
+    String subClusterId = cliParser.getOptionValue(OPTION_SUBCLUSTERID);
+    if(StringUtils.isBlank(subClusterId)) {

Review Comment:
   Space missing.

> [Federation] Router Supports Yarn Admin CLI Cmds.
> -
>
>                 Key: YARN-11485
>                 URL: https://issues.apache.org/jira/browse/YARN-11485
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation
>    Affects Versions: 3.4.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Major
>              Labels: pull-request-available
>
> This Jira ticket aims to enhance the Router command by adding support for all
> Yarn Admin CLI options.
[jira] [Commented] (YARN-11610) [Federation] Add WeightedHomePolicyManager
[ https://issues.apache.org/jira/browse/YARN-11610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786467#comment-17786467 ]

ASF GitHub Bot commented on YARN-11610:
---

goiri commented on code in PR #6256:
URL: https://github.com/apache/hadoop/pull/6256#discussion_r1394642086

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/federation/policies/manager/TestWeightedHomePolicyManager.java:
##
@@ -0,0 +1,62 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+package org.apache.hadoop.yarn.server.federation.policies.manager;
+
+import org.apache.hadoop.yarn.server.federation.policies.amrmproxy.HomeAMRMProxyPolicy;
+import org.apache.hadoop.yarn.server.federation.policies.dao.WeightedPolicyInfo;
+import org.apache.hadoop.yarn.server.federation.policies.router.WeightedRandomRouterPolicy;
+import org.apache.hadoop.yarn.server.federation.store.records.SubClusterId;
+import org.apache.hadoop.yarn.server.federation.store.records.SubClusterIdInfo;
+import org.junit.Assert;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.util.HashMap;
+import java.util.Map;
+
+public class TestWeightedHomePolicyManager extends BasePolicyManagerTest {
+  private WeightedPolicyInfo policyInfo;
+
+  @Before
+  public void setup() {
+    // configure a policy
+    wfp = new WeightedHomePolicyManager();
+    wfp.setQueue("queue1");
+    SubClusterId sc1 = SubClusterId.newInstance("sc1");
+    policyInfo = new WeightedPolicyInfo();
+
+    Map<SubClusterIdInfo, Float> routerWeights = new HashMap<>();
+    routerWeights.put(new SubClusterIdInfo(sc1), 0.2f);
+    policyInfo.setRouterPolicyWeights(routerWeights);
+
+    ((WeightedHomePolicyManager) wfp).setWeightedPolicyInfo(policyInfo);

Review Comment:
   Can we declare wfp as WeightedHomePolicyManager directly to avoid the casting around?

> [Federation] Add WeightedHomePolicyManager
> --
>
>                 Key: YARN-11610
>                 URL: https://issues.apache.org/jira/browse/YARN-11610
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: federation
>    Affects Versions: 3.4.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Major
>              Labels: pull-request-available
>
> We will add a new PolicyManager. This PolicyManager can select a router based
> on the weight configured by the user, and then all container requests will be
> in the home-subcluster.
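
One way to read the review comment on PR #6256, sketched with stand-in types: configure the manager through a local or field of the concrete type, then assign it to the base-typed field, so no cast is needed. Everything in this snippet is hypothetical (BaseManager, WeightedManager, BaseTest and WeightedTest are placeholders for the Hadoop classes), and it assumes, as the cast in the diff suggests, that wfp is a field of a more general type inherited from the base test class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the Hadoop test classes; names are illustrative.
final class TypedFieldSketch {

  private TypedFieldSketch() {
  }

  // Stands in for the general policy manager type.
  static class BaseManager {
    private String queue;

    void setQueue(String queue) {
      this.queue = queue;
    }

    String getQueue() {
      return queue;
    }
  }

  // Stands in for the weighted manager with an extra, subtype-only setter.
  static class WeightedManager extends BaseManager {
    private Map<String, Float> weights = new HashMap<>();

    void setWeights(Map<String, Float> weights) {
      this.weights = weights;
    }

    Map<String, Float> getWeights() {
      return weights;
    }
  }

  // Stands in for the base test, which only knows the general type.
  static class BaseTest {
    protected BaseManager wfp;
  }

  static class WeightedTest extends BaseTest {
    // Concrete-typed reference: subtype-only setters can be called directly.
    private WeightedManager weighted;

    void setup() {
      weighted = new WeightedManager();
      weighted.setQueue("queue1");

      Map<String, Float> routerWeights = new HashMap<>();
      routerWeights.put("sc1", 0.2f);
      weighted.setWeights(routerWeights); // no cast required

      // The base-typed field still receives the same instance.
      wfp = weighted;
    }

    WeightedManager manager() {
      return weighted;
    }
  }
}
```

Whether the base class field can itself be redeclared with the concrete type depends on how BasePolicyManagerTest uses it, which this excerpt does not show.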
[jira] [Commented] (YARN-11577) Improve FederationInterceptorREST Method Result
[ https://issues.apache.org/jira/browse/YARN-11577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786470#comment-17786470 ]

ASF GitHub Bot commented on YARN-11577:
---

goiri commented on code in PR #6190:
URL: https://github.com/apache/hadoop/pull/6190#discussion_r1394645699

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-router/src/test/java/org/apache/hadoop/yarn/server/router/subcluster/capacity/TestYarnFederationWithCapacityScheduler.java:
##
@@ -73,4 +147,469 @@ public void testGetClusterInfo() throws InterruptedException, IOException {
       assertTrue(subClusters.contains(clusterInfo.getSubClusterId()));
     }
   }
+
+  @Test
+  public void testInfo() throws InterruptedException, IOException {
+    FederationClusterInfo federationClusterInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS, RM_WEB_SERVICE_PATH + INFO,
+        FederationClusterInfo.class, null, null);
+    List<ClusterInfo> clusterInfos = federationClusterInfo.getList();
+    assertNotNull(clusterInfos);
+    assertEquals(2, clusterInfos.size());
+    for (ClusterInfo clusterInfo : clusterInfos) {
+      assertNotNull(clusterInfo);
+      assertTrue(subClusters.contains(clusterInfo.getSubClusterId()));
+    }
+  }
+
+  @Test
+  public void testClusterUserInfo() throws Exception {
+    FederationClusterUserInfo federationClusterUserInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+        RM_WEB_SERVICE_PATH + CLUSTER_USER_INFO,
+        FederationClusterUserInfo.class, null, null);
+    List<ClusterUserInfo> clusterUserInfos = federationClusterUserInfo.getList();
+    assertNotNull(clusterUserInfos);
+    assertEquals(2, clusterUserInfos.size());
+    for (ClusterUserInfo clusterUserInfo : clusterUserInfos) {
+      assertNotNull(clusterUserInfo);
+      assertTrue(subClusters.contains(clusterUserInfo.getSubClusterId()));
+    }
+  }
+
+  @Test
+  public void testMetricsInfo() throws Exception {
+    // It takes time to start the sub-cluster.
+    // We need to wait for the sub-cluster to be completely started,
+    // so we need to set the waiting time.
+    // The resources of the two sub-clusters we registered are 24C and 12G,
+    // so the resources that the Router should collect are 48C and 24G.
+    GenericTestUtils.waitFor(() -> {
+      try {
+        ClusterMetricsInfo clusterMetricsInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + METRICS, ClusterMetricsInfo.class, null, null);
+        assertNotNull(clusterMetricsInfo);
+        return (48 == clusterMetricsInfo.getTotalVirtualCores() &&
+            24576 == clusterMetricsInfo.getTotalMB());
+      } catch (Exception e) {
+        return false;
+      }
+    }, 5000, 50 * 5000);
+  }
+
+  @Test
+  public void testSchedulerInfo() throws Exception {
+    FederationSchedulerTypeInfo schedulerTypeInfo =
+        TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+        RM_WEB_SERVICE_PATH + SCHEDULER, FederationSchedulerTypeInfo.class, null, null);
+    assertNotNull(schedulerTypeInfo);
+    List<SchedulerTypeInfo> schedulerTypeInfos = schedulerTypeInfo.getList();
+    assertNotNull(schedulerTypeInfos);
+    assertEquals(2, schedulerTypeInfos.size());
+    for (SchedulerTypeInfo schedulerTypeInfoItem : schedulerTypeInfos) {
+      assertNotNull(schedulerTypeInfoItem);
+      assertTrue(subClusters.contains(schedulerTypeInfoItem.getSubClusterId()));
+      CapacitySchedulerInfo schedulerInfo =
+          (CapacitySchedulerInfo) schedulerTypeInfoItem.getSchedulerInfo();
+      assertNotNull(schedulerInfo);
+      assertEquals(3, schedulerInfo.getQueues().getQueueInfoList().size());
+    }
+  }
+
+  @Test
+  public void testNodesEmpty() throws Exception {
+    // We are in 2 sub-clusters, each with 3 nodes, so our Router should correspond to 6 nodes.
+    GenericTestUtils.waitFor(() -> {
+      try {
+        NodesInfo nodesInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + NODES, NodesInfo.class, null, null);
+        assertNotNull(nodesInfo);
+        ArrayList<NodeInfo> nodes = nodesInfo.getNodes();
+        assertNotNull(nodes);
+        return (6 == nodes.size());
+      } catch (Exception e) {
+        return false;
+      }
+    }, 5000, 50 * 5000);
+  }
+
+  @Test
+  public void testNodesLost() throws Exception {
+    GenericTestUtils.waitFor(() -> {
+      try {
+        NodesInfo nodesInfo =
+            TestFederationSubCluster.performGetCalls(ROUTER_WEB_ADDRESS,
+            RM_WEB_SERVICE_PATH + NODES, NodesInfo.class, STATES, "LOST");
+        assertNotNull(nodesInfo);
+        ArrayList<NodeInfo> nodes = nodesInfo.getNodes();
+        assertNotNull(nodes);
+        return (0 == nodes.size());

Review Comment:
   isEmpty?

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server
[jira] [Commented] (YARN-11483) [Federation] Router AdminCLI Supports Clean Finish Apps.
[ https://issues.apache.org/jira/browse/YARN-11483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17786559#comment-17786559 ]

ASF GitHub Bot commented on YARN-11483:
---

goiri commented on code in PR #6251:
URL: https://github.com/apache/hadoop/pull/6251#discussion_r1395028788

##
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/MockResourceManagerFacade.java:
##
@@ -141,46 +141,11 @@
 import org.apache.hadoop.yarn.exceptions.YarnException;
 import org.apache.hadoop.yarn.security.AMRMTokenIdentifier;
 import org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocol;
-import org.apache.hadoop.yarn.server.api.protocolrecords.AddToClusterNodeLabelsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.AddToClusterNodeLabelsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.CheckForDecommissioningNodesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.CheckForDecommissioningNodesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshAdminAclsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshAdminAclsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshClusterMaxPriorityRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshClusterMaxPriorityResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResourcesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResourcesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshNodesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshQueuesRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshQueuesResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshServiceAclsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshServiceAclsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshSuperUserGroupsConfigurationRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshSuperUserGroupsConfigurationResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshUserToGroupsMappingsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RefreshUserToGroupsMappingsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RemoveFromClusterNodeLabelsRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.RemoveFromClusterNodeLabelsResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.ReplaceLabelsOnNodeRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.ReplaceLabelsOnNodeResponse;
-import org.apache.hadoop.yarn.server.api.protocolrecords.UpdateNodeResourceRequest;
-import org.apache.hadoop.yarn.server.api.protocolrecords.UpdateNodeResourceResponse;
+import org.apache.hadoop.yarn.server.api.protocolrecords.*;

Review Comment:
   Avoid

> [Federation] Router AdminCLI Supports Clean Finish Apps.
> 
>
>                 Key: YARN-11483
>                 URL: https://issues.apache.org/jira/browse/YARN-11483
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: federation
>    Affects Versions: 3.4.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Major
>              Labels: pull-request-available
>