[ https://issues.apache.org/jira/browse/YARN-10589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17276369#comment-17276369 ]
zhuqi commented on YARN-10589: ------------------------------ [~ztang] [~tanu.ajmera] I have reviewed the patch, i don't think the logic is needed, and it break the PRIORITY_SKIPPED logic. In Schedule in priority order: {code:java} // Schedule in priority order for (SchedulerRequestKey schedulerKey : application.getSchedulerKeys()) { ContainerAllocation result = allocate(clusterResource, candidates, schedulingMode, resourceLimits, schedulerKey, null); AllocationState allocationState = result.getAllocationState(); if (allocationState == AllocationState.PRIORITY_SKIPPED) { continue; } return getCSAssignmentFromAllocateResult(clusterResource, result, null, node); } if (reservedContainer == null) { result = preCheckForNodeCandidateSet(clusterResource, node, schedulingMode, resourceLimits, schedulerKey); if (null != result) { continue; } } {code} I think the original logic is right, in preCheckForNodeCandidateSet: {code:java} // Is the nodePartition of pending request matches the node's partition // If not match, jump to next priority. Optional<DiagnosticsCollector> dcOpt = activitiesManager == null ? Optional.empty() : activitiesManager.getOptionalDiagnosticsCollector(); if (!appInfo.precheckNode(schedulerKey, node, schedulingMode, dcOpt)) { ActivitiesLogger.APP.recordSkippedAppActivityWithoutAllocation( activitiesManager, node, application, schedulerKey, ActivityDiagnosticConstant. NODE_DO_NOT_MATCH_PARTITION_OR_PLACEMENT_CONSTRAINTS + ActivitiesManager.getDiagnostics(dcOpt), ActivityLevel.NODE); return ContainerAllocation.PRIORITY_SKIPPED; } {code} We have skipped the priority, and the next priority may need this node partition. We just skipped the priority of application. Any other thoughts? > Improve logic of multi-node allocation > -------------------------------------- > > Key: YARN-10589 > URL: https://issues.apache.org/jira/browse/YARN-10589 > Project: Hadoop YARN > Issue Type: Task > Affects Versions: 3.3.0 > Reporter: Tanu Ajmera > Assignee: Tanu Ajmera > Priority: Major > Attachments: YARN-10589-001.patch > > > {code:java} > for (String partititon : partitions) { > if (current++ > start) { > break; > } > CandidateNodeSet<FiCaSchedulerNode> candidates = > cs.getCandidateNodeSet(partititon); > if (candidates == null) { > continue; > } > cs.allocateContainersToNode(candidates, false); > }{code} > In above logic, if we have thousands of node in one partition, we will still > repeatedly access all nodes of the partition thousands of times. There is no > break point where if the partition is not same for the first node, it should > stop checking other nodes in that partition. -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org