[jira] [Updated] (YARN-6948) Invalid event: ATTEMPT_ADDED at FINAL_SAVING
[ https://issues.apache.org/jira/browse/YARN-6948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

lujie updated YARN-6948:
    Attachment: yarn-6948.png

> Invalid event: ATTEMPT_ADDED at FINAL_SAVING
>
> Key: YARN-6948
> URL: https://issues.apache.org/jira/browse/YARN-6948
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.8.0
> Reporter: lujie
> Attachments: yarn-6948.png
>
> When I sent a kill command to a running job, I checked the logs and found this exception:
> {code:java}
> 2017-08-03 01:35:20,485 ERROR org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: ATTEMPT_ADDED at FINAL_SAVING
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:757)
> at org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl.handle(RMAppAttemptImpl.java:106)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:834)
> at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$ApplicationAttemptEventDispatcher.handle(ResourceManager.java:815)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
> at java.lang.Thread.run(Thread.java:745)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
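
The log above shows a table-driven state machine rejecting an event that has no transition registered for the current state. The following is a minimal sketch of that mechanism only, assuming a far simpler design than YARN's `StateMachineFactory`. The class `TinyStateMachine`, the exception `InvalidTransitionException`, and the transition table are hypothetical; only the names `ATTEMPT_ADDED` and `FINAL_SAVING` come from the report. The ordering used here (kill first, then a late `ATTEMPT_ADDED`) is an illustrative guess, not a confirmed root cause.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, reduced set of states and events for illustration.
enum State { SUBMITTED, SCHEDULED, FINAL_SAVING, KILLED }
enum Event { ATTEMPT_ADDED, KILL, ATTEMPT_UPDATE_SAVED }

// Stand-in for the exception type seen in the log (name is an assumption).
class InvalidTransitionException extends RuntimeException {
  InvalidTransitionException(State state, Event event) {
    super("Invalid event: " + event + " at " + state);
  }
}

class TinyStateMachine {
  // Transition table: current state -> (event -> next state).
  private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
  private State current;

  TinyStateMachine(State initial) {
    this.current = initial;
  }

  TinyStateMachine addTransition(State from, Event on, State to) {
    table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(on, to);
    return this;
  }

  // Looks up the (state, event) pair; an unregistered pair is rejected.
  State handle(Event event) {
    Map<Event, State> row = table.get(current);
    State next = (row == null) ? null : row.get(event);
    if (next == null) {
      throw new InvalidTransitionException(current, event);
    }
    current = next;
    return current;
  }

  State getCurrent() {
    return current;
  }

  // Illustrative table, NOT the real RMAppAttemptImpl transitions.
  static TinyStateMachine example() {
    return new TinyStateMachine(State.SUBMITTED)
        .addTransition(State.SUBMITTED, Event.ATTEMPT_ADDED, State.SCHEDULED)
        .addTransition(State.SUBMITTED, Event.KILL, State.FINAL_SAVING)
        .addTransition(State.FINAL_SAVING, Event.ATTEMPT_UPDATE_SAVED, State.KILLED);
  }
}
```

In this sketch, calling `handle(Event.KILL)` and then `handle(Event.ATTEMPT_ADDED)` on `TinyStateMachine.example()` throws, because the `FINAL_SAVING` row has no entry for `ATTEMPT_ADDED`. A state machine built this way either needs an explicit transition for every event that can arrive late, or it reports the event as invalid.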
[jira] [Commented] (YARN-7561) Why hasContainerForNode() return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266421#comment-16266421 ]

wuchang commented on YARN-7561:
---

[~yufeigu] [~templedf] Would you please give me some suggestions? Thank you very much.

> Why hasContainerForNode() return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?
> ---
>
> Key: YARN-7561
> URL: https://issues.apache.org/jira/browse/YARN-7561
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.3
> Reporter: wuchang
>
> I am studying the FairScheduler source code of YARN 2.7.3.
> Class FSAppAttempt contains this method:
> {code}
> public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
>   ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
>   ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
>   ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());
>
>   return
>       // There must be outstanding requests at the given priority:
>       anyRequest != null && anyRequest.getNumContainers() > 0 &&
>       // If locality relaxation is turned off at *-level, there must be a
>       // non-zero request for the node's rack:
>       (anyRequest.getRelaxLocality() ||
>           (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
>       // If locality relaxation is turned off at rack-level, there must be a
>       // non-zero request at the node:
>       (rackRequest == null || rackRequest.getRelaxLocality() ||
>           (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
>       // The requested container must be able to fit on the node:
>       Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
>           anyRequest.getCapability(), node.getRMNode().getTotalCapability());
> }
> {code}
> I cannot understand why, when there is no anyRequest, *hasContainerForNode()* returns false directly without considering whether there are NODE_LOCAL or RACK_LOCAL requests.
> In addition, *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* also decrease the number of containers for *ResourceRequest.ANY*; this is another place where I am confused.
> Thanks in advance for any hints.
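
The behavior asked about can be reproduced in isolation. The following is a minimal sketch, assuming plain fields in place of YARN's `ResourceRequest` and a single memory value in place of `Resource` and `Resources.lessThanOrEqual`. The names `Req` and `LocalityCheck` are hypothetical. The boolean expression follows the quoted method clause by clause, so it shows what the method returns, not why it was designed that way.

```java
// Hypothetical stand-in for ResourceRequest: only the fields the check reads.
final class Req {
  final int numContainers;
  final boolean relaxLocality;
  final int memoryMb;

  Req(int numContainers, boolean relaxLocality, int memoryMb) {
    this.numContainers = numContainers;
    this.relaxLocality = relaxLocality;
    this.memoryMb = memoryMb;
  }
}

final class LocalityCheck {
  // Mirrors the structure of the quoted hasContainerForNode() expression.
  // Capability is reduced to memory only (an assumption of this sketch).
  static boolean hasContainerForNode(Req any, Req rack, Req node, int nodeTotalMemoryMb) {
    return
        // There must be an outstanding ANY request at this priority.
        any != null && any.numContainers > 0
        // With relaxation off at ANY level, the node's rack must be requested.
        && (any.relaxLocality || (rack != null && rack.numContainers > 0))
        // With relaxation off at rack level, the node itself must be requested.
        && (rack == null || rack.relaxLocality || (node != null && node.numContainers > 0))
        // The requested container must fit on the node.
        && any.memoryMb <= nodeTotalMemoryMb;
  }
}
```

With `any == null` the first clause fails and the result is `false` whatever the rack and node requests contain, which is exactly the case raised in the question. The node and rack requests only narrow the answer further when locality relaxation is switched off.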
[jira] [Updated] (YARN-7561) Why hasContainerForNode() return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Description: I am studying the FairScheduler source cod of yarn 2.7.3. By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for *ResourceRequest.ANY*, this is another place where I feel confused. Really thanks for some prompt. was: I am studying the FairScheduler source cod of yarn 2.7.3. 
By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for *ResourceRequest.ANY*, this is another place where I feel confused. Really thanks for some prompt. > Why hasContainerForNode() return false directly when there is no request of > ANY locality without considering NODE_LOCAL and RACK_LOCAL? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. 
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Summary: Why hasContainerForNode return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL? (was: Why hasContainerForNode return false directly when there is no request of ANY locality?) > Why hasContainerForNode return false directly when there is no request of ANY > locality without considering NODE_LOCAL and RACK_LOCAL? > - > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. > By the code of class FSAppAttempt: > {code} > public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { > ResourceRequest anyRequest = getResourceRequest(prio, > ResourceRequest.ANY); > ResourceRequest rackRequest = getResourceRequest(prio, > node.getRackName()); > ResourceRequest nodeRequest = getResourceRequest(prio, > node.getNodeName()); > > return > // There must be outstanding requests at the given priority: > anyRequest != null && anyRequest.getNumContainers() > 0 && > // If locality relaxation is turned off at *-level, there must be > a > // non-zero request for the node's rack: > (anyRequest.getRelaxLocality() || > (rackRequest != null && rackRequest.getNumContainers() > 0)) > && > // If locality relaxation is turned off at rack-level, there must > be a > // non-zero request at the node: > (rackRequest == null || rackRequest.getRelaxLocality() || > (nodeRequest != null && nodeRequest.getNumContainers() > 0)) > && > // The requested container must be able to fit on the node: > Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, > anyRequest.getCapability(), > node.getRMNode().getTotalCapability()); > } > {code} > I really cannot understand why when there is no anyRequest , > 
*hasContainerForNode()* return false directly without considering whether > there is NODE_LOCAL or RACK_LOCAL requests. > And , *AppSchedulingInfo.allocateNodeLocal()* and > *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of > containers for *ResourceRequest.ANY*, this is another place where I feel > confused. > Really thanks for some prompt.
[jira] [Updated] (YARN-7561) Why hasContainerForNode() return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Summary: Why hasContainerForNode() return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL? (was: Why hasContainerForNode return false directly when there is no request of ANY locality without considering NODE_LOCAL and RACK_LOCAL?) > Why hasContainerForNode() return false directly when there is no request of > ANY locality without considering NODE_LOCAL and RACK_LOCAL? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. > By the code of class FSAppAttempt: > {code} > public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { > ResourceRequest anyRequest = getResourceRequest(prio, > ResourceRequest.ANY); > ResourceRequest rackRequest = getResourceRequest(prio, > node.getRackName()); > ResourceRequest nodeRequest = getResourceRequest(prio, > node.getNodeName()); > > return > // There must be outstanding requests at the given priority: > anyRequest != null && anyRequest.getNumContainers() > 0 && > // If locality relaxation is turned off at *-level, there must be > a > // non-zero request for the node's rack: > (anyRequest.getRelaxLocality() || > (rackRequest != null && rackRequest.getNumContainers() > 0)) > && > // If locality relaxation is turned off at rack-level, there must > be a > // non-zero request at the node: > (rackRequest == null || rackRequest.getRelaxLocality() || > (nodeRequest != null && nodeRequest.getNumContainers() > 0)) > && > // The requested container must be able to fit on the node: > Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, > anyRequest.getCapability(), > node.getRMNode().getTotalCapability()); > } > {code} > I really 
cannot understand why when there is no anyRequest , > *hasContainerForNode()* return false directly without considering whether > there is NODE_LOCAL or RACK_LOCAL requests. > And , *AppSchedulingInfo.allocateNodeLocal()* and > *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of > containers for *ResourceRequest.ANY*, this is another place where I feel > confused. > Really thanks for some prompt.
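
The second point in the description, that a node-local or rack-local allocation also lowers the `ResourceRequest.ANY` count, can be sketched with simple counters. This is a hedged illustration, assuming the ANY entry acts as the total number of containers still wanted at a priority, with node and rack entries as locality preferences within that total. The class `OutstandingRequests` and its methods are hypothetical and are not the real `AppSchedulingInfo` API; the rack-level decrement inside `allocateNodeLocal` is also an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical bookkeeping for one priority: resource name -> containers wanted.
final class OutstandingRequests {
  // ResourceRequest.ANY is the string "*" in YARN.
  static final String ANY = "*";

  private final Map<String, Integer> counts = new HashMap<>();

  void request(String resourceName, int containers) {
    counts.merge(resourceName, containers, Integer::sum);
  }

  int outstanding(String resourceName) {
    return counts.getOrDefault(resourceName, 0);
  }

  // Lowers one counter by one and drops the entry when it reaches zero.
  private void decrement(String resourceName) {
    counts.computeIfPresent(resourceName, (name, n) -> n > 1 ? n - 1 : null);
  }

  // A node-local allocation satisfies the node, its rack, and the total.
  void allocateNodeLocal(String node, String rack) {
    decrement(node);
    decrement(rack);
    decrement(ANY);
  }

  // A rack-local allocation satisfies the rack and the total.
  void allocateRackLocal(String rack) {
    decrement(rack);
    decrement(ANY);
  }

  // An off-switch allocation only lowers the total.
  void allocateOffSwitch() {
    decrement(ANY);
  }
}
```

Under this assumption the two observations in the description fit together: if the ANY count is the total, an application with no ANY request has nothing left to allocate at that priority, and every allocation at any locality level has to reduce that total.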
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Description: I am studying the FairScheduler source cod of yarn 2.7.3. By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for *ResourceRequest.ANY*, this is another place where I feel confused. Really thanks for some prompt. was: I am studying the FairScheduler source cod of yarn 2.7.3. 
By the code of class FSAppAttempt: {quote} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {quote} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for *ResourceRequest.ANY*, this is another place where I feel confused. Really thanks for some prompt. > Why hasContainerForNode return false directly when there is no request of ANY > locality? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. 
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Description: I am studying the FairScheduler source cod of yarn 2.7.3. By the code of class FSAppAttempt: {quote} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {quote} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for *ResourceRequest.ANY*, this is another place where I feel confused. Really thanks for some prompt. was: I am studying the FairScheduler source cod of yarn 2.7.3. 
By the code of class FSAppAttempt: ``` public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } ``` I really cannot understand why when there is no anyRequest , `hasContainerForNode()` return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , `AppSchedulingInfo.allocateNodeLocal()` and `AppSchedulingInfo.allocateRackLocal()` will also decrease the number of containers for `ResourceRequest.ANY`, this is another place where I feel confused. Really thanks for some prompt. > Why hasContainerForNode return false directly when there is no request of ANY > locality? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. 
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Description: I am studying the FairScheduler source cod of yarn 2.7.3. By the code of class FSAppAttempt: ``` public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } ``` I really cannot understand why when there is no anyRequest , `hasContainerForNode()` return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , `AppSchedulingInfo.allocateNodeLocal()` and `AppSchedulingInfo.allocateRackLocal()` will also decrease the number of containers for `ResourceRequest.ANY`, this is another place where I feel confused. Really thanks for some prompt. was: I am studying the FairScheduler source cod of yarn 2.7.3. 
By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY, this is another place where I feel confused. Really thanks for some prompt. > Why hasContainerForNode return false directly when there is no request of ANY > locality? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. 
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wuchang updated YARN-7561: -- Description: I am studying the FairScheduler source cod of yarn 2.7.3. By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering whether there is NODE_LOCAL or RACK_LOCAL requests. And , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY, this is another place where I feel confused. Really thanks for some prompt. was: I am studying the FairScheduler source cod of yarn 2.7.3. 
By the code of class FSAppAttempt: {code} public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) { ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY); ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName()); ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName()); return // There must be outstanding requests at the given priority: anyRequest != null && anyRequest.getNumContainers() > 0 && // If locality relaxation is turned off at *-level, there must be a // non-zero request for the node's rack: (anyRequest.getRelaxLocality() || (rackRequest != null && rackRequest.getNumContainers() > 0)) && // If locality relaxation is turned off at rack-level, there must be a // non-zero request at the node: (rackRequest == null || rackRequest.getRelaxLocality() || (nodeRequest != null && nodeRequest.getNumContainers() > 0)) && // The requested container must be able to fit on the node: Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null, anyRequest.getCapability(), node.getRMNode().getTotalCapability()); } {code} I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY. Really thanks for some prompt. > Why hasContainerForNode return false directly when there is no request of ANY > locality? > --- > > Key: YARN-7561 > URL: https://issues.apache.org/jira/browse/YARN-7561 > Project: Hadoop YARN > Issue Type: Task > Components: fairscheduler >Affects Versions: 2.7.3 >Reporter: wuchang > > I am studying the FairScheduler source cod of yarn 2.7.3. 
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wuchang updated YARN-7561:
--
Description:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

was:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

> Why hasContainerForNode return false directly when there is no request of ANY locality?
> ---
>
> Key: YARN-7561
> URL: https://issues.apache.org/jira/browse/YARN-7561
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.3
> Reporter: wuchang
>
> I am studying the FairScheduler source cod of yarn 2.7.3.
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wuchang updated YARN-7561:
--
Description:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , *hasContainerForNode()* return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

was:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , the method return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , {quote}AppSchedulingInfo.allocateNodeLocal(){quote} and {quote}AppSchedulingInfo.allocateRackLocal(){quote} will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

> Why hasContainerForNode return false directly when there is no request of ANY locality?
> ---
>
> Key: YARN-7561
> URL: https://issues.apache.org/jira/browse/YARN-7561
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.3
> Reporter: wuchang
>
> I am studying the FairScheduler source cod of yarn 2.7.3.
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wuchang updated YARN-7561:
--
Description:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , the method return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , {quote}AppSchedulingInfo.allocateNodeLocal(){quote} and {quote}AppSchedulingInfo.allocateRackLocal(){quote} will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

was:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , the method return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

> Why hasContainerForNode return false directly when there is no request of ANY locality?
> ---
>
> Key: YARN-7561
> URL: https://issues.apache.org/jira/browse/YARN-7561
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.3
> Reporter: wuchang
>
> I am studying the FairScheduler source cod of yarn 2.7.3.
[jira] [Created] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
wuchang created YARN-7561:
-
Summary: Why hasContainerForNode return false directly when there is no request of ANY locality?
Key: YARN-7561
URL: https://issues.apache.org/jira/browse/YARN-7561
Project: Hadoop YARN
Issue Type: Task
Components: fairscheduler
Affects Versions: 2.7.3
Reporter: wuchang

I am studying the FairScheduler source code of YARN 2.7.3.
Class FSAppAttempt contains this method:
{quote}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{quote}
I cannot understand why, when there is no anyRequest, the method returns false directly without considering whether there are NODE_LOCAL or RACK_LOCAL requests. I also noticed that *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* decrease the number of containers for ResourceRequest.ANY as well.
Thanks in advance for any pointers.
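For readers following the question, the boolean structure of the quoted check can be exercised with a small stand-alone model. This is only a sketch under stated assumptions: `Req` is a hypothetical stand-in for ResourceRequest, the capacity check is left out, and nothing here is Hadoop code. It also assumes the usual YARN request convention, in which the ANY request carries the total number of containers wanted at a priority and the rack and node requests are locality hints counted within that total. The reporter's observation that allocateNodeLocal() and allocateRackLocal() also decrement the ANY count is consistent with that convention.

```java
// Simplified stand-alone model of the quoted check; not the Hadoop classes.
final class HasContainerForNodeSketch {

  /** Hypothetical stand-in for ResourceRequest: a container count plus the relaxLocality flag. */
  static final class Req {
    final int numContainers;
    final boolean relaxLocality;

    Req(int numContainers, boolean relaxLocality) {
      this.numContainers = numContainers;
      this.relaxLocality = relaxLocality;
    }
  }

  /** Same boolean structure as the quoted method, minus the capacity check. */
  static boolean hasContainerForNode(Req any, Req rack, Req node) {
    return any != null && any.numContainers > 0
        && (any.relaxLocality || (rack != null && rack.numContainers > 0))
        && (rack == null || rack.relaxLocality
            || (node != null && node.numContainers > 0));
  }

  public static void main(String[] args) {
    // One container wanted in total, with matching rack and node hints.
    System.out.println(hasContainerForNode(
        new Req(1, true), new Req(1, true), new Req(1, true)));
    // ANY count is zero: leftover rack/node entries alone are not treated as demand.
    System.out.println(hasContainerForNode(
        new Req(0, true), new Req(1, true), new Req(1, true)));
    // Strict locality at ANY and rack level, but no request for this node.
    System.out.println(hasContainerForNode(
        new Req(1, false), new Req(1, false), null));
  }
}
```

Under the stated convention, a missing or zero ANY request means no container is outstanding at that priority, which would explain the early `false`; whether that is the authors' intent is not confirmed in this thread.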
-- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-7561) Why hasContainerForNode return false directly when there is no request of ANY locality?
[ https://issues.apache.org/jira/browse/YARN-7561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wuchang updated YARN-7561:
--
Description:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{code}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{code}
I really cannot understand why when there is no anyRequest , the method return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

was:
I am studying the FairScheduler source cod of yarn 2.7.3.
By the code of in class FSAppAttempt:
{quote}
public boolean hasContainerForNode(Priority prio, FSSchedulerNode node) {
  ResourceRequest anyRequest = getResourceRequest(prio, ResourceRequest.ANY);
  ResourceRequest rackRequest = getResourceRequest(prio, node.getRackName());
  ResourceRequest nodeRequest = getResourceRequest(prio, node.getNodeName());

  return
      // There must be outstanding requests at the given priority:
      anyRequest != null && anyRequest.getNumContainers() > 0 &&
      // If locality relaxation is turned off at *-level, there must be a
      // non-zero request for the node's rack:
      (anyRequest.getRelaxLocality() ||
          (rackRequest != null && rackRequest.getNumContainers() > 0)) &&
      // If locality relaxation is turned off at rack-level, there must be a
      // non-zero request at the node:
      (rackRequest == null || rackRequest.getRelaxLocality() ||
          (nodeRequest != null && nodeRequest.getNumContainers() > 0)) &&
      // The requested container must be able to fit on the node:
      Resources.lessThanOrEqual(RESOURCE_CALCULATOR, null,
          anyRequest.getCapability(), node.getRMNode().getTotalCapability());
}
{quote}
I really cannot understand why when there is no anyRequest , the method return false directly without considering where there is NODE_LOCAL or RACK_LOCAL requests, and , *AppSchedulingInfo.allocateNodeLocal()* and *AppSchedulingInfo.allocateRackLocal()* will also decrease the number of containers for ResourceRequest.ANY.
Really thanks for some prompt.

> Why hasContainerForNode return false directly when there is no request of ANY locality?
> ---
>
> Key: YARN-7561
> URL: https://issues.apache.org/jira/browse/YARN-7561
> Project: Hadoop YARN
> Issue Type: Task
> Components: fairscheduler
> Affects Versions: 2.7.3
> Reporter: wuchang
>
> I am studying the FairScheduler source cod of yarn 2.7.3.
[jira] [Commented] (YARN-7535) We should display origin value of demand in fair scheduler page
[ https://issues.apache.org/jira/browse/YARN-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266313#comment-16266313 ]

Wilfred Spiegelenburg commented on YARN-7535:
-

The code has changed in recent versions: there is no updateDemandForApp any more after YARN-6172.
Demand for a queue, as [~yufeigu] explained, should be limited to the maximum the queue can use, so the existing code should be left as is. Changing the calculation would affect the minimum share starvation and some other calculations that use the demand.
Having the extra detail on how high demand really is in a queue could help with tuning. The {{FSAppAttempt}} does not cap it, so we have the info already.
Some considerations:
- We could store the extra detail in the {{leafQueue}}. There would not really be an overhead besides some extra local storage.
- Adding it to the {{parentQueue}} to get it for the whole hierarchy would be possible, but it does involve overhead. We would then also need to choose whether we want the unlimited demand from the child queue or the limited version.
- The scheduler state dump is easily changed.
- Do we want to display this in the web UI? It might be confusing to always show the two numbers, and the state dump would be a much better place because it can be seen over time instead of just one instance.

> We should display origin value of demand in fair scheduler page
> ---
>
> Key: YARN-7535
> URL: https://issues.apache.org/jira/browse/YARN-7535
> Project: Hadoop YARN
> Issue Type: Bug
> Components: scheduler
> Reporter: YunFan Zhou
> Assignee: YunFan Zhou
>
> The *demand* value of a leaf queue shown on the fair scheduler page is only
> the value of *maxResources* whenever the demand is greater than
> *maxResources*, so it doesn't reflect the real situation. When we expand a
> queue, we often rely on seeing the real current demand.
> {code:java}
> private void updateDemandForApp(FSAppAttempt sched, Resource maxRes) {
>   sched.updateDemand();
>   Resource toAdd = sched.getDemand();
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("Counting resource from " + sched.getName() + " " + toAdd
>         + "; Total resource consumption for " + getName() + " now "
>         + demand);
>   }
>   demand = Resources.add(demand, toAdd);
>   demand = Resources.componentwiseMin(demand, maxRes);
> }
> {code}
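The idea of keeping the uncapped figure next to the capped one, as discussed in the comment, could look roughly like the sketch below. Everything in it is hypothetical: the class name, the field names, and the use of a single `long` of megabytes in place of YARN's multi-dimensional Resource are assumptions made for illustration, and this is not how any Hadoop release implements it.

```java
// Hypothetical sketch: a leaf queue tracks raw demand alongside the capped demand.
final class LeafQueueDemandSketch {
  private final long maxResourceMb;
  private long rawDemandMb;    // sum of application demands, never capped
  private long cappedDemandMb; // value the scheduling calculations keep using

  LeafQueueDemandSketch(long maxResourceMb) {
    this.maxResourceMb = maxResourceMb;
  }

  /** Mirrors the add-then-min step of the quoted method, but also keeps the raw sum. */
  void addAppDemand(long appDemandMb) {
    rawDemandMb += appDemandMb;
    cappedDemandMb = Math.min(cappedDemandMb + appDemandMb, maxResourceMb);
  }

  long getRawDemandMb() {
    return rawDemandMb;
  }

  long getCappedDemandMb() {
    return cappedDemandMb;
  }

  public static void main(String[] args) {
    LeafQueueDemandSketch queue = new LeafQueueDemandSketch(100L);
    queue.addAppDemand(60L);
    queue.addAppDemand(70L);
    // The capped value feeds scheduling; the raw value is only for reporting.
    System.out.println("capped=" + queue.getCappedDemandMb()
        + " raw=" + queue.getRawDemandMb());
  }
}
```

Keeping the two values separate leaves the starvation and fair-share calculations untouched, while a state dump could report the raw figure for tuning.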
[jira] [Commented] (YARN-7534) Fair scheduler assign resources may exceed maxResources
[ https://issues.apache.org/jira/browse/YARN-7534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266302#comment-16266302 ]

Wilfred Spiegelenburg commented on YARN-7534:
-

Based on the current analysis I do not think we have a problem.
[~daemon], if you have logs that show this is not working, please attach them; otherwise I will close this as not a problem.

> Fair scheduler assign resources may exceed maxResources
> ---
>
> Key: YARN-7534
> URL: https://issues.apache.org/jira/browse/YARN-7534
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler
> Reporter: YunFan Zhou
> Assignee: Wilfred Spiegelenburg
>
> The current scheduling logic checks whether the resources used by the queue
> have exceeded *maxResources* before assigning the container. As a result,
> after this container is assigned, the queue can use more resources than
> *maxResources*.
[jira] [Commented] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266300#comment-16266300 ]

Wilfred Spiegelenburg commented on YARN-7560:
-

Looks good to me, +1 (non-binding).

> Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
> --
>
> Key: YARN-7560
> URL: https://issues.apache.org/jira/browse/YARN-7560
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, resourcemanager
> Affects Versions: 3.0.0
> Reporter: zhengchenyu
> Assignee: zhengchenyu
> Fix For: 3.0.0
>
> Attachments: YARN-7560.000.patch, YARN-7560.001.patch
>
>
> In our cluster, we changed the configuration and then ran refreshQueues, and
> found that the ResourceManager hangs. The ResourceManager also can't restart
> successfully. The jstack output always looks like this:
> {code}
> "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable [0x7f98eed9a000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148)
>         - locked <0x7f8c4a8177a0> (a java.util.HashMap)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422)
>         - locked <0x7f8c4a7eb2e0> (a org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597)
>         at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         - locked <0x7f8c4a76ac48> (a java.lang.Object)
>         at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         - locked <0x7f8c49254268> (a java.lang.Object)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257)
>         at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>         - locked <0x7f8c467495e0> (a java.lang.Object)
>         at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220)
> {code}
> When we debugged the cluster, we found that
> resourceUsedWithWeightToResourceRatio returns a negative value, so the loop
> can't exit. In our cluster, the sum of all minRes is over int.max, so
> resourceUsedWithWeightToResourceRatio returns a negative value.
> Below is the loop. Because totalResource is a long, it is always positive,
> but resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so
> big that resourceUsedWithWeightToResourceRatio returns an overflowed value,
> which is negative, so the loop never breaks.
> {code}
> while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type)
>     < totalResource) {
>   rMax *= 2.0;
> }
> {code}
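The failure mode described in the report can be reproduced in isolation. The sketch below is an illustration only: the class and method names are hypothetical, the share values are made up, and accumulating in a `long` is shown as one possible remedy without any claim that it matches the attached patches.

```java
// Hypothetical illustration of the int overflow described in the report.
final class FairShareOverflowSketch {

  /** Sums in an int: the total wraps around once it passes Integer.MAX_VALUE. */
  static int sumAsInt(int[] minShares) {
    int total = 0;
    for (int share : minShares) {
      total += share; // overflows silently, no exception is thrown
    }
    return total;
  }

  /** Sums in a long: the total stays exact for these inputs. */
  static long sumAsLong(int[] minShares) {
    long total = 0L;
    for (int share : minShares) {
      total += share;
    }
    return total;
  }

  public static void main(String[] args) {
    int[] minShares = {1_500_000_000, 1_500_000_000}; // made-up values
    long totalResource = 2_000_000_000L;

    System.out.println("int sum:  " + sumAsInt(minShares));
    System.out.println("long sum: " + sumAsLong(minShares));

    // With the wrapped int sum the comparison stays true, so a loop of the
    // form `while (used < totalResource) rMax *= 2.0;` never exits.
    System.out.println("int sum < totalResource:  " + (sumAsInt(minShares) < totalResource));
    System.out.println("long sum < totalResource: " + (sumAsLong(minShares) < totalResource));
  }
}
```

With these inputs the int total wraps to a negative number, while the long total is 3000000000, which is no longer below `totalResource`.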
[jira] [Commented] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266297#comment-16266297 ] zhengchenyu commented on YARN-7560: --- [~wilfreds] Thank for your advice, I have revised my patch. The new patch is YARN-7560.001.patch > Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a > overflow value > -- > > Key: YARN-7560 > URL: https://issues.apache.org/jira/browse/YARN-7560 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: zhengchenyu >Assignee: zhengchenyu > Fix For: 3.0.0 > > Attachments: YARN-7560.000.patch, YARN-7560.001.patch > > > In our cluster, we changed the configuration, then refreshQueues, we found > the resourcemanager hangs. And the Resourcemanager can't restart > successfully. We got jstack information, always show like this: > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable > [0x7f98eed9a000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148) > - locked <0x7f8c4a8177a0> (a java.util.HashMap) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422) > - locked <0x7f8c4a7eb2e0> (a > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c4a76ac48> (a java.lang.Object) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c49254268> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c467495e0> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220) > {code} > When we debug the cluster, we found resourceUsedWithWeightToResourceRatio > return a negative value. So the loop can't return. 
We found that in our cluster > the sum of all minRes exceeds Integer.MAX_VALUE, so > resourceUsedWithWeightToResourceRatio returns a negative value. > Below is the loop. Because totalResource is a long, it is always positive here. But > resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big > that resourceUsedWithWeightToResourceRatio returns an overflowed value, which is > negative. So the loop never breaks. > {code} > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource) { > rMax *= 2.0; > } > {code} -- This message was s
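The overflow described in this report can be reproduced in isolation. The following is a minimal sketch, not the actual ComputeFairShares code: the class OverflowSketch, the methods sumAsInt and sumAsLong, and the sample minRes values are all made up for illustration. It shows that a sum accumulated in an int wraps to a negative number once it passes Integer.MAX_VALUE, so a `< totalResource` check against a positive long stays true, while the same sum accumulated in a long does not.

```java
class OverflowSketch {
    // Hypothetical stand-in for an int-returning sum of shares.
    // The accumulator is an int, so it silently wraps past Integer.MAX_VALUE.
    static int sumAsInt(int[] minShares) {
        int taken = 0;
        for (int share : minShares) {
            taken += share;
        }
        return taken;
    }

    // Same sum, but accumulated in a long, so it cannot wrap for these inputs.
    static long sumAsLong(int[] minShares) {
        long taken = 0L;
        for (int share : minShares) {
            taken += share;
        }
        return taken;
    }

    public static void main(String[] args) {
        // Assumed sample values: two queues whose minRes together exceed Integer.MAX_VALUE.
        int[] minShares = {1_500_000_000, 1_500_000_000};
        long totalResource = 2_500_000_000L;

        // The int sum is negative, so this condition can never become false
        // by growing the shares further; a loop guarded by it would not exit.
        System.out.println("int sum  = " + sumAsInt(minShares)
            + ", below total: " + (sumAsInt(minShares) < totalResource));

        // The long sum is the true total, so the condition is false as expected.
        System.out.println("long sum = " + sumAsLong(minShares)
            + ", below total: " + (sumAsLong(minShares) < totalResource));
    }
}
```

Widening the accumulator and the return type to long is one way to avoid the wrap; whether that matches the attached patches is not shown in this message.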
[jira] [Updated] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhengchenyu updated YARN-7560: -- Attachment: YARN-7560.001.patch > Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a > overflow value > -- > > Key: YARN-7560 > URL: https://issues.apache.org/jira/browse/YARN-7560 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: zhengchenyu >Assignee: zhengchenyu > Fix For: 3.0.0 > > Attachments: YARN-7560.000.patch, YARN-7560.001.patch > > > In our cluster, we changed the configuration and ran refreshQueues, and then found that > the resourcemanager hangs. The Resourcemanager also can't restart > successfully. The jstack output always looks like this: > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable > [0x7f98eed9a000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148) > - locked <0x7f8c4a8177a0> (a java.util.HashMap) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422) > - locked <0x7f8c4a7eb2e0> (a > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c4a76ac48> (a java.lang.Object) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c49254268> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c467495e0> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1220) > {code} > When we debug the cluster, we found resourceUsedWithWeightToResourceRatio > return a negative value. So the loop can't return. We found in our cluster, > the sum of all minRes is over int.max, so > resourceUsedWithWeightToResourceRatio return a negative value. 
> Below is the loop. Because totalResource is a long, it is always positive here. But > resourceUsedWithWeightToResourceRatio returns an int. Our cluster is so big > that resourceUsedWithWeightToResourceRatio returns an overflowed value, which is > negative. So the loop never breaks. > {code} > while (resourceUsedWithWeightToResourceRatio(rMax, schedulables, type) > < totalResource) { > rMax *= 2.0; > } > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscr
[jira] [Commented] (YARN-7560) Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a overflow value
[ https://issues.apache.org/jira/browse/YARN-7560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266273#comment-16266273 ] Wilfred Spiegelenburg commented on YARN-7560: - Thank you [~zhengchenyu] for the patch. Some comments on the patch: * Can you please remove the unneeded casts to long that are left in computeSharesInternal and handleFixedFairShares: {code} 127 totalMaxShare = Math.min(maxShare + (long)totalMaxShare, 128 Long.MAX_VALUE); ... 169 target.setResourceValue(type, (long)computeShare(sched, right, type)); {code} and {code} 224 totalResource = Math.min((long)totalResource + (long)fixedShare, 225 Long.MAX_VALUE); {code} * In resourceUsedWithWeightToResourceRatio we should not have to create a temporary variable share and could do: {code} resourcesTaken += computeShare(sched, w2rRatio, type); {code} * In {{computeShare}} we should move the cast from double to long to the point where we calculate the share, instead of leaving it until after we do the min and max checks, and remove the cast at the end of the call. That will speed up calculations slightly and won't change the outcome: {code} 192 long share = (long)(sched.getWeight() * w2rRatio); {code} > Resourcemanager hangs when resourceUsedWithWeightToResourceRatio return a > overflow value > -- > > Key: YARN-7560 > URL: https://issues.apache.org/jira/browse/YARN-7560 > Project: Hadoop YARN > Issue Type: Bug > Components: fairscheduler, resourcemanager >Affects Versions: 3.0.0 >Reporter: zhengchenyu >Assignee: zhengchenyu > Fix For: 3.0.0 > > Attachments: YARN-7560.000.patch > > > In our cluster, we changed the configuration and ran refreshQueues, and then found that > the resourcemanager hangs. The Resourcemanager also can't restart > successfully. 
We got jstack information, always show like this: > {code} > "main" #1 prio=5 os_prio=0 tid=0x7f98e8017000 nid=0x2f5 runnable > [0x7f98eed9a000] >java.lang.Thread.State: RUNNABLE > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.resourceUsedWithWeightToResourceRatio(ComputeFairShares.java:182) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSharesInternal(ComputeFairShares.java:140) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.ComputeFairShares.computeSteadyShares(ComputeFairShares.java:66) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.policies.FairSharePolicy.computeSteadyShares(FairSharePolicy.java:148) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSParentQueue.recomputeSteadyShares(FSParentQueue.java:102) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getQueue(QueueManager.java:148) > - locked <0x7f8c4a8177a0> (a java.util.HashMap) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.getLeafQueue(QueueManager.java:101) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueueManager.updateAllocationConfiguration(QueueManager.java:387) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler$AllocationReloadListener.onReload(FairScheduler.java:1728) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService.reloadAllocations(AllocationFileLoaderService.java:422) > - locked <0x7f8c4a7eb2e0> (a > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationFileLoaderService) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.initScheduler(FairScheduler.java:1597) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.serviceInit(FairScheduler.java:1621) > at > 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c4a76ac48> (a java.lang.Object) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:569) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:163) > - locked <0x7f8c49254268> (a java.lang.Object) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:997) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:257) > at > org.apache.hadoop.service.Abstra
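The third review point in the comment above, about where the double-to-long cast in computeShare happens, can be illustrated with a small stand-alone sketch. This is not the Hadoop method: the real signature is not shown in this thread, and the class ComputeShareSketch, the methods castLast and castFirst, and their parameters are hypothetical. For finite, moderately sized, non-negative inputs both orderings give the same result, because truncation is monotone and the min and max bounds are already whole numbers.

```java
class ComputeShareSketch {
    // Clamp in double arithmetic, then cast once at the end of the call.
    static long castLast(double weight, double w2rRatio, long minShare, long maxShare) {
        double share = weight * w2rRatio;
        share = Math.max(share, minShare);
        share = Math.min(share, maxShare);
        return (long) share;
    }

    // Cast at the point where the share is calculated, then clamp in long arithmetic.
    static long castFirst(double weight, double w2rRatio, long minShare, long maxShare) {
        long share = (long) (weight * w2rRatio);
        share = Math.max(share, minShare);
        share = Math.min(share, maxShare);
        return share;
    }

    public static void main(String[] args) {
        // Assumed sample inputs: one unclamped, one raised to the min, one cut to the max.
        double[][] cases = {{1.5, 1000.7}, {1.0, 10.9}, {2.0, 1_000_000.0}};
        long[][] bounds = {{0L, 1_000_000L}, {100L, 1_000L}, {0L, 500L}};
        for (int i = 0; i < cases.length; i++) {
            long last = castLast(cases[i][0], cases[i][1], bounds[i][0], bounds[i][1]);
            long first = castFirst(cases[i][0], cases[i][1], bounds[i][0], bounds[i][1]);
            System.out.println("castLast=" + last + " castFirst=" + first);
        }
    }
}
```

The sketch makes no claim about NaN inputs or bounds too large to be represented exactly as a double, where the two orderings can differ.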
[jira] [Commented] (YARN-7119) yarn rmadmin -updateNodeResource should be updated for resource types
[ https://issues.apache.org/jira/browse/YARN-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266261#comment-16266261 ] Daniel Templeton commented on YARN-7119: I'm on vacation, but I'll try to get to it tomorrow. :) > yarn rmadmin -updateNodeResource should be updated for resource types > - > > Key: YARN-7119 > URL: https://issues.apache.org/jira/browse/YARN-7119 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R > Attachments: YARN-7119.001.patch, YARN-7119.002.patch, > YARN-7119.002.patch, YARN-7119.003.patch, YARN-7119.004.patch, > YARN-7119.004.patch, YARN-7119.005.patch, YARN-7119.006.patch, > YARN-7119.007.patch, YARN-7119.008.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-7535) We should display origin value of demand in fair scheduler page
[ https://issues.apache.org/jira/browse/YARN-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266206#comment-16266206 ] Yufei Gu commented on YARN-7535: Max resource is a hard limit, which means a queue can't use more resources than its max resources. Hence, demand and usage shouldn't be greater than max resources. The existing code makes sense to me in that sense. Should we display the original demand rather than the normalized one? I am open to suggestions, and would like to hear more about why we need to show the original. cc [~wilfreds] > We should display origin value of demand in fair scheduler page > --- > > Key: YARN-7535 > URL: https://issues.apache.org/jira/browse/YARN-7535 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: YunFan Zhou >Assignee: YunFan Zhou > > The value of *demand* of a leaf queue that we now view on the fair scheduler > page shows only the value of *maxResources* when the demand value is greater > than *maxResources*. It doesn't reflect the real situation. Most of the time, > when we expand the queue, we rely on seeing the real current demand > value. > {code:java} > private void updateDemandForApp(FSAppAttempt sched, Resource maxRes) { > sched.updateDemand(); > Resource toAdd = sched.getDemand(); > if (LOG.isDebugEnabled()) { > LOG.debug("Counting resource from " + sched.getName() + " " + toAdd > + "; Total resource consumption for " + getName() + " now " > + demand); > } > demand = Resources.add(demand, toAdd); > demand = Resources.componentwiseMin(demand, maxRes); > } > {code}
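To make the question in this thread concrete, the sketch below keeps both numbers side by side: the raw sum of application demands and the value capped at the queue maximum. It is only an illustration under stated assumptions, not a proposed patch: Res is a hypothetical two-field stand-in for the YARN Resource class, and DemandSketch and accumulate are made-up names. For non-negative demands, capping once at the end gives the same capped value as capping after every addition, as the quoted updateDemandForApp does.

```java
class DemandSketch {
    // Hypothetical two-dimensional resource (memory in MB, vcores); not the YARN Resource class.
    static final class Res {
        final long memory;
        final int vcores;

        Res(long memory, int vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }

        Res add(Res other) {
            return new Res(memory + other.memory, vcores + other.vcores);
        }

        // Per-dimension minimum, analogous in spirit to a component-wise min.
        Res componentwiseMin(Res other) {
            return new Res(Math.min(memory, other.memory), Math.min(vcores, other.vcores));
        }

        @Override
        public String toString() {
            return "<memory:" + memory + ", vCores:" + vcores + ">";
        }
    }

    // Returns {rawDemand, cappedDemand} for a queue with the given per-app demands.
    static Res[] accumulate(Res[] appDemands, Res maxRes) {
        Res raw = new Res(0L, 0);
        for (Res demand : appDemands) {
            raw = raw.add(demand);
        }
        return new Res[] {raw, raw.componentwiseMin(maxRes)};
    }

    public static void main(String[] args) {
        // Assumed sample data: two apps whose combined memory demand exceeds the queue maximum.
        Res[] apps = {new Res(4096L, 2), new Res(8192L, 4)};
        Res[] result = accumulate(apps, new Res(6144L, 8));
        System.out.println("raw demand    = " + result[0]);
        System.out.println("capped demand = " + result[1]);
    }
}
```

Which of the two values the scheduler page should show is exactly the open question in the comments; the sketch only shows that both can be computed from the same inputs.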
[jira] [Commented] (YARN-7119) yarn rmadmin -updateNodeResource should be updated for resource types
[ https://issues.apache.org/jira/browse/YARN-7119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266097#comment-16266097 ] Manikandan R commented on YARN-7119: [~templedf] Can you please confirm changes based on your recent comments as jenkins report looks good? > yarn rmadmin -updateNodeResource should be updated for resource types > - > > Key: YARN-7119 > URL: https://issues.apache.org/jira/browse/YARN-7119 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager >Affects Versions: YARN-3926 >Reporter: Daniel Templeton >Assignee: Manikandan R > Attachments: YARN-7119.001.patch, YARN-7119.002.patch, > YARN-7119.002.patch, YARN-7119.003.patch, YARN-7119.004.patch, > YARN-7119.004.patch, YARN-7119.005.patch, YARN-7119.006.patch, > YARN-7119.007.patch, YARN-7119.008.patch
[jira] [Commented] (YARN-7535) We should display origin value of demand in fair scheduler page
[ https://issues.apache.org/jira/browse/YARN-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266004#comment-16266004 ] YunFan Zhou commented on YARN-7535: --- Hi, [~yufeigu] [~templedf] I'm sorry to bother you, but recently I was wondering why the demand of the queue cannot exceed *maxResources*. Is it a scheduling optimization need or a semantic consideration? If it is a scheduling optimization requirement, I think it is necessary to show the real value when the *demand* value of the queue is displayed on the web page. If it's a semantic consideration, can you point me to where the exact definition can be found? Thanks! YunFan Zhou > We should display origin value of demand in fair scheduler page > --- > > Key: YARN-7535 > URL: https://issues.apache.org/jira/browse/YARN-7535 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: YunFan Zhou > > The value of *demand* of a leaf queue that we now view on the fair scheduler > page shows only the value of *maxResources* when the demand value is greater > than *maxResources*. It doesn't reflect the real situation. Most of the time, > when we expand the queue, we rely on seeing the real current demand > value. > {code:java} > private void updateDemandForApp(FSAppAttempt sched, Resource maxRes) { > sched.updateDemand(); > Resource toAdd = sched.getDemand(); > if (LOG.isDebugEnabled()) { > LOG.debug("Counting resource from " + sched.getName() + " " + toAdd > + "; Total resource consumption for " + getName() + " now " > + demand); > } > demand = Resources.add(demand, toAdd); > demand = Resources.componentwiseMin(demand, maxRes); > } > {code}
[jira] [Assigned] (YARN-7535) We should display origin value of demand in fair scheduler page
[ https://issues.apache.org/jira/browse/YARN-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] YunFan Zhou reassigned YARN-7535: - Assignee: YunFan Zhou > We should display origin value of demand in fair scheduler page > --- > > Key: YARN-7535 > URL: https://issues.apache.org/jira/browse/YARN-7535 > Project: Hadoop YARN > Issue Type: Bug > Components: scheduler >Reporter: YunFan Zhou >Assignee: YunFan Zhou > > The value of *demand* of a leaf queue that we now view on the fair scheduler > page shows only the value of *maxResources* when the demand value is greater > than *maxResources*. It doesn't reflect the real situation. Most of the time, > when we expand the queue, we rely on seeing the real current demand > value. > {code:java} > private void updateDemandForApp(FSAppAttempt sched, Resource maxRes) { > sched.updateDemand(); > Resource toAdd = sched.getDemand(); > if (LOG.isDebugEnabled()) { > LOG.debug("Counting resource from " + sched.getName() + " " + toAdd > + "; Total resource consumption for " + getName() + " now " > + demand); > } > demand = Resources.add(demand, toAdd); > demand = Resources.componentwiseMin(demand, maxRes); > } > {code}
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-6507: --- Attachment: YARN-6507-trunk.010.patch > Add support in NodeManager to isolate FPGA devices with CGroups > --- > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Attachments: YARN-6507-branch-YARN-3926.001.patch, > YARN-6507-branch-YARN-3926.002.patch, YARN-6507-trunk.001.patch, > YARN-6507-trunk.002.patch, YARN-6507-trunk.003.patch, > YARN-6507-trunk.004.patch, YARN-6507-trunk.005.patch, > YARN-6507-trunk.006.patch, YARN-6507-trunk.007.patch, > YARN-6507-trunk.008.patch, YARN-6507-trunk.009.patch, > YARN-6507-trunk.010.patch > > > Support local FPGA resource scheduler to assign/isolate N FPGA slots to a > container. > At the beginning, support one vendor plugin with basic features to serve > OpenCL applications
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-6507: --- Attachment: (was: YARN-6507-trunk.0010.patch) > Add support in NodeManager to isolate FPGA devices with CGroups > --- > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Attachments: YARN-6507-branch-YARN-3926.001.patch, > YARN-6507-branch-YARN-3926.002.patch, YARN-6507-trunk.001.patch, > YARN-6507-trunk.002.patch, YARN-6507-trunk.003.patch, > YARN-6507-trunk.004.patch, YARN-6507-trunk.005.patch, > YARN-6507-trunk.006.patch, YARN-6507-trunk.007.patch, > YARN-6507-trunk.008.patch, YARN-6507-trunk.009.patch > > > Support local FPGA resource scheduler to assign/isolate N FPGA slots to a > container. > At the beginning, support one vendor plugin with basic features to serve > OpenCL applications
[jira] [Updated] (YARN-6507) Add support in NodeManager to isolate FPGA devices with CGroups
[ https://issues.apache.org/jira/browse/YARN-6507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhankun Tang updated YARN-6507: --- Attachment: YARN-6507-trunk.0010.patch Rebased on trunk (2bde3aedf139368fc71f053d8dd6580b498ff46d) > Add support in NodeManager to isolate FPGA devices with CGroups > --- > > Key: YARN-6507 > URL: https://issues.apache.org/jira/browse/YARN-6507 > Project: Hadoop YARN > Issue Type: Sub-task > Components: yarn >Reporter: Zhankun Tang >Assignee: Zhankun Tang > Attachments: YARN-6507-branch-YARN-3926.001.patch, > YARN-6507-branch-YARN-3926.002.patch, YARN-6507-trunk.001.patch, > YARN-6507-trunk.0010.patch, YARN-6507-trunk.002.patch, > YARN-6507-trunk.003.patch, YARN-6507-trunk.004.patch, > YARN-6507-trunk.005.patch, YARN-6507-trunk.006.patch, > YARN-6507-trunk.007.patch, YARN-6507-trunk.008.patch, > YARN-6507-trunk.009.patch > > > Support local FPGA resource scheduler to assign/isolate N FPGA slots to a > container. > At the beginning, support one vendor plugin with basic features to serve > OpenCL applications