[jira] [Updated] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.
     [ https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-4133:
    Attachment: YARN-4133.000.patch

> Containers to be preempted leaks in FairScheduler preemption logic.
> ---
>
>                 Key: YARN-4133
>                 URL: https://issues.apache.org/jira/browse/YARN-4133
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 2.7.1
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-4133.000.patch
>
>
> Containers to be preempted leaks in FairScheduler preemption logic. It may
> cause missing preemption due to containers in {{warnedContainers}} wrongly
> removed. The problem is in {{preemptResources}}:
> There are two issues which can cause containers wrongly removed from
> {{warnedContainers}}:
> Firstly missing the container state {{RMContainerState.ACQUIRED}} in the
> condition check:
> {code}
> (container.getState() == RMContainerState.RUNNING ||
>     container.getState() == RMContainerState.ALLOCATED)
> {code}
> Secondly if {{isResourceGreaterThanNone(toPreempt)}} return false, we
> shouldn't remove container from {{warnedContainers}}. We should only remove
> container from {{warnedContainers}}, if container is not in state
> {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and
> {{RMContainerState.ACQUIRED}}.
> {code}
> if ((container.getState() == RMContainerState.RUNNING ||
>     container.getState() == RMContainerState.ALLOCATED) &&
>     isResourceGreaterThanNone(toPreempt)) {
>   warnOrKillContainer(container);
>   Resources.subtractFrom(toPreempt,
>       container.getContainer().getResource());
> } else {
>   warnedIter.remove();
> }
> {code}
> Also once the containers in {{warnedContainers}} are wrongly removed, it will
> never be preempted. Because these containers are already in
> {{FSAppAttempt#preemptionMap}} and {{FSAppAttempt#preemptContainer}} won't
> return the containers in {{FSAppAttempt#preemptionMap}}.
> {code}
> public RMContainer preemptContainer() {
>   if (LOG.isDebugEnabled()) {
>     LOG.debug("App " + getName() + " is going to preempt a running " +
>         "container");
>   }
>   RMContainer toBePreempted = null;
>   for (RMContainer container : getLiveContainers()) {
>     if (!getPreemptionContainers().contains(container) &&
>         (toBePreempted == null ||
>         comparator.compare(toBePreempted, container) > 0)) {
>       toBePreempted = container;
>     }
>   }
>   return toBePreempted;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
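
The two issues described above can be illustrated with a small standalone sketch. This is not the attached patch and does not use the real YARN classes: `PreemptionSketch`, its `State` enum, its `Container` class, and `preemptWarned` are hypothetical stand-ins, and the amount left to preempt is modelled as a plain memory count in MB rather than a `Resource`. Under those assumptions, the loop treats `ACQUIRED` as a live state and removes a container from the warned list only when it is no longer live, even after the preemption target has been met.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of the warned-container loop.
// None of these types are the real YARN classes.
class PreemptionSketch {
  enum State { ALLOCATED, ACQUIRED, RUNNING, COMPLETED }

  static final class Container {
    final String id;
    final State state;
    final int memoryMb;

    Container(String id, State state, int memoryMb) {
      this.id = id;
      this.state = state;
      this.memoryMb = memoryMb;
    }
  }

  // A container is still preemptable in any of these three states,
  // including ACQUIRED, which the original condition left out.
  static boolean isLive(Container c) {
    return c.state == State.RUNNING
        || c.state == State.ALLOCATED
        || c.state == State.ACQUIRED;
  }

  // Walks the warned list and returns the ids of the containers acted on.
  // toPreemptMb stands in for the toPreempt resource in the issue text.
  static List<String> preemptWarned(List<Container> warned, int toPreemptMb) {
    List<String> acted = new ArrayList<>();
    Iterator<Container> it = warned.iterator();
    while (it.hasNext()) {
      Container c = it.next();
      if (isLive(c)) {
        if (toPreemptMb > 0) {
          acted.add(c.id);            // stands in for warnOrKillContainer
          toPreemptMb -= c.memoryMb;  // stands in for Resources.subtractFrom
        }
        // Target already met: keep the live container in the warned list.
      } else {
        it.remove();                  // only finished containers are dropped
      }
    }
    return acted;
  }

  public static void main(String[] args) {
    List<Container> warned = new ArrayList<>();
    warned.add(new Container("c1", State.RUNNING, 1024));
    warned.add(new Container("c2", State.ACQUIRED, 1024));
    warned.add(new Container("c3", State.COMPLETED, 1024));
    List<String> acted = preemptWarned(warned, 1024);
    System.out.println("acted on: " + acted + ", still warned: " + warned.size());
  }
}
```

In this run only `c1` is acted on, `c2` stays in the warned list although the target is already met, and `c3` is removed because it has completed. With the condition quoted in the issue, `c2` would have been removed as well.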
[jira] [Updated] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.
     [ https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-4133:
    Attachment:     (was: YARN-4133.000.patch)
[jira] [Updated] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.
     [ https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-4133:
    Description:
Containers to be preempted leaks in FairScheduler preemption logic. It may cause missing preemption due to containers in {{warnedContainers}} wrongly removed. The problem is in {{preemptResources}}:
There are two issues which can cause containers wrongly removed from {{warnedContainers}}:
Firstly missing the container state {{RMContainerState.ACQUIRED}} in the condition check:
{code}
(container.getState() == RMContainerState.RUNNING ||
    container.getState() == RMContainerState.ALLOCATED)
{code}
Secondly if {{isResourceGreaterThanNone(toPreempt)}} return false, we shouldn't remove container from {{warnedContainers}}. We should only remove container from {{warnedContainers}}, if container is not in state {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and {{RMContainerState.ACQUIRED}}.
{code}
if ((container.getState() == RMContainerState.RUNNING ||
    container.getState() == RMContainerState.ALLOCATED) &&
    isResourceGreaterThanNone(toPreempt)) {
  warnOrKillContainer(container);
  Resources.subtractFrom(toPreempt, container.getContainer().getResource());
} else {
  warnedIter.remove();
}
{code}
Also once the containers in {{warnedContainers}} are wrongly removed, it will never be preempted. Because these containers are already in {{FSAppAttempt#preemptionMap}} and {{FSAppAttempt#preemptContainer}} won't return the containers in {{FSAppAttempt#preemptionMap}}.
{code}
public RMContainer preemptContainer() {
  if (LOG.isDebugEnabled()) {
    LOG.debug("App " + getName() + " is going to preempt a running " +
        "container");
  }
  RMContainer toBePreempted = null;
  for (RMContainer container : getLiveContainers()) {
    if (!getPreemptionContainers().contains(container) &&
        (toBePreempted == null ||
        comparator.compare(toBePreempted, container) > 0)) {
      toBePreempted = container;
    }
  }
  return toBePreempted;
}
{code}

  was:
Containers to be preempted leaks in FairScheduler preemption logic. It may cause missing preemption due to containers in {{warnedContainers}} wrongly removed. The problem is in {{preemptResources}}:
There are two issues which can cause containers wrongly removed from {{warnedContainers}}:
Firstly missing the container state {{RMContainerState.ACQUIRED}} in the condition check:
{code}
(container.getState() == RMContainerState.RUNNING ||
    container.getState() == RMContainerState.ALLOCATED)
{code}
Secondly if {{isResourceGreaterThanNone(toPreempt)}} return false, we shouldn't remove container from {{warnedContainers}}, We should only remove container from {{warnedContainers}}, if container is not in state {{RMContainerState.RUNNING}}, {{RMContainerState.ALLOCATED}} and {{RMContainerState.ACQUIRED}}.
{code}
if ((container.getState() == RMContainerState.RUNNING ||
    container.getState() == RMContainerState.ALLOCATED) &&
    isResourceGreaterThanNone(toPreempt)) {
  warnOrKillContainer(container);
  Resources.subtractFrom(toPreempt, container.getContainer().getResource());
} else {
  warnedIter.remove();
}
{code}
[jira] [Updated] (YARN-4133) Containers to be preempted leaks in FairScheduler preemption logic.
     [ https://issues.apache.org/jira/browse/YARN-4133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhihai xu updated YARN-4133:
    Attachment: YARN-4133.000.patch