Tao Yang created YARN-7591:
------------------------------

             Summary: NPE in async-scheduling mode of CapacityScheduler
                 Key: YARN-7591
                 URL: https://issues.apache.org/jira/browse/YARN-7591
             Project: Hadoop YARN
          Issue Type: Bug
          Components: capacityscheduler
    Affects Versions: 3.0.0-alpha4, 2.9.1
            Reporter: Tao Yang
            Assignee: Tao Yang


Currently, in the async-scheduling mode of CapacityScheduler, an NPE may be
raised in the following scenarios.
(1) A user is removed after its last application finishes, so an NPE may be
raised if async-scheduling threads read from the user object without a null
check.
(2) An NPE may be raised when trying to fulfill a reservation for a finished
application in {{CapacityScheduler#allocateContainerOnSingleNode}}.
{code}
    RMContainer reservedContainer = node.getReservedContainer();
    if (reservedContainer != null) {
      FiCaSchedulerApp reservedApplication = getCurrentAttemptForContainer(
          reservedContainer.getContainerId());

      // NPE here: reservedApplication could be null after this application
      // finished
      // Try to fulfill the reservation
      LOG.info(
          "Trying to fulfill reservation for application " + reservedApplication
              .getApplicationId() + " on node: " + node.getNodeID());
{code}
(3) If proposal1 (allocate containerX on node1) and proposal2 (reserve
containerY on node1) are generated by different async-scheduling threads at
around the same time and proposal2 is submitted before proposal1, an NPE is
raised when trying to submit proposal2 in
{{FiCaSchedulerApp#commonCheckContainerAllocation}}.
{code}
    if (reservedContainerOnNode != null) {
      // NPE here: allocation.getAllocateFromReservedContainer() should be null
      // for proposal2 in this case
      RMContainer fromReservedContainer =
          allocation.getAllocateFromReservedContainer().getRmContainer();

      if (fromReservedContainer != reservedContainerOnNode) {
        if (LOG.isDebugEnabled()) {
          LOG.debug(
              "Try to allocate from a non-existed reserved container");
        }
        return false;
      }
    }
{code}
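
The three scenarios share one pattern: a reference that can become null concurrently is dereferenced without a check. The following is a minimal, self-contained sketch of the guards that could avoid this. It is not the actual patch and does not use the Hadoop classes; {{NullGuardSketch}}, {{App}}, {{liveApps}}, {{userLimits}} and all method names are hypothetical stand-ins chosen for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins only; none of these names exist in Hadoop YARN.
class NullGuardSketch {
    // Stand-in for an application attempt.
    static final class App {
        final String id;
        App(String id) { this.id = id; }
    }

    // Stand-in registries; entries disappear when an app or user is removed.
    static final Map<String, App> liveApps = new ConcurrentHashMap<>();
    static final Map<String, Integer> userLimits = new ConcurrentHashMap<>();

    // Scenario (1): read the user entry once and fall back if it was removed.
    static int userLimitOrDefault(String user, int fallback) {
        Integer limit = userLimits.get(user);
        return limit == null ? fallback : limit;
    }

    // Scenario (2): check the looked-up application before dereferencing it.
    static String describeReservation(String containerId, String nodeId) {
        App reservedApp = liveApps.get(containerId);
        if (reservedApp == null) {
            return "Skipping reservation for " + containerId
                + ": application already finished";
        }
        return "Trying to fulfill reservation for application "
            + reservedApp.id + " on node: " + nodeId;
    }

    // Scenario (3): reject a proposal that carries no from-reserved container
    // while the node already holds a reservation, instead of dereferencing it.
    static boolean acceptProposal(Object reservedOnNode, Object fromReserved) {
        if (reservedOnNode != null) {
            if (fromReserved == null || fromReserved != reservedOnNode) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(describeReservation("container_1", "node1"));
        liveApps.put("container_1", new App("application_1"));
        System.out.println(describeReservation("container_1", "node1"));
        System.out.println(acceptProposal(new Object(), null));
    }
}
```

In each guard the value is read once into a local variable and then checked, so that a concurrent removal between the check and the use cannot reintroduce the NPE.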


