[ 
https://issues.apache.org/jira/browse/YARN-6172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15881712#comment-15881712
 ] 

Miklos Szegedi edited comment on YARN-6172 at 2/24/17 1:23 AM:
---------------------------------------------------------------

Thank you, [~varun_saxena], for reporting this.
I was able to reproduce the scenario above. There are two issues here.
First, every time the update thread runs, it resets the queue demand to zero 
and then adds each application's demand to it one by one, without any locking 
that would keep readers from seeing the intermediate values. The test samples 
this value and compares it with the expected value, so if the update has not 
finished yet, the sampled value can be 0 or anything less than the actual 
demand.
Second, and unrelated to the first, the test calls {{Thread.yield()}} instead 
of properly waiting for the expected application count to propagate. I will 
send out a patch soon.

{code}
  @Override
  public void updateDemand() {
    // Compute demand by iterating through apps in the queue
    // Limit demand to maxResources
    demand = Resources.createResource(0);
    readLock.lock();
    try {
      for (FSAppAttempt sched : runnableApps) {
        updateDemandForApp(sched);
      }
      for (FSAppAttempt sched : nonRunnableApps) {
        updateDemandForApp(sched);
      }
    } finally {
      readLock.unlock();
    }
    // Cap demand to maxShare to limit allocation to maxShare
    demand = Resources.componentwiseMin(demand, maxShare);
    if (LOG.isDebugEnabled()) {
      LOG.debug("The updated demand for " + getName() + " is " + demand
          + "; the max is " + maxShare);
      LOG.debug("The updated fairshare for " + getName() + " is "
          + getFairShare());
    }
  }
  
  private void updateDemandForApp(FSAppAttempt sched) {
    sched.updateDemand();
    Resource toAdd = sched.getDemand();
    if (LOG.isDebugEnabled()) {
      LOG.debug("Counting resource from " + sched.getName() + " " + toAdd
          + "; Total resource demand for " + getName() + " now "
          + demand);
    }
    demand = Resources.add(demand, toAdd);
  }
{code}
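
A minimal sketch of one way to avoid the partial-demand problem described above: accumulate the total in a local variable and publish it with a single write once the loop has finished. The class {{DemandSnapshotSketch}} and its use of plain {{long}} values are hypothetical stand-ins for illustration, not the real {{FSLeafQueue}} or {{Resource}} types, and this is not necessarily what the patch does.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical stand-in for a leaf queue; not the real FSLeafQueue. */
class DemandSnapshotSketch {
  // Hypothetical per-application demands, e.g. in MB of memory.
  private final List<Long> appDemands = new CopyOnWriteArrayList<>();
  private final long maxShare;
  // volatile: a reader sees either the previous total or the new one,
  // never a half-accumulated value.
  private volatile long demand;

  DemandSnapshotSketch(long maxShare) {
    this.maxShare = maxShare;
  }

  void addApp(long appDemand) {
    appDemands.add(appDemand);
  }

  void updateDemand() {
    long total = 0; // accumulate in a local, not in the shared field
    for (long d : appDemands) {
      total += d;
    }
    // Cap to maxShare and publish the finished value in one write.
    demand = Math.min(total, maxShare);
  }

  long getDemand() {
    return demand;
  }

  public static void main(String[] args) {
    DemandSnapshotSketch queue = new DemandSnapshotSketch(8192);
    queue.addApp(1024);
    queue.addApp(2048);
    queue.updateDemand();
    System.out.println(queue.getDemand()); // prints 3072
  }
}
```

With this shape, a thread that samples {{getDemand()}} while an update is running still sees the previously published total rather than 0 or a partial sum.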

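For the second issue, a test can poll for the expected state with a timeout instead of calling {{Thread.yield()}}. The helper below is a hypothetical, self-contained sketch (the name {{WaitForSketch.waitFor}} and its parameters are assumptions made for illustration), not the code used in the test or in the patch.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper for tests; names are illustrative only. */
class WaitForSketch {
  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true, false on timeout
   * or if the waiting thread was interrupted.
   */
  static boolean waitFor(BooleanSupplier condition,
      long checkEveryMillis, long timeoutMillis) {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false; // timed out before the condition held
      }
      try {
        Thread.sleep(checkEveryMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve interrupt status
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    int[] polls = {0};
    // The condition becomes true on the third check.
    boolean ok = waitFor(() -> ++polls[0] >= 3, 10, 1000);
    System.out.println(ok + " after " + polls[0] + " polls");
  }
}
```

A test would then assert on the return value, so a value that never propagates shows up as a clear timeout failure rather than a flaky comparison.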

> TestFSAppStarvation fails on trunk
> ----------------------------------
>
>                 Key: YARN-6172
>                 URL: https://issues.apache.org/jira/browse/YARN-6172
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Varun Saxena
>         Attachments: YARN-6172.000.patch
>
>
> Refer to test report 
> https://builds.apache.org/job/PreCommit-YARN-Build/14882/testReport/
> {noformat}
> java.lang.AssertionError: null
>       at org.junit.Assert.fail(Assert.java:86)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.junit.Assert.assertTrue(Assert.java:52)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.verifyLeafQueueStarvation(TestFSAppStarvation.java:133)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFSAppStarvation.testPreemptionEnabled(TestFSAppStarvation.java:106)
> {noformat}


