[ https://issues.apache.org/jira/browse/YARN-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Yang updated YARN-6737:
---------------------------
    Attachment: YARN-6737.001.patch

Uploaded the v1 patch for trunk. 
Sorry for the late update. I have scanned all usages of 
AbstractYarnScheduler#getApplicationAttempt and 
CapacityScheduler#getApplicationAttempt and found one potential problem in 
QueuePriorityContainerCandidateSelector#preChecksForMovingReservedContainerToNode:
{code}
    FiCaSchedulerApp app =
        preemptionContext.getScheduler().getCurrentApplicationAttempt(
            reservedContainer.getApplicationAttemptId());
    if (!app.getAppSchedulingInfo().canDelayTo(
        reservedContainer.getAllocatedSchedulerKey(), ResourceRequest.ANY)) {
      // This is a hard locality request
      return false;
    }
{code}
An NPE can occur here if the app no longer exists. I think we can fix it 
by adding a null check for app, like this (the outer caller will skip this 
invalid reservedContainer):
{code}
    FiCaSchedulerApp app =
        preemptionContext.getScheduler().getCurrentApplicationAttempt(
            reservedContainer.getApplicationAttemptId());
    if (app == null || !app.getAppSchedulingInfo().canDelayTo(
        reservedContainer.getAllocatedSchedulerKey(), ResourceRequest.ANY)) {
      // The app no longer exists, or this is a hard locality request
      return false;
    }
{code}
[~sunilg] Please help to review this patch. Thanks!
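
To illustrate the null check proposed above, here is a minimal, self-contained sketch. The App and Scheduler classes are hypothetical stand-ins, not the real FiCaSchedulerApp or CapacityScheduler, and attempt ids are simplified to plain strings. It only shows that a lookup for an attempt that has already been removed returns null, and that the guard makes the pre-check return false instead of throwing an NPE.

```java
import java.util.HashMap;
import java.util.Map;

public class NullGuardSketch {
    // Hypothetical stand-in for FiCaSchedulerApp; only models the locality answer.
    static final class App {
        private final boolean canDelay;
        App(boolean canDelay) { this.canDelay = canDelay; }
        boolean canDelayTo() { return canDelay; }
    }

    // Hypothetical stand-in for the scheduler's attempt lookup.
    static final class Scheduler {
        private final Map<String, App> apps = new HashMap<>();
        void addApp(String attemptId, App app) { apps.put(attemptId, app); }
        void removeApp(String attemptId) { apps.remove(attemptId); }
        // Returns null once the application has been removed.
        App getCurrentApplicationAttempt(String attemptId) { return apps.get(attemptId); }
    }

    // Mirrors the shape of the patched check: a missing app is treated
    // like a failed pre-check, so the caller skips the reserved container.
    static boolean preCheck(Scheduler scheduler, String attemptId) {
        App app = scheduler.getCurrentApplicationAttempt(attemptId);
        if (app == null || !app.canDelayTo()) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Scheduler scheduler = new Scheduler();
        scheduler.addApp("attempt_1", new App(true));
        System.out.println(preCheck(scheduler, "attempt_1"));
        scheduler.removeApp("attempt_1");
        // Without the null check this call would throw a NullPointerException.
        System.out.println(preCheck(scheduler, "attempt_1"));
    }
}
```

Because `||` short-circuits, the locality check is never evaluated when app is null, which is what makes the one-line change sufficient here.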

> Rename getApplicationAttempt to getCurrentAttempt in 
> AbstractYarnScheduler/CapacityScheduler
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-6737
>                 URL: https://issues.apache.org/jira/browse/YARN-6737
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.9.0, 3.0.0-alpha3
>            Reporter: Tao Yang
>            Priority: Minor
>         Attachments: YARN-6737.001.patch
>
>
> As discussed in YARN-6714 
> (https://issues.apache.org/jira/browse/YARN-6714?focusedCommentId=16052158&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16052158)
> AbstractYarnScheduler#getApplicationAttempt is inconsistent with its name: it 
> discards the application_attempt_id and always returns the latest attempt. We 
> should: 1) rename it to getCurrentAttempt; 2) change its parameter from attemptId 
> to applicationId; 3) scan all usages to see whether any similar issue 
> could occur.
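
The naming inconsistency described in the issue can be sketched as follows. This is a simplified, hypothetical model and not the real AbstractYarnScheduler: ids are plain strings and ints, and the class and field names are made up for illustration. It shows a lookup that accepts an attempt number but ignores it, next to the proposed shape that takes only the application id.

```java
import java.util.HashMap;
import java.util.Map;

public class CurrentAttemptSketch {
    // Hypothetical model: the scheduler remembers only the latest attempt per application.
    private final Map<String, Integer> currentAttemptByApp = new HashMap<>();

    void startAttempt(String applicationId, int attemptNumber) {
        currentAttemptByApp.put(applicationId, attemptNumber);
    }

    // Old shape: the caller passes an attempt number, but it is discarded
    // and the latest attempt is returned regardless.
    Integer getApplicationAttempt(String applicationId, int attemptNumber) {
        return currentAttemptByApp.get(applicationId);
    }

    // Proposed shape: the name and parameter match the actual behavior.
    Integer getCurrentAttempt(String applicationId) {
        return currentAttemptByApp.get(applicationId);
    }

    public static void main(String[] args) {
        CurrentAttemptSketch scheduler = new CurrentAttemptSketch();
        scheduler.startAttempt("app_1", 1);
        scheduler.startAttempt("app_1", 2);
        // Asking for attempt 1 still yields the latest attempt.
        System.out.println(scheduler.getApplicationAttempt("app_1", 1));
        System.out.println(scheduler.getCurrentAttempt("app_1"));
    }
}
```

With the renamed method a caller can no longer assume it is getting the specific attempt it asked for, and an unknown application still yields null, so callers need a null check either way.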


