[ 
https://issues.apache.org/jira/browse/YARN-7458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-7458:
--------------------------------
    Attachment: YARN-7458.001.patch

There are a few problems here:
# This will never fail.  {{containsKey}} returns a primitive {{boolean}}, which 
is autoboxed to a non-null {{Boolean}}, so {{assertNotNull}} always passes.  It 
probably should have been {{assertTrue}}.
{code:java}
Assert.assertNotNull(nmContext.getContainers().containsKey(containerId));
{code}
# Ignoring the above issue, this code has a race condition:
{code:java}
Assert.assertNotNull(nmContext.getContainers().containsKey(containerId));
// Get the container first, as it may be removed from the Context
// by asynchronous calls.
// This was leading to a flakey test as otherwise the container could
// be removed and end up null.
Container waitContainer = nmContext.getContainers().get(containerId);
{code}
The container could be removed between the two statements, resulting in 
{{waitContainer}} being {{null}}.  This is what is causing the NPE we're seeing.
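
To illustrate the first point, here is a minimal sketch using only JDK types. 
The class name, the {{passesNotNullCheck}} helper, and the map contents are 
hypothetical stand-ins, not code from the test; the helper just mirrors the 
kind of null check that {{assertNotNull}} performs on its {{Object}} argument.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: a plain JDK map stands in for the NM context's container map.
class VacuousAssertionDemo {

  // Stand-in for the null check inside an assertNotNull(Object) call. A primitive
  // boolean passed here is autoboxed to a Boolean before the comparison.
  static boolean passesNotNullCheck(Object value) {
    return value != null;
  }

  public static void main(String[] args) {
    Map<String, String> containers = new ConcurrentHashMap<>();

    // The key is absent, so containsKey returns false ...
    boolean present = containers.containsKey("container_1");
    System.out.println("containsKey:    " + present);

    // ... but the boxed Boolean.FALSE is still a non-null object,
    // so a not-null check on it can never fail.
    System.out.println("not-null check: " + passesNotNullCheck(present));
  }
}
```

Under these assumptions the not-null check passes whether or not the key is 
present, which is why the assertion provides no protection.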

This method is simply trying to wait for the container to finish.  If the 
container has already finished and been removed, there is nothing to wait for.  
I think the fix is to do a single {{get}}: if it returns a container, run the 
wait loop; if it returns {{null}}, skip the wait.
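
A rough sketch of that approach follows. It is not the attached patch: the 
{{Container}} and {{State}} types, the method signature, and the retry 
parameters are all hypothetical stand-ins so the example compiles on its own.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-ins for the NM context's container map and container type.
class WaitForContainerSketch {

  enum State { RUNNING, COMPLETE }

  static final class Container {
    volatile State state = State.RUNNING;
  }

  // Returns true if the container was found (and polled), or false if it was
  // already gone from the map, in which case there is nothing to wait for.
  static boolean waitForContainerToFinish(
      ConcurrentMap<String, Container> containers, String containerId,
      int maxAttempts, long sleepMillis) {
    // Single lookup: there is no window between a containsKey check and a
    // later get in which an asynchronous removal could slip in.
    Container waitContainer = containers.get(containerId);
    if (waitContainer == null) {
      return false;
    }
    // Poll the reference we already hold; removal from the map no longer matters.
    for (int i = 0; i < maxAttempts && waitContainer.state != State.COMPLETE; i++) {
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
        break;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    ConcurrentMap<String, Container> containers = new ConcurrentHashMap<>();
    // Container already removed: returns immediately without an NPE.
    System.out.println(waitForContainerToFinish(containers, "container_1", 5, 10));
  }
}
```

The key design choice is that the local reference is the only thing the loop 
touches, so a concurrent removal from the map cannot produce a {{null}} 
dereference.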

> TestContainerManagerSecurity is still flakey
> --------------------------------------------
>
>                 Key: YARN-7458
>                 URL: https://issues.apache.org/jira/browse/YARN-7458
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.9.0, 3.0.0-beta1
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-7458.001.patch
>
>
> YARN-6150 made this less flakey, but we're still seeing an occasional issue 
> here:
> {noformat}
> java.lang.NullPointerException
>       at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.waitForContainerToFinishOnNM(TestContainerManagerSecurity.java:420)
>       at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testNMTokens(TestContainerManagerSecurity.java:356)
>       at 
> org.apache.hadoop.yarn.server.TestContainerManagerSecurity.testContainerManager(TestContainerManagerSecurity.java:167)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
