[ https://issues.apache.org/jira/browse/TEZ-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553445#comment-15553445 ]
Jonathan Eagles commented on TEZ-3368:
--------------------------------------

Findbugs warning is pre-existing. [~jlowe], is this patch still ready to go in?

{code:xml}
<BugInstance rank="8" category="MT_CORRECTNESS" instanceHash="1610d6e467b95e03ffdc2eb056397eb4" instanceOccurrenceNum="0" priority="2" abbrev="JLM" type="JLM_JSR166_UTILCONCURRENT_MONITORENTER" instanceOccurrenceMax="0">
  <ShortMessage>Synchronization performed on util.concurrent instance</ShortMessage>
  <LongMessage>Synchronization performed on java.util.concurrent.atomic.AtomicBoolean in org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.mainLoop()</LongMessage>
  <Class classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" primary="true">
    <SourceLine start="1933" classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java" sourcefile="YarnTaskSchedulerService.java" end="2155">
      <Message>At YarnTaskSchedulerService.java:[lines 1933-2155]</Message>
    </SourceLine>
    <Message>In class org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager</Message>
  </Class>
  <Method isStatic="false" classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" name="mainLoop" primary="true" signature="()V">
    <SourceLine endBytecode="995" startBytecode="0" start="1970" classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java" sourcefile="YarnTaskSchedulerService.java" end="2057"/>
    <Message>In method org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.mainLoop()</Message>
  </Method>
  <Type descriptor="Ljava/util/concurrent/atomic/AtomicBoolean;">
    <SourceLine start="53" classname="java.util.concurrent.atomic.AtomicBoolean" sourcepath="java/util/concurrent/atomic/AtomicBoolean.java" sourcefile="AtomicBoolean.java" end="161">
      <Message>At AtomicBoolean.java:[lines 53-161]</Message>
    </SourceLine>
    <Message>Type java.util.concurrent.atomic.AtomicBoolean</Message>
  </Type>
  <Field isStatic="false" classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" name="drainedDelayedContainersForTest" primary="true" role="FIELD_VALUE_OF" signature="Ljava/util/concurrent/atomic/AtomicBoolean;">
    <SourceLine classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java" sourcefile="YarnTaskSchedulerService.java">
      <Message>In YarnTaskSchedulerService.java</Message>
    </SourceLine>
    <Message>Value loaded from field org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.drainedDelayedContainersForTest</Message>
  </Field>
  <SourceLine endBytecode="50" startBytecode="50" start="1985" classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager" primary="true" sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java" sourcefile="YarnTaskSchedulerService.java" end="1985">
    <Message>At YarnTaskSchedulerService.java:[line 1985]</Message>
  </SourceLine>
</BugInstance>
<BugCategory category="MT_CORRECTNESS">
  <Description>Multithreaded correctness</Description>
</BugCategory>
<BugPattern category="MT_CORRECTNESS" abbrev="JLM" type="JLM_JSR166_UTILCONCURRENT_MONITORENTER">
  <ShortDescription>Synchronization performed on util.concurrent instance</ShortDescription>
  <Details>
    <p> This method performs synchronization an object that is an instance of a class from the java.util.concurrent package (or its subclasses). Instances of these classes have their own concurrency control mechanisms that are orthogonal to the synchronization provided by the Java keyword <code>synchronized</code>. For example, synchronizing on an <code>AtomicBoolean</code> will not prevent other threads from modifying the <code>AtomicBoolean</code>.</p>
    <p>Such code may be correct, but should be carefully reviewed and documented, and may confuse people who have to maintain the code at a later date. </p>
  </Details>
</BugPattern>
<BugCode abbrev="JLM">
  <Description>Synchronization on java.util.concurrent objects</Description>
</BugCode>
{code}

> NPE in DelayedContainerManager
> ------------------------------
>
> Key: TEZ-3368
> URL: https://issues.apache.org/jira/browse/TEZ-3368
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: TEZ-3368.001.patch
>
>
> Saw a Tez AM hang due to an NPE in the DelayedContainerManager:
> {noformat}
> 2016-07-17 01:53:23,157 [ERROR] [DelayedContainerManager] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
> java.lang.NullPointerException
>     at org.apache.tez.dag.app.rm.TezAMRMClientAsync.getMatchingRequestsForTopPriority(TezAMRMClientAsync.java:142)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getMatchingRequestWithoutPriority(YarnTaskSchedulerService.java:1474)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$500(YarnTaskSchedulerService.java:84)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$NodeLocalContainerAssigner.assignReUsedContainer(YarnTaskSchedulerService.java:1869)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignReUsedContainerWithLocation(YarnTaskSchedulerService.java:1753)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:733)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:84)
>     at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:2030)
> {noformat}
> After the DelayedContainerManager thread exited the AM proceeded to receive
> requested containers that would go unused until the container allocations
> expired. Then they would be re-requested, and the cycle repeated
> indefinitely.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)