[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892808#comment-15892808
 ] 

Hudson commented on MAPREDUCE-6852:
-----------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11331 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11331/])
MAPREDUCE-6852. Job#updateStatus() failed with NPE due to race (jianhe: rev 
747bafaf969857b66233a8b4660590bdd712ed7d)
* (edit) hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/Job.java


> Job#updateStatus() failed with NPE due to race condition
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-6852
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6852
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Junping Du
>            Assignee: Junping Du
>             Fix For: 2.9.0
>
>         Attachments: MAPREDUCE-6852.patch, MAPREDUCE-6852-v2.patch
>
>
> Like MAPREDUCE-6762, we found this issue in a cluster where a Pig query 
> occasionally failed with an NPE: "Pig uses the JobControl API to track MR job 
> status, but sometimes the Job History Server failed to flush job meta files to 
> HDFS, which caused the status update to fail." Besides the NPE in 
> o.a.h.mapreduce.Job.getJobName, we also get an NPE in Job.updateStatus(); the 
> exception is as follows:
> {noformat}
> Caused by: java.lang.NullPointerException
>       at org.apache.hadoop.mapreduce.Job$1.run(Job.java:323)
>       at org.apache.hadoop.mapreduce.Job$1.run(Job.java:320)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:422)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1833)
>       at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:320)
>       at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:604)
> {noformat}
> We found that the state here is null. However, we already check that the job 
> state is RUNNING, as in the code below:
> {noformat}
>   public boolean isComplete() throws IOException {
>     ensureState(JobState.RUNNING);
>     updateStatus();
>     return status.isJobComplete();
>   }
> {noformat}
> The only possible explanation is that two threads are calling this method at 
> the same time: both pass the ensureState() check first, then one thread 
> updates the state to null while the other thread hits the NPE.
> We should fix this NPE.
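
The check-then-act race described in the issue can be illustrated with a minimal, self-contained sketch. This is not the committed patch and does not use Hadoop classes: FakeStatus, StatusHolder, unsafeUpdate, safeUpdate and the fetcher callback are all hypothetical names chosen for illustration. The sketch assumes the shared field can be overwritten with null when a lookup fails, and it replays the two-thread interleaving sequentially so that the outcome is deterministic.

```java
import java.util.function.Function;

/** Hypothetical stand-in for a job status record; not the Hadoop JobStatus class. */
final class FakeStatus {
    final String jobId;
    FakeStatus(String jobId) { this.jobId = jobId; }
}

/** Hypothetical holder that mimics the check-then-act pattern from the issue. */
final class StatusHolder {
    private final Function<String, FakeStatus> fetcher; // may return null, like a failed lookup
    private final String jobId;          // immutable id, safe to read from any thread
    private volatile FakeStatus status;  // shared mutable field

    StatusHolder(String jobId, Function<String, FakeStatus> fetcher) {
        this.jobId = jobId;
        this.fetcher = fetcher;
        this.status = new FakeStatus(jobId);
    }

    /** Racy variant: dereferences the shared field, then overwrites it, possibly with null. */
    void unsafeUpdate() {
        status = fetcher.apply(status.jobId); // NPE if an earlier call already stored null
    }

    /** Safer variant: reads the immutable id and only publishes a non-null result. */
    synchronized void safeUpdate() {
        FakeStatus fresh = fetcher.apply(jobId);
        if (fresh == null) {
            throw new IllegalStateException("Job status not available");
        }
        status = fresh;
    }

    FakeStatus current() { return status; }
}

final class StatusRaceDemo {
    public static void main(String[] args) {
        // A fetcher that always fails, standing in for a lookup that finds nothing.
        StatusHolder holder = new StatusHolder("job_1", id -> null);
        holder.unsafeUpdate(); // "thread A": stores null into the shared field
        try {
            holder.unsafeUpdate(); // "thread B": dereferences the null field
        } catch (NullPointerException e) {
            System.out.println("NPE on second unsafe update");
        }
    }
}
```

In the safer variant, a failed lookup surfaces as an explicit exception and the previously published value stays in place, so a concurrent caller never observes a null field. Whether that matches the behavior wanted in Job.java is a design question for the actual patch.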



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-issues-h...@hadoop.apache.org
