[ https://issues.apache.org/jira/browse/PIG-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15424807#comment-15424807 ]
Xiang Li commented on PIG-4967:
-------------------------------

Hi Daniel, thanks for the explanation!

Regarding
bq. I am not sure what's the nature of status=null

Here is what I have found so far. In Hadoop's Job class, the JobStatus field status is updated by updateStatus(), starting at line 318:
{code}
synchronized void updateStatus() throws IOException {
  try {
    this.status = ugi.doAs(new PrivilegedExceptionAction<JobStatus>() {
      @Override
      public JobStatus run() throws IOException, InterruptedException {
        return cluster.getClient().getJobStatus(status.getJobID());
      }
    });
  }
  catch (InterruptedException ie) {
    throw new IOException(ie);
  }
  if (this.status == null) {
    throw new IOException("Job status not available ");
  }
  this.statustime = System.currentTimeMillis();
}
{code}
I think this is not safe, because this.status is assigned whatever ugi.doAs() returns. If it returns null (maybe due to a network problem), this.status is set to null directly, and another thread calling getJobName() then sees status=null. The code after the assignment does check whether this.status is null and throws an IOException, but by then the field has already been overwritten. Strangely, I did not see this IOException in the Hadoop log.

We also found the following message in the app-master log:
bq. INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.SocketTimeoutException: 70000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=xxxx remote=xxxx

but so far I cannot tell whether status=null has anything to do with that timeout.

Anyway, I will upload the patch soon.

Thanks!

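To illustrate the concern about assigning the field before the null check, here is a minimal sketch of the alternative ordering: fetch into a local variable, check it, and only then publish it to the field. This is not the patch for PIG-4967 and not Hadoop's code. StatusHolder, describe() and the Supplier-based client are hypothetical stand-ins, and a plain String stands in for JobStatus.

```java
import java.io.IOException;
import java.util.function.Supplier;

// Hypothetical stand-in for a job handle; NOT org.apache.hadoop.mapreduce.Job.
class StatusHolder {
    // Last known good status; a String stands in for JobStatus here.
    private String status;
    // Hypothetical remote lookup that may return null (e.g. on a network problem).
    private final Supplier<String> client;

    StatusHolder(String initialStatus, Supplier<String> client) {
        this.status = initialStatus;
        this.client = client;
    }

    // Fetch into a local first; the field is only overwritten with a non-null value.
    synchronized void updateStatus() throws IOException {
        String fetched = client.get();
        if (fetched == null) {
            throw new IOException("Job status not available");
        }
        this.status = fetched;
    }

    // Reader that tolerates a missing status instead of dereferencing null.
    synchronized String describe() {
        return status == null ? "(status unknown)" : status;
    }

    public static void main(String[] args) {
        // The client always returns null, simulating a failed lookup.
        StatusHolder holder = new StatusHolder("RUNNING", () -> null);
        try {
            holder.updateStatus();
        } catch (IOException e) {
            System.out.println("update failed: " + e.getMessage());
        }
        // The previous status is still in place after the failed update.
        System.out.println(holder.describe());
    }
}
```

With this ordering a failed lookup still raises the IOException, but readers on other threads keep seeing the previous status rather than null.
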
> NPE in PigJobControl.run() when job status is null
> --------------------------------------------------
>
>                 Key: PIG-4967
>                 URL: https://issues.apache.org/jira/browse/PIG-4967
>             Project: Pig
>          Issue Type: Bug
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Critical
>
> {code}
> [JobControl] ERROR org.apache.pig.backend.hadoop23.PigJobControl - Error while trying to run jobs.
> java.lang.NullPointerException
>     at org.apache.hadoop.mapreduce.Job.getJobName(Job.java:426)
>     at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.toString(ControlledJob.java:93)
>     at java.lang.String.valueOf(String.java:2982)
>     at java.lang.StringBuilder.append(StringBuilder.java:131)
>     at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:182)
>     at java.lang.Thread.run(Thread.java:745)
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:276)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)