Can anyone shed some light on this exception?  We are on a 60-node, 260-core, 
0.19.0 cluster, and everything hums along fine, but every one or two weeks we 
see a bunch of these in the logs: the first on heartbeat, and then one on 
submitJob for every job we try to start.  The JobTracker becomes unresponsive 
and has to be killed and restarted, but the TaskTrackers all appear fine, and 
in fact we never stop or restart them during this.  After the restart, the same 
job submits without issue.  The only thing I noticed in the logs leading up to 
it was that a job finished just before the first job to trigger this error 
started up.  Here is the full trace leading up to it, and thanks for the help:

2009-06-07 01:12:12,578 INFO org.apache.hadoop.mapred.JobInProgress: Job 
job_200906022136_0106 has completed successfully.
2009-06-07 01:12:12,599 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 
on 54311, call heartbeat(org.apache.hadoop.mapred.tasktrackersta...@8df3ca7, 
false, true, -22339) from 172.21.30.46:60840: error: java.io.IOException: 
java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
        at org.apache.hadoop.ipc.Client.call(Client.java:686)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.complete(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.complete(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3129)
        at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3053)
        at 
org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:59)
        at 
org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:79)
        at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:301)
        at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:130)
        at java.io.OutputStreamWriter.close(OutputStreamWriter.java:216)
        at java.io.BufferedWriter.close(BufferedWriter.java:248)
        at java.io.PrintWriter.close(PrintWriter.java:295)
        at 
org.apache.hadoop.mapred.JobHistory$JobInfo.logFinished(JobHistory.java:1024)
        at 
org.apache.hadoop.mapred.JobInProgress.jobComplete(JobInProgress.java:1906)
        at 
org.apache.hadoop.mapred.JobInProgress.completedTask(JobInProgress.java:1855)
        at 
org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:786)
        at 
org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:2613)
        at 
org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:2056)
        at org.apache.hadoop.mapred.JobTracker.heartbeat(JobTracker.java:1866)
        at sun.reflect.GeneratedMethodAccessor23.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
2009-06-07 01:12:14,632 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000028_1' from 
'tracker_dup041.iad.hadoop.net:localhost.localdomain/127.0.0.1:56041'
2009-06-07 01:12:15,064 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000019_2' from 
'tracker_dup022.iad.hadoop.net:localhost.localdomain/127.0.0.1:44525'
2009-06-07 01:12:15,463 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000009_2' from 
'tracker_dup042.iad.hadoop.net:localhost.localdomain/127.0.0.1:47675'
2009-06-07 01:12:15,927 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000016_2' from 
'tracker_dup008.iad.hadoop.net:localhost.localdomain/127.0.0.1:46724'
2009-06-07 01:12:15,986 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000007_2' from 
'tracker_dup003.iad.hadoop.net:localhost.localdomain/127.0.0.1:44460'
2009-06-07 01:12:16,300 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000014_2' from 
'tracker_dup034.iad.hadoop.net:localhost.localdomain/127.0.0.1:35848'
2009-06-07 01:12:16,400 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000025_2' from 
'tracker_dup009.iad.hadoop.net:localhost.localdomain/127.0.0.1:53461'
2009-06-07 01:12:16,409 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_m_003000_0' from 
'tracker_dup046.iad.hadoop.net:localhost.localdomain/127.0.0.1:53340'
2009-06-07 01:12:16,494 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000018_2' from 
'tracker_dup048.iad.hadoop.net:localhost.localdomain/127.0.0.1:51644'
2009-06-07 01:12:16,773 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000005_2' from 
'tracker_dup060.iad.hadoop.net:localhost.localdomain/127.0.0.1:52454'
2009-06-07 01:12:19,717 INFO org.apache.hadoop.mapred.JobTracker: Removed 
completed task 'attempt_200906022136_0106_r_000032_1' from 
'tracker_dup016.iad.hadoop.net:localhost.localdomain/127.0.0.1:41496'
2009-06-07 01:12:33,121 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 
on 54311, call submitJob(job_200906022136_0110) from 172.21.30.1:50229: error: 
java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
        at org.apache.hadoop.ipc.Client.call(Client.java:686)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.getFileInfo(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:578)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
        at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:212)
        at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2230)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
2009-06-07 02:00:07,348 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 
on 54311, call submitJob(job_200906022136_0111) from 172.21.31.248:39823: 
error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
        at org.apache.hadoop.ipc.Client.call(Client.java:686)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at $Proxy4.getFileInfo(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy4.getFileInfo(Unknown Source)
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:578)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:192)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:142)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1214)
        at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1195)
        at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:212)
        at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:2230)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:892)
2009-06-07 02:00:20,664 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 
on 54311, call submitJob(job_200906022136_0112) from 172.21.31.249:33716: 
error: java.io.IOException: java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at org.apache.hadoop.ipc.Client$Connection.sendParam(Client.java:459)
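
To check whether a job completion really does precede the first failure each time this happens, something like the following could be run over the JobTracker log. This is only a minimal sketch: the function name first_npe_context and the "before" parameter are made up for illustration, and it assumes every log entry starts with a timestamp in the format shown above, with stack frames and wrapped text on the lines that follow.

```python
import re

# Start of a log entry as seen in the trace above,
# e.g. "2009-06-07 01:12:12,578 INFO ..." (assumed format).
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3} ")


def first_npe_context(lines, before=1):
    """Find the first log entry that mentions java.lang.NullPointerException.

    Returns (previous_entries, failing_entry), where previous_entries holds
    up to `before` entries logged just ahead of the failure, or None if no
    entry mentions the exception.
    """
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if TIMESTAMP.match(line) or not entries:
            # A timestamped line opens a new entry.
            entries.append(line)
        else:
            # Stack frames and wrapped text belong to the current entry.
            entries[-1] += "\n" + line

    for i, entry in enumerate(entries):
        if "java.lang.NullPointerException" in entry:
            return entries[max(0, i - before):i], entry
    return None
```

It takes any iterable of lines, so an open file handle on the JobTracker log can be passed in directly. If the first element of the result is a "has completed successfully" entry on every occurrence, that would support the pattern described above.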

