Re: mapper java process not exiting
Is there a reason for using OpenJDK and not Sun's JDK?

The cluster where we are seeing the problem uses Sun's JDK:

java version 1.6.0_21
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

The standalone node where I tried to reproduce the issue uses OpenJDK, and it does not show the problem because it is able to reuse JVMs.

-Adi

Also... I believe there were noted issues with the .17 JDK. I will look for a link and post it if I can find one. Otherwise, this is behaviour I have seen before: Hadoop detaches from the JVM and stops seeing it. I think your problem lies in the JDK and not Hadoop.

On May 12, 2011 at 8:12 PM, Adi adi.pan...@gmail.com wrote:

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process

Your logs showed that Hadoop tried to kill processes but the kill command claimed they didn't exist. The next time you see this problem, can you check the logs and see if any of the PIDs that appear in the logs are in fact still running? A more likely scenario is that Hadoop's tracking of child VMs is getting out of sync, but I'm not sure what would cause that.

Yes, those java processes are in fact running. And those error messages do not always show up. Just sometimes. But the processes never get cleaned up.

-Adi
Re: mapper java process not exiting
You posted system specifics earlier; would you mind posting them again? I can't find them in the thread.

On May 13, 2011, at 8:05 AM, Adi adi.pan...@gmail.com wrote:

Is there a reason for using OpenJDK and not Sun's JDK?

The cluster where we are seeing the problem uses Sun's JDK:

java version 1.6.0_21
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

The standalone node where I tried to reproduce the issue uses OpenJDK, and it does not show the problem because it is able to reuse JVMs.

-Adi

Also... I believe there were noted issues with the .17 JDK. I will look for a link and post it if I can find one. Otherwise, this is behaviour I have seen before: Hadoop detaches from the JVM and stops seeing it. I think your problem lies in the JDK and not Hadoop.

On May 12, 2011 at 8:12 PM, Adi adi.pan...@gmail.com wrote:

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process

Your logs showed that Hadoop tried to kill processes but the kill command claimed they didn't exist. The next time you see this problem, can you check the logs and see if any of the PIDs that appear in the logs are in fact still running? A more likely scenario is that Hadoop's tracking of child VMs is getting out of sync, but I'm not sure what would cause that.

Yes, those java processes are in fact running. And those error messages do not always show up. Just sometimes. But the processes never get cleaned up.

-Adi
Re: mapper java process not exiting
Which version of hadoop are you running? Are you running on linux?

-Joey

On Thu, May 12, 2011 at 1:39 PM, Adi adi.pan...@gmail.com wrote:

For one long-running job we are noticing that the mapper JVMs do not exit even after the mapper is done. Any suggestions on why this could be happening?

The java processes get cleaned up if I do a hadoop job -kill job_id. They also get cleaned up if I run it in a smaller batch and the job gets done fairly quickly (say half an hour). For larger inputs the nodes eventually run out of memory because of these java processes that the cluster thinks are gone but that haven't been cleaned up yet.

I suspect the TaskTrackers are failing to kill the JVMs by themselves for some reason. The following exceptions can be seen in the hadoop logs.

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process
2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process

Some other exceptions also seen in the logs may or may not be related to the above problem.

2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:58,977 INFO org.apache.hadoop.ipc.Server: IPC Server handler 28 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:02:04,040 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:02:09,095 INFO org.apache.hadoop.ipc.Server: IPC Server handler 100 on 33465 caught: java.nio.channels.ClosedChannelException

Thanks.
-Adi

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
Re: mapper java process not exiting
Which version of hadoop are you running?

Hadoop 0.21.0 with some patches.

Are you running on linux?

Yes.

Linux 2.6.18-238.9.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
java version 1.6.0_21
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)

I set up 0.21.0 on another linux box and am not seeing this issue there, as hadoop is reusing JVMs (as configured). In the production cluster it is not reusing JVMs and runs out of memory because mapper JVMs stay alive even after they have ended according to hadoop.

The production node is a 64-bit OS/JVM. Is there any known issue or workaround for enabling JVM reuse in 64-bit environments?

Test node is 32-bit:

Linux 2.6.18-194.32.1.el5.centos.plus #1 SMP i686 i686 i386 GNU/Linux
java version 1.6.0_17
OpenJDK Runtime Environment (IcedTea6 1.7.5) (rhel-1.16.b17.el5-i386)
OpenJDK Server VM (build 14.0-b16, mixed mode)

Even if I can get it to reuse JVMs it will be grrreat.

-Adi

-Joey

On Thu, May 12, 2011 at 1:39 PM, Adi adi.pan...@gmail.com wrote:

For one long-running job we are noticing that the mapper JVMs do not exit even after the mapper is done. Any suggestions on why this could be happening?

The java processes get cleaned up if I do a hadoop job -kill job_id. They also get cleaned up if I run it in a smaller batch and the job gets done fairly quickly (say half an hour). For larger inputs the nodes eventually run out of memory because of these java processes that the cluster thinks are gone but that haven't been cleaned up yet.

I suspect the TaskTrackers are failing to kill the JVMs by themselves for some reason. The following exceptions can be seen in the hadoop logs.

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process
2011-05-12 13:52:08,071 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11061: No such process
2011-05-12 13:52:09,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11151: No such process
2011-05-12 13:52:12,009 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -25057: No such process
2011-05-12 13:52:13,306 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -19805: No such process
2011-05-12 13:52:14,996 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -11103: No such process
2011-05-12 15:51:41,105 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -17202: No such process
2011-05-12 15:51:43,481 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -15981: No such process
2011-05-12 15:51:45,916 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -17931: No such process
2011-05-12 15:52:06,328 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -14867: No such process
2011-05-12 15:52:34,503 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -29376: No such process
2011-05-12 15:52:38,607 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -32491: No such process
2011-05-12 15:52:39,292 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -31529: No such process
2011-05-12 15:52:46,547 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -15140: No such process

Some other exceptions also seen in the logs may or may not be related to the above problem.

2011-05-12 16:01:20,534 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:48,869 INFO org.apache.hadoop.ipc.Server: IPC Server handler 80 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:53,922 INFO org.apache.hadoop.ipc.Server: IPC Server handler 59 on 33465 caught: java.nio.channels.ClosedChannelException
2011-05-12 16:01:58,977 INFO
Re: mapper java process not exiting
Hadoop 0.21.0 with some patches.

Hadoop 0.21.0 doesn't get much use, so I'm not sure how much help I can be.

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process

Your logs showed that Hadoop tried to kill processes but the kill command claimed they didn't exist. The next time you see this problem, can you check the logs and see if any of the PIDs that appear in the logs are in fact still running? A more likely scenario is that Hadoop's tracking of child VMs is getting out of sync, but I'm not sure what would cause that.

-Joey

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
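One way to carry out the check suggested above is to pull the numeric IDs out of the "kill -N: No such process" warnings and probe each one with signal 0. This is only a minimal sketch, not something posted in the thread: the names extract_ids, is_running and group_is_running are made up for illustration, and it assumes the TaskTracker log lines look like the excerpts quoted here. The leading minus in "kill -12545" may mean the ID is a process group rather than a single PID (an assumption), so the sketch offers a probe for both.

```python
import os
import re

# Matches the warning text quoted in the thread,
# e.g. "kill -12545: No such process".
KILL_FAILED = re.compile(r"kill -(\d+): No such process")


def extract_ids(log_text):
    """Return the distinct IDs from failed kill warnings, in order of appearance."""
    seen = []
    for match in KILL_FAILED.finditer(log_text):
        value = int(match.group(1))
        if value not in seen:
            seen.append(value)
    return seen


def is_running(pid):
    """Probe a PID with signal 0: checks existence without delivering a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def group_is_running(pgid):
    """Same probe for a process group (assumption: the logged ID is a group ID)."""
    try:
        os.killpg(pgid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True
    return True


def report(log_text):
    """Return (id, pid_alive, group_alive) for every ID found in the log text."""
    return [(v, is_running(v), group_is_running(v)) for v in extract_ids(log_text)]
```

Reading a TaskTracker log into a string and passing it to report() on the affected node would show which of the logged IDs still correspond to a live process or process group. PIDs can be reused by the kernel, so a hit is a hint to look closer rather than proof that the original task JVM survived.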
Re: mapper java process not exiting
2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process

Your logs showed that Hadoop tried to kill processes but the kill command claimed they didn't exist. The next time you see this problem, can you check the logs and see if any of the PIDs that appear in the logs are in fact still running? A more likely scenario is that Hadoop's tracking of child VMs is getting out of sync, but I'm not sure what would cause that.

Yes, those java processes are in fact running. And those error messages do not always show up. Just sometimes. But the processes never get cleaned up.

-Adi
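To see which task JVMs are still around on a node, one option is to filter a process listing for the task JVM's main class. The sketch below is an illustration only: it assumes the child JVMs show org.apache.hadoop.mapred.Child on their command line (an assumption about this Hadoop build, not something confirmed in the thread) and that "ps -eo pid,ppid,args" is available on the node. find_task_jvms and list_task_jvms are hypothetical helper names.

```python
import subprocess

# Assumption: task JVMs are launched with this main class on the command line.
CHILD_MAIN_CLASS = "org.apache.hadoop.mapred.Child"


def find_task_jvms(ps_output, marker=CHILD_MAIN_CLASS):
    """Parse `ps -eo pid,ppid,args` output into (pid, ppid, args) tuples."""
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # ignore blank or malformed rows
        pid, ppid, args = int(parts[0]), int(parts[1]), parts[2]
        if marker in args:
            found.append((pid, ppid, args))
    return found


def list_task_jvms():
    """Run ps on the local node and return the task JVMs it reports."""
    output = subprocess.run(
        ["ps", "-eo", "pid,ppid,args"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return find_task_jvms(output)
```

Running list_task_jvms() after the job's mappers have finished, and comparing the PIDs it returns with the IDs in the ProcessTree warnings, would give a rough picture of which JVMs the cluster considers gone but which are still holding memory.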
Re: mapper java process not exiting
Is there a reason for using OpenJDK and not Sun's JDK?

Also... I believe there were noted issues with the .17 JDK. I will look for a link and post it if I can find one. Otherwise, this is behaviour I have seen before: Hadoop detaches from the JVM and stops seeing it. I think your problem lies in the JDK and not Hadoop.

On May 12, 2011 at 8:12 PM, Adi adi.pan...@gmail.com wrote:

2011-05-12 13:52:04,147 WARN org.apache.hadoop.mapreduce.util.ProcessTree: Error executing shell command org.apache.hadoop.util.Shell$ExitCodeException: kill -12545: No such process

Your logs showed that Hadoop tried to kill processes but the kill command claimed they didn't exist. The next time you see this problem, can you check the logs and see if any of the PIDs that appear in the logs are in fact still running? A more likely scenario is that Hadoop's tracking of child VMs is getting out of sync, but I'm not sure what would cause that.

Yes, those java processes are in fact running. And those error messages do not always show up. Just sometimes. But the processes never get cleaned up.

-Adi