[ https://issues.apache.org/jira/browse/HDFS-162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Harsh J resolved HDFS-162.
--------------------------
    Resolution: Cannot Reproduce

This has gone stale. We aren't seeing such a problem with the stable versions today.

> DataNode fails to deliver blocks, holds thousands of open socket connections
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-162
>                 URL: https://issues.apache.org/jira/browse/HDFS-162
>             Project: Hadoop HDFS
>          Issue Type: Bug
>         Environment: Hadoop-0.18.0, 7-node Linux grid (6 DataNodes, 1 master node)
>                      Hadoop-0.18.0, 20-node EC2 Linux grid (19 DataNodes, 1 master node)
>            Reporter: Christopher Gillett
>            Priority: Critical
>         Attachments: logs.tar.gz, logsforHadoopTeam.tar.gz
>
>
> 9/27 update: uploaded the logs, with hopefully all the bits that should be
> examined. If other things are needed, just let me know. Note that all the
> paths refer to 0.18.1; this is still a 0.18.0 installation using the 0.18.0
> core jar, just installed to a non-standard location.
> 9/26 update: we have successfully reproduced this using Hadoop 0.18 as well.
> The problem happens both on our own network infrastructure and on an
> Amazon EC2 cluster running CentOS 5 images. I'll attach the logs Raghu
> asked for shortly.
> A job that used to run correctly on our grid (in 0.15.0) now fails. The
> failure occurs after the map phase is complete, about two thirds of the
> way through the reduce phase. The job processes a modest amount of input
> data (approximately 220 GB).
> When the error occurs, the nodes hosting DataNodes have literally
> thousands of open socket connections on them, and the DataNode instances
> are holding large amounts of memory. Sometimes the DataNodes crash or
> exit; other times they continue to run.
> The error reported from the application's perspective is:
> 08/05/27 11:30:08 INFO mapred.JobClient:  map 100% reduce 89%
> 08/05/27 11:30:41 INFO mapred.JobClient:  map 100% reduce 90%
> 08/05/27 11:32:45 INFO mapred.JobClient:  map 100% reduce 86%
> 08/05/27 11:32:45 INFO mapred.JobClient: Task Id : task_200805271056_0001_r_000007_0, Status : FAILED
> java.io.IOException: Could not get block locations. Aborting...
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:1832)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1100(DFSClient.java:1487)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1579)
> I then discovered that one or more DataNode instances on the slave nodes
> are down (we run one DataNode instance per machine). The cause of at
> least some of the DataNode failures is a JVM internal error raised in a
> complete out-of-memory scenario (on a 4 GB, 4-way machine).
> Watching the DataNodes run, I can see them consuming more and more
> memory. For those failures that produce a JVM traceback, I see, in part
> (note: this particular traceback is from 0.16.4):
> #
> # java.lang.OutOfMemoryError: requested 16 bytes for CHeapObj-new. Out of swap space?
> #
> #  Internal Error (414C4C4F434154494F4E0E494E4C494E450E4850500017), pid=4246, tid=2283883408
> #
> # Java VM: Java HotSpot(TM) Server VM (1.6.0_02-b05 mixed mode)
> # If you would like to submit a bug report, please visit:
> #   http://java.sun.com/webapps/bugreport/crash.jsp
> #
> ---------------  T H R E A D  ---------------
> Current thread (0x8a942000):  JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@3f4f44" daemon [_thread_in_Java, id=15064]
> Stack: [0x881c4000,0x88215000),  sp=0x882139e0,  free space=318k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x53b707]
> V  [libjvm.so+0x225fe1]
> V  [libjvm.so+0x16fdc5]
> V  [libjvm.so+0x22aef3]
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
> v  blob 0xf4f235a7
> J  java.io.DataInputStream.readInt()I
> j  org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(Ljava/io/DataOutputStream;Ljava/io/DataInputStream;Ljava/io/DataOutputStream;Ljava/lang/String;Lorg/apache/hadoop/dfs/DataNode$Throttler;I)V+126
> j  org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(Ljava/io/DataInputStream;)V+746
> j  org.apache.hadoop.dfs.DataNode$DataXceiver.run()V+174
> j  java.lang.Thread.run()V+11
> v  ~StubRoutines::call_stub
> ---------------  P R O C E S S  ---------------
> Java Threads: ( => current thread )
>   0x0ae3f400 JavaThread "process reaper" daemon [_thread_blocked, id=26870]
>   0x852e6000 JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@e5dce1" daemon [_thread_in_vm, id=26869]
>   0x08a1cc00 JavaThread "PacketResponder 0 for Block blk_-6186975972786687394" daemon [_thread_blocked, id=26769]
>   0x852e5000 JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@c40bf8" daemon [_thread_in_native, id=26768]
>   0x0956e000 JavaThread "PacketResponder 0 for Block blk_-2322514873363546651" daemon [_thread_blocked, id=26767]
>   0x852e4400 JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@1ca61f9" daemon [_thread_in_native, id=26766]
>   0x09d3a400 JavaThread "PacketResponder 0 for Block blk_8926941945313450801" daemon [_thread_blocked, id=26764]
>   0x852e3c00 JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@1e186d9" daemon [_thread_in_native, id=26763]
>   0x0953d000 JavaThread "PacketResponder 0 for Block blk_4785883052769066976" daemon [_thread_blocked, id=26762]
>   0xb13a5c00 JavaThread "org.apache.hadoop.dfs.DataNode$DataXceiver@13d62aa" daemon [_thread_in_native, id=26761]
> The interesting part here is that if I count the number of JavaThreads
> running org.apache.hadoop.dfs.DataNode code in the traceback, I see
> 4,538 (!). That number of threads was surprising.
> Other DataNodes just exit without panicking the JVM.
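> A count like this is easy to verify mechanically. The sketch below is a
> minimal, illustrative scan of a HotSpot fatal-error log, not something
> from the original report: the default file name hs_err_pid4246.log is
> guessed from the pid in the traceback above, and the marker strings are
> assumptions about how these threads are named.
>
> import java.io.BufferedReader;
> import java.io.FileReader;
> import java.io.IOException;
>
> // Illustrative sketch only: tally DataXceiver and PacketResponder
> // entries in a JVM fatal-error log. Each live thread shows up as one
> // JavaThread line in the log's thread list.
> public class CountXceiverThreads {
>     public static void main(String[] args) throws IOException {
>         String path = (args.length > 0) ? args[0] : "hs_err_pid4246.log";
>         int xceivers = 0, responders = 0;
>         try (BufferedReader in = new BufferedReader(new FileReader(path))) {
>             String line;
>             while ((line = in.readLine()) != null) {
>                 if (!line.contains("JavaThread")) continue;
>                 if (line.contains("DataNode$DataXceiver")) xceivers++;
>                 if (line.contains("PacketResponder")) responders++;
>             }
>         }
>         System.out.println("DataXceiver threads:     " + xceivers);
>         System.out.println("PacketResponder threads: " + responders);
>     }
> }
>
> Run against the fatal-error log of a crashed DataNode, this gives a quick
> read on whether per-connection threads, rather than block data itself,
> are what is exhausting memory.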
> In either failure mode, the last few lines of the DataNode log file are
> apparently innocuous:
> 2008-05-27 11:31:47,663 INFO org.apache.hadoop.dfs.DataNode: Datanode 2 got response for connect ack from downstream datanode with firstbadlink as
> 2008-05-27 11:31:47,663 INFO org.apache.hadoop.dfs.DataNode: Datanode 2 forwarding connect ack to upstream firstbadlink is
> 2008-05-27 11:31:48,268 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_-2241766430103062484 src: /10.2.14.10:33626 dest: /10.2.14.10:50010
> 2008-05-27 11:31:48,740 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_313239508245918539 src: /10.2.14.24:37836 dest: /10.2.14.24:50010
> 2008-05-27 11:31:48,740 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
> 2008-05-27 11:31:49,044 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_1684581399908730353 src: /10.2.14.16:51605 dest: /10.2.14.16:50010
> 2008-05-27 11:31:49,044 INFO org.apache.hadoop.dfs.DataNode: Datanode 0 forwarding connect ack to upstream firstbadlink is
> 2008-05-27 11:31:49,509 INFO org.apache.hadoop.dfs.DataNode: Receiving block blk_2493969670086107736 src: /10.2.14.18:47557 dest: /10.2.14.18:50010
> 2008-05-27 11:31:49,513 INFO org.apache.hadoop.dfs.DataNode: Datanode 1 got response for connect ack from downstream datanode with firstbadlink as
> 2008-05-27 11:31:49,513 INFO org.apache.hadoop.dfs.DataNode: Datanode 1 forwarding connect ack to upstream firstbadlink is
> Finally, the task-level output (in userlogs) doesn't reveal much either:
> 2008-05-27 11:38:30,724 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 Need 34 map output(s)
> 2008-05-27 11:38:30,753 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 done copying task_200805271056_0001_m_001976_0 output from worker9.
> 2008-05-27 11:38:31,727 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1: Got 0 new map-outputs & 0 obsolete map-outputs from tasktracker and 0 map-outputs from previous failures
> 2008-05-27 11:38:31,727 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 Got 33 known map output location(s); scheduling...
> 2008-05-27 11:38:31,727 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 Scheduled 1 of 33 known outputs (0 slow hosts and 32 dup hosts)
> 2008-05-27 11:38:31,727 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 Copying task_200805271056_0001_m_001248_0 output from worker8.
> 2008-05-27 11:38:31,727 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 Need 33 map output(s)
> 2008-05-27 11:38:31,752 INFO org.apache.hadoop.mapred.ReduceTask: task_200805271056_0001_r_000007_1 done copying task_200805271056_0001_m_001248_0 output from worker8.
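For anyone who finds this issue later: the runaway DataXceiver thread
count described above is the failure mode that newer releases bound with
a per-DataNode xceiver cap. The sketch below merely reads that cap and is
a hedged illustration only; the key name varies by release
(dfs.datanode.max.xcievers is the historically misspelled key,
dfs.datanode.max.transfer.threads the newer name), and the fallback
default of 256 is the old one, so verify both against the
hdfs-default.xml of the version you actually run.

import org.apache.hadoop.conf.Configuration;

// Illustrative only: report the effective DataXceiver cap on this
// classpath. Which key a given release honors is an assumption to check.
public class ShowXceiverCap {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.addResource("hdfs-site.xml"); // assumes it is on the classpath
        int cap = conf.getInt("dfs.datanode.max.transfer.threads",
                conf.getInt("dfs.datanode.max.xcievers", 256));
        System.out.println("Effective DataXceiver cap: " + cap);
    }
}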