Previously I had a cluster of 8 nodes and it worked well. I added 24 new datanodes to the cluster. The tasktracker and datanode daemons start, but when I shut down the cluster I find the following errors on the newly added datanodes. Can anyone explain it?

log from tasktracker:

2010-09-26 09:52:21,672 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG: host = localhost/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2010-09-26 09:52:21,876 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2010-09-26 09:52:22,006 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2010-09-26 09:52:22,014 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2010-09-26 09:52:22,014 INFO org.mortbay.log: jetty-6.1.14
2010-09-26 09:52:42,715 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50060
2010-09-26 09:52:42,722 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=TaskTracker, sessionId=
2010-09-26 09:52:42,737 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=TaskTracker, port=28404
2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2010-09-26 09:52:42,793 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 28404: starting
2010-09-26 09:52:42,794 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 4 on 28404: starting
2010-09-26 09:52:42,795 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 10 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 11 on 28404: starting
2010-09-26 09:52:42,796 INFO org.apache.hadoop.ipc.Server: IPC Server handler 12 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 13 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 14 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.ipc.Server: IPC Server handler 15 on 28404: starting
2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:28404
2010-09-26 09:52:42,797 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_localhost:localhost/127.0.0.1:28404
2010-09-26 09:52:55,025 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_localhost:localhost/127.0.0.1:28404
2010-09-26 09:52:55,027 INFO org.apache.hadoop.mapred.TaskTracker: Using MemoryCalculatorPlugin : org.apache.hadoop.util.linuxmemorycalculatorplu...@2de12f6d
2010-09-26 09:52:55,031 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
2010-09-26 09:52:55,032 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
2010-09-26 09:54:13,298 ERROR org.apache.hadoop.mapred.TaskTracker: Caught exception: java.io.IOException: Call to vm221/10.11.2.221:9001 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
    at org.apache.hadoop.ipc.Client.call(Client.java:743)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at org.apache.hadoop.mapred.$Proxy4.heartbeat(Unknown Source)
    at org.apache.hadoop.mapred.TaskTracker.transmitHeartBeat(TaskTracker.java:1215)
    at org.apache.hadoop.mapred.TaskTracker.offerService(TaskTracker.java:1037)
    at org.apache.hadoop.mapred.TaskTracker.run(TaskTracker.java:1720)
    at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:2833)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
2010-09-26 09:54:13,298 INFO org.apache.hadoop.mapred.TaskTracker: Resending 'status' to 'vm221' with reponseId '25
2010-09-26 09:54:14,303 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: vm221/10.11.2.221:9001. Already tried 0 time(s).
2010-09-26 09:54:14,379 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at localhost/127.0.0.1
************************************************************/

datanode log:

2010-09-26 09:54:16,234 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Call to vm221/10.11.2.221:9000 failed on local exception: java.io.EOFException
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
    at org.apache.hadoop.ipc.Client.call(Client.java:743)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
    at $Proxy4.sendHeartbeat(Unknown Source)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:702)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1186)
    at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:375)
    at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
2010-09-26 09:54:17,660 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at localhost/127.0.0.1
************************************************************/
2010-09-26 09:55:43,072 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = localhost/127.0.0.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
2010-09-26 09:56:39,903 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2010-09-26 09:56:39,905 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2010-09-26 09:56:39,908 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 10485760 bytes/s
2010-09-26 09:56:39,969 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2010-09-26 09:56:40,040 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2010-09-26 09:56:40,040 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2010-09-26 09:56:40,040 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2010-09-26 09:56:40,040 INFO org.mortbay.log: jetty-6.1.14
2010-09-26 09:56:55,697 INFO org.mortbay.log: Started selectchannelconnec...@0.0.0.0:50075
2010-09-26 09:56:55,705 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2010-09-26 09:56:55,724 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC Metrics with hostName=DataNode, port=50020
2010-09-26 09:56:55,727 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2010-09-26 09:56:55,728 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2010-09-26 09:56:55,729 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2010-09-26 09:56:55,730 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2010-09-26 09:56:55,730 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2010-09-26 09:56:55,730 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(localhost:50010, storageID=DS-626517105-127.0.0.1-50010-1285405012557, infoPort=50075, ipcPort=50020)
2010-09-26 09:56:55,737 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.11.3.207:50010, storageID=DS-626517105-127.0.0.1-50010-1285405012557, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/data0/hadoop-root/dfs/data/current'}
2010-09-26 09:56:55,738 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec

2010-09-26
shangan