Hello all,

I was trying to upgrade Hadoop 0.13.1 to 0.14.1, but when I followed the instructions at http://wiki.apache.org/lucene-hadoop/Hadoop_0.14_Upgrade and ran "./start-dfs.sh -upgrade", the upgrade made no progress.
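To avoid re-running the status command by hand, one can poll it and extract the datanode-average completion figure. A minimal Python sketch; it assumes `hadoop` is on the PATH and keys off the exact "Avg completion of all Datanodes: ..." wording shown below, so treat the regex as an assumption about the output format:

```python
import re
import subprocess
import time

# Matches the line "Avg completion of all Datanodes: 0.00% with 0 errors."
_AVG_RE = re.compile(r"Avg completion of all Datanodes:\s*([\d.]+)%")

def avg_completion(status_text):
    """Return the datanode-average completion percentage, or None if absent."""
    m = _AVG_RE.search(status_text)
    return float(m.group(1)) if m else None

def poll(interval=60):
    """Print the completion percentage once per interval until it reaches 100%."""
    while True:
        out = subprocess.run(
            ["hadoop", "dfsadmin", "-upgradeProgress", "status"],
            capture_output=True, text=True).stdout
        pct = avg_completion(out)
        print("upgrade completion: %s%%" % pct)
        if pct is not None and pct >= 100.0:
            break
        time.sleep(interval)
```

If the number never moves from 0.00% (as in my case below), the problem is upstream of the upgrade itself.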
I checked the status with ./hadoop dfsadmin -upgradeProgress status and got:

Distributed upgrade for version -6 is in progress. Status = 0%
Last Block Level Stats updated at : Thu Sep 13 02:04:43 GMT+08:00 2007
Last Block Level Stats :
        Total Blocks : 1661833
        Fully Upgragraded : 0.00%
        Minimally Upgraded : 0.00%
        Under Upgraded : 100.00% (includes Un-upgraded blocks)
        Un-upgraded : 100.00%
        Errors : 0
Brief Datanode Status :
        Avg completion of all Datanodes: 0.00% with 0 errors.

Then ./hadoop dfsadmin -upgradeProgress details gave the same block-level stats (updated at Thu Sep 13 02:09:47 GMT+08:00 2007), plus:

Datanode Stats (total: 0):
        pct Completion(%)    blocks upgraded (u)    blocks remaining (r)    errors (e)
        There are no known Datanodes

I also checked the namenode log and found one exception, as follows:

2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: starting
2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000: starting
2007-09-13 02:17:25,324 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000: starting
2007-09-13 02:17:25,325 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000: starting
2007-09-13 02:17:25,400 INFO org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Block CRC Upgrade is still running. Avg completion of all Datanodes: 0.00% with 0 errors.
2007-09-13 02:17:25,406 WARN org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call getProtocolVersion(org.apache.hadoop.dfs.ClientProtocol, 14) from 192.168.2.1:53211: output error
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:125)
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:294)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.flushBuffer(SocketChannelOutputStream.java:108)
        at org.apache.hadoop.ipc.SocketChannelOutputStream.write(SocketChannelOutputStream.java:89)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
        at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
        at java.io.DataOutputStream.flush(DataOutputStream.java:106)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:585)
2007-09-13 02:18:24,921 INFO org.apache.hadoop.dfs.BlockCrcUpgradeNamenode: Block CRC Upgrade is still running. Avg completion of all Datanodes: 0.00% with 0 errors.

It seems something is going wrong on the datanode side. However, the log of one of the datanodes shows that it started, and it is still running (I can see it in the process list), but it has somehow lost its connection to the namenode:

2007-09-12 22:23:35,319 INFO org.apache.hadoop.dfs.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = TE-DN-002/192.168.2.102
STARTUP_MSG:   args = []
************************************************************/
2007-09-12 22:23:35,533 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=DataNode, sessionId=null
2007-09-12 22:24:35,619 INFO org.apache.hadoop.ipc.RPC: Problem connecting to server: /192.168.2.1:9000
2007-09-12 22:25:34,878 INFO org.apache.hadoop.dfs.Storage: Recovering storage directory /home/textd/data/fs/data from previous upgrade.
2007-09-12 22:25:49,586 INFO org.apache.hadoop.dfs.DataNode: Distributed upgrade for DataNode version -6 to current LV -7 is initialized.
2007-09-12 22:25:49,586 INFO org.apache.hadoop.dfs.Storage: Upgrading storage directory /home/textd/data/fs/data. old LV = -4; old CTime = 0. new LV = -7; new CTime = 1189616555276

The hardware configuration:
Namenode: P4D, 3 GB RAM
3 Datanodes: AMD 64 4000 x2, 1 GB RAM

Everything worked with Hadoop 0.13.1. Any ideas or suggestions?
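Given the "Problem connecting to server: /192.168.2.1:9000" line in the datanode log and the namenode reporting no known datanodes, plain TCP reachability of the namenode RPC port may be worth ruling out first. A minimal check, using the address and port from the logs above (this helper is my own sketch, not part of Hadoop):

```python
import socket

def can_reach(host, port, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run from each datanode against the namenode RPC address from the logs:
#   can_reach("192.168.2.1", 9000)
```

If this returns False from a datanode, the problem is network/firewall configuration rather than the upgrade itself.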