Hi,

Some MapReduce jobs on small files have been running on our Hadoop
cluster these days, under a fairly light workload. When I checked
tonight, though, the cluster had stopped responding, so I restarted
HDFS and MapReduce.

But unexpected exceptions were thrown during startup, and even the
"hadoop fs -ls" command got no response. The system seems to hang there.

BTW: at startup there was a message reporting a connection timeout on
port 22. I checked $HADOOP_SSH_OPTS in hadoop-env.sh, where the
connection timeout was set to 1, so I changed it to 1000. That error
then disappeared, but the following exception still occurs.
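For reference, the changed line looks roughly like the fragment below. This is a sketch, not a copy of my file: the variable is spelled HADOOP_SSH_OPTS in the stock hadoop-env.sh template, and the SendEnv option is assumed from that template. Note that ssh reads ConnectTimeout in seconds.

```shell
# conf/hadoop-env.sh (fragment)
# Extra options passed to ssh by the start/stop scripts.
# ConnectTimeout is in seconds; it was 1 before the change.
# The SendEnv option is assumed from the stock template.
export HADOOP_SSH_OPTS="-o ConnectTimeout=1000 -o SendEnv=HADOOP_CONF_DIR"
```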

Thanks for any help!

2011-04-11 21:39:57,028 INFO org.apache.hadoop.mapred.JobTracker: problem cleaning system directory: null
java.io.IOException: Call to job-tracker/192.168.14.210:8020 failed on local exception: java.io.IOException: Connection reset by peer
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:774)
        at org.apache.hadoop.ipc.Client.call(Client.java:742)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy4.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
        at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:105)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1373)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1385)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:191)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
        at org.apache.hadoop.mapred.JobTracker.<init>(JobTracker.java:1607)
        at org.apache.hadoop.mapred.JobTracker.startTracker(JobTracker.java:174)
        at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:3528)
Caused by: java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:21)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:233)
        at sun.nio.ch.IOUtil.read(IOUtil.java:206)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:236)
        at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.FilterInputStream.read(FilterInputStream.java:116)
        at org.apache.hadoop.ipc.Client$Connection$PingInputStream.read(Client.java:276)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        at java.io.DataInputStream.readInt(DataInputStream.java:370)
        at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
        at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
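
One check that might help narrow this down: the trace reports a reset rather than a refused connection, which suggests something accepted the connection on port 8020 and then dropped it. A minimal sketch for probing the port from the JobTracker host, assuming bash (for its /dev/tcp redirection) and the coreutils timeout command are available; port_open is a hypothetical helper name, not a Hadoop tool.

```shell
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection can be opened, non-zero otherwise.
port_open() {
  local host="$1" port="$2" wait="${3:-5}"
  # bash opens a TCP socket when redirecting to /dev/tcp/HOST/PORT;
  # timeout bounds the attempt so a silent host cannot hang the check.
  timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

With the address from the log above, it would be called as `port_open 192.168.14.210 8020 && echo "port accepts connections"`. A success only shows that the port is listening, not that the NameNode is healthy.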


-- 
Yang Jie(杨杰)
hi.baidu.com/thinkdifferent

Group of CLOUD, Xi'an Jiaotong University
Department of Computer Science and Technology, Xi’an Jiaotong University

PHONE: 86 1346888 3723
TEL: 86 29 82665263 EXT. 608
MSN: xtyangjie2...@yahoo.com.cn

freedom is not free.
