Looking at the code, it seems like the node is getting data that it is not supposed to receive. The write operation in the DataXceiver fails because such a block already exists on that node, so it just bails out and closes the connection. Now the question is: how can that happen? ...and how bad is it?
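
For anyone reading along, the failure mode is easy to illustrate in miniature. The sketch below is a hypothetical stand-in for the check in FSDataset.writeToBlock, not the actual Hadoop 0.10.x code - class and field names are mine, and only the exception message is taken from the quoted log:

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

// Simplified stand-in for the "block is valid" check: a block that is
// already finalized on this datanode must never be re-opened for writing.
class BlockWriteSketch {
    // ids of blocks already finalized on this (hypothetical) datanode
    private final Set<Long> finalizedBlocks = new HashSet<Long>();

    void writeToBlock(long blockId) throws IOException {
        if (finalizedBlocks.contains(blockId)) {
            // This is the exception from the log; the real DataXceiver
            // catches it, logs it, and closes the connection.
            throw new IOException("Block blk_" + blockId
                    + " is valid, and cannot be written to.");
        }
        finalizedBlocks.add(blockId); // pretend the write finalized the block
    }

    public static void main(String[] args) throws IOException {
        BlockWriteSketch node = new BlockWriteSketch();
        node.writeToBlock(-6080554387210237245L); // first write: fine
        node.writeToBlock(-6080554387210237245L); // second write: throws
    }
}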

Any feedback would be great.

cheers
--
Torsten

On 20.08.2007, at 00:47, Torsten Curdt wrote:


I think this might have come up before (though I couldn't find anything in the archives) and was deemed not serious, but we just had a node spit out 5000 of these errors per hour:

2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] [hadoop.org.apache.hadoop.dfs.DataNode] DataXCeiver
2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] java.io.IOException: Block blk_-6080554387210237245 is valid, and cannot be written to.
2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] at org.apache.hadoop.dfs.FSDataset.writeToBlock(FSDataset.java:490)
2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:734)
2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:563)
2007-08-19T20:50:37+00:00 [192.168.165.136] [user.err] at java.lang.Thread.run(Thread.java:595)

which cannot be good. We are still running version 0.10.1 (I know - we need to upgrade!) ...but could someone please explain what this really means?

The node was still listed as in service, and an fsck also confirmed that HDFS was OK.

cheers
--
Torsten
