Hi, I have 20 threads that continuously put files into HDFS; each file is 2 KB. Once 1 million files have been written, the client starts throwing a "could not complete file" exception continuously. At that point, the datanode hangs.
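For anyone trying to reproduce this, the write pattern described above can be sketched roughly as follows. This is only a minimal illustration, not the actual client: `put_file` and `run_workload` are hypothetical names, and `put_file` writes to a local directory as a stand-in for the real HDFS write call, which would need to be swapped in.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

FILE_SIZE = 2 * 1024   # 2 KB per file, as in the description
NUM_THREADS = 20       # 20 writer threads, as in the description


def put_file(target_dir, index, payload):
    """Hypothetical stand-in for the HDFS write; here it writes a local file."""
    path = os.path.join(target_dir, f"file-{index:07d}.dat")
    with open(path, "wb") as f:
        f.write(payload)
    return path


def run_workload(target_dir, num_files, num_threads=NUM_THREADS):
    """Write num_files small files from num_threads threads.

    Returns (number of successful writes, list of (index, exception)).
    """
    payload = b"x" * FILE_SIZE
    failures = []

    def task(i):
        try:
            put_file(target_dir, i, payload)
        except OSError as exc:
            # Record the failure instead of aborting the whole run.
            failures.append((i, exc))

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Consume the iterator so all tasks finish before returning.
        list(pool.map(task, range(num_files)))
    return num_files - len(failures), failures


if __name__ == "__main__":
    # Small local run; the real case used 1 million files against HDFS.
    out_dir = tempfile.mkdtemp()
    ok, failed = run_workload(out_dir, 1000)
    print(ok, len(failed))
```

Recording the index of each failed write makes it easier to see at which file count the exceptions begin.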
I suspect the heartbeat is being lost, so the namenode no longer knows the state of the datanode. But I do not know why the heartbeat would be lost. Is there anything in the logs that would show when a datanode fails to send its heartbeat? Thanks and regards! bourne