[ https://issues.apache.org/jira/browse/HDFS-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Nauroth resolved HDFS-7815.
---------------------------------
    Resolution: Duplicate

> Loop on 'blocks does not belong to any file'
> --------------------------------------------
>
>                 Key: HDFS-7815
>                 URL: https://issues.apache.org/jira/browse/HDFS-7815
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: small cluster on Red Hat. 2 namenodes (HA), 6 datanodes
> with 19TB disk for hdfs.
>            Reporter: Frode Halvorsen
>
> I am currently experiencing a looping situation:
> The namenode takes approx. 1:50 (min:sec) to log a massive number of lines
> stating that some blocks don't belong to any file. During this time, it is
> unresponsive to any requests from datanodes, and if ZooKeeper had been
> running, it would have taken the namenode down (ssh-fencing: kill).
> When it has finished the 'round', it starts to do some normal work, and among
> other things tells the datanode to delete the blocks. But before the
> datanode has gotten around to deleting the blocks and is about to report back
> to the namenode, the namenode has started on the next round of reporting the
> same blocks that don't belong to any file. Thus, the datanode gets a timeout
> when reporting block updates for the deleted blocks. And this, of course,
> repeats itself over and over again...
> There are actually two issues, I think:
> 1 - the namenode gets totally unresponsive when reporting the blocks (could
> this be a DEBUG line instead of an INFO line?)
> 2 - the namenode seems to 'forget' that it has already reported those blocks
> just 2-3 minutes ago...
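
The two points raised in the report can be illustrated with a small, purely hypothetical sketch. It is not HDFS code and does not reflect how the namenode is implemented; the class `OrphanBlockTracker`, its methods, and the logger name are all invented for illustration. It assumes (1) that per-block messages are logged at DEBUG with a single INFO summary per report, and (2) that blocks already scheduled for deletion are remembered until the datanode confirms the delete, so a repeated report does not log them again.

```python
import logging

# Hypothetical logger name; not an actual Hadoop logger.
logger = logging.getLogger("orphan_block_sketch")


class OrphanBlockTracker:
    """Illustrative helper (an assumption, not part of HDFS)."""

    def __init__(self):
        # Blocks already scheduled for deletion but not yet confirmed deleted.
        self._pending_deletion = set()

    def process_report(self, reported_blocks, known_blocks):
        """Return the orphan blocks that have not been seen in an earlier report."""
        new_orphans = [
            block
            for block in reported_blocks
            if block not in known_blocks and block not in self._pending_deletion
        ]
        # Per-block detail goes to DEBUG so a large report does not flood the log.
        for block in new_orphans:
            logger.debug("block %s does not belong to any file", block)
        self._pending_deletion.update(new_orphans)
        # One INFO summary line per report instead of one line per block.
        if new_orphans:
            logger.info(
                "%d newly reported blocks do not belong to any file",
                len(new_orphans),
            )
        return new_orphans

    def confirm_deleted(self, blocks):
        """Forget blocks once the datanode reports that they are deleted."""
        self._pending_deletion.difference_update(blocks)
```

With this sketch, a second identical block report produces no new log lines and no new deletion requests until `confirm_deleted` has been called for the blocks in question.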