[ 
https://issues.apache.org/jira/browse/HDFS-7815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Nauroth resolved HDFS-7815.
---------------------------------
    Resolution: Duplicate

> Loop on 'blocks does not belong to any file'
> --------------------------------------------
>
>                 Key: HDFS-7815
>                 URL: https://issues.apache.org/jira/browse/HDFS-7815
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>         Environment: small cluster on Red Hat. 2 namenodes (HA), 6 datanodes 
> with 19TB disk for HDFS.
>            Reporter: Frode Halvorsen
>
> I am currently experiencing a looping situation:
> The namenode takes approx. 1:50 (min:sec) to log a massive number of lines 
> stating that some blocks don't belong to any file. During this time, it is 
> unresponsive to any requests from datanodes, and if ZooKeeper had been 
> running, it would have taken the namenode down (ssh-fencing: kill).
> When it has finished the 'round', it starts to do some normal work, which 
> includes telling the datanode to delete the blocks. But before the 
> datanode has gotten around to deleting the blocks, and is about to report back 
> to the namenode, the namenode has started on the next round of reporting the 
> same blocks that don't belong to any file. Thus, the datanode gets a timeout 
> when reporting block updates for the deleted blocks. And this, of course, 
> repeats itself over and over again...
> There are actually two issues, I think:
> 1 - the namenode becomes totally unresponsive when reporting the blocks (could 
> this be a DEBUG line instead of an INFO line?)
> 2 - the namenode seems to 'forget' that it has already reported those blocks 
> just 2-3 minutes ago...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
