[
https://issues.apache.org/jira/browse/HADOOP-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doug Cutting updated HADOOP-994:
--------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)
I committed this. Thanks, Dhruba!
> DFS Scalability : a BlockReport that returns a large number of
> blocks-to-be-deleted causes the datanode to lose connectivity to the namenode
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-994
> URL: https://issues.apache.org/jira/browse/HADOOP-994
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assigned To: dhruba borthakur
> Fix For: 0.12.0
>
> Attachments: blockReportInvalidateBlock.patch
>
>
> The Datanode periodically invokes a block report RPC to the Namenode. This
> RPC returns the list of blocks that are to be invalidated by the Datanode.
> The Datanode then starts to delete all the corresponding block files. This
> deletion is done by the heartbeat thread in the Datanode. If the number of
> files to be deleted is large, the Datanode stops sending heartbeats for the
> entire duration of the deletion. The Namenode then declares the Datanode
> "dead" and starts re-replicating its blocks.
> In the case I observed, the block report returned 1669 blocks to be
> invalidated. The Datanode was running on a RAID5 ext3 filesystem with 4
> active tasks running on it. Deleting these 1669 files took about 30
> minutes! The average disk service time during this period was less than
> 10 ms, and the Datanode was using about 30% CPU.
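> One way to avoid this is to take the file deletions off the heartbeat
> thread entirely. The sketch below (hypothetical names, not necessarily
> what blockReportInvalidateBlock.patch does) shows the general idea: the
> heartbeat thread only enqueues the blocks to delete and returns
> immediately, while a background thread performs the slow disk I/O, so
> heartbeats keep flowing even when the invalidate list is large:
>
> import java.io.File;
> import java.util.List;
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.LinkedBlockingQueue;
>
> // Sketch only: decouple block deletion from the heartbeat loop so a
> // large invalidate list cannot starve heartbeats. All names here are
> // hypothetical, not the actual Datanode classes.
> public class AsyncBlockDeleter {
>
>   // Block files handed over by the heartbeat thread.
>   private final BlockingQueue<File> pendingDeletes =
>       new LinkedBlockingQueue<File>();
>
>   public AsyncBlockDeleter() {
>     // A dedicated daemon thread drains the queue and does the disk I/O,
>     // leaving the heartbeat thread free to keep pinging the Namenode.
>     Thread worker = new Thread(new Runnable() {
>       public void run() { drainQueue(); }
>     }, "block-deleter");
>     worker.setDaemon(true);
>     worker.start();
>   }
>
>   // Called from the heartbeat thread with the blocks the Namenode asked
>   // us to invalidate; returns immediately without touching the disk.
>   public void scheduleDeletion(List<File> blockFiles) {
>     pendingDeletes.addAll(blockFiles);
>   }
>
>   private void drainQueue() {
>     while (true) {
>       try {
>         File blockFile = pendingDeletes.take(); // waits for work
>         if (!blockFile.delete()) {
>           System.err.println("could not delete " + blockFile);
>         }
>       } catch (InterruptedException e) {
>         return; // shut down quietly on interrupt
>       }
>     }
>   }
> }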