[ https://issues.apache.org/jira/browse/HADOOP-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-994:
------------------------------------

    Status: Patch Available  (was: Open)

Code reviewed by Milind.

> DFS Scalability: a BlockReport that returns a large number of 
> blocks-to-be-deleted causes the datanode to lose connectivity to the namenode
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-994
>                 URL: https://issues.apache.org/jira/browse/HADOOP-994
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>         Attachments: blockReportInvalidateBlock.patch
>
>
> The Datanode periodically invokes a block report RPC to the Namenode. The 
> reply to this RPC lists the blocks that the Datanode is to invalidate. 
> The Datanode then starts deleting all the corresponding files. This block 
> deletion is done by the heartbeat thread in the Datanode, so while a large 
> batch of files is being deleted, the Datanode stops sending heartbeats for 
> the entire duration. The Namenode then declares the Datanode "dead" and 
> starts re-replicating its blocks.
> In my observed case, the block report reply named 1669 blocks to be 
> invalidated. The Datanode was running on a RAID5 ext3 filesystem and had 4 
> active tasks running on it. Deleting these 1669 files took about 30 
> minutes, roughly 1.1 seconds per file, even though the average disk service 
> time during this period was under 10 ms and the Datanode was using only 
> about 30% CPU.
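
For illustration only: one generic way to keep heartbeats flowing while a
large delete batch drains is to hand the files to a dedicated deletion
thread, so the heartbeat thread only enqueues work and returns immediately.
The sketch below is hypothetical Java; the class and method names are
invented for this example and are not taken from the attached patch.

    import java.io.File;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class AsyncBlockDeleter implements Runnable {
        private final BlockingQueue<File> pending =
            new LinkedBlockingQueue<File>();

        // Called from the heartbeat thread: enqueue the file and return
        // immediately so the next heartbeat is never delayed.
        public void scheduleDeletion(File blockFile) {
            pending.offer(blockFile);
        }

        // Runs on its own daemon thread, draining the queue one file at
        // a time; the unlink cost is paid here, off the heartbeat path.
        public void run() {
            try {
                while (true) {
                    File f = pending.take();   // blocks until work arrives
                    if (!f.delete()) {
                        System.err.println("could not delete " + f.getPath());
                    }
                }
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();  // exit cleanly
            }
        }

        public static void main(String[] args) {
            AsyncBlockDeleter deleter = new AsyncBlockDeleter();
            Thread t = new Thread(deleter, "blockDeleter");
            t.setDaemon(true);
            t.start();
            // the heartbeat thread would call deleter.scheduleDeletion(...)
            // for each file named in the block report reply
        }
    }

With this arrangement the heartbeat thread's cost per invalidated block is
one queue insert, so even a reply naming thousands of blocks cannot stall
heartbeats long enough for the Namenode to declare the node dead.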

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
