[
https://issues.apache.org/jira/browse/HADOOP-994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doug Cutting updated HADOOP-994:
--------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
Status: Resolved (was: Patch Available)
I committed this. Thanks, Dhruba!
> DFS Scalability : a BlockReport that returns a large number of
> blocks-to-be-deleted causes the datanode to lose connectivity to the namenode
> ---------------------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-994
> URL: https://issues.apache.org/jira/browse/HADOOP-994
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assigned To: dhruba borthakur
> Fix For: 0.12.0
>
> Attachments: blockReportInvalidateBlock.patch
>
>
> The Datanode periodically invokes a block report RPC to the Namenode. This
> RPC returns the list of blocks that are to be invalidated by the Datanode.
> The Datanode then starts to delete all the corresponding block files. This
> deletion is done by the heartbeat thread in the Datanode. If the number of
> files to be deleted is large, the Datanode stops sending heartbeats for the
> entire duration of the deletion. The Namenode then declares the Datanode
> "dead" and starts re-replicating its blocks.
> In the case I observed, the block report returned 1669 blocks to be
> invalidated. The Datanode was running on a RAID5 ext3 filesystem with 4
> active tasks running on it. Deleting these 1669 files took about 30
> minutes! The average disk service time during this period was less than
> 10 ms, and the Datanode was using about 30% CPU.
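> One way to avoid this is to take the file deletions off the heartbeat
> thread entirely. The sketch below (hypothetical names, not necessarily
> what blockReportInvalidateBlock.patch does) shows the general idea: the
> heartbeat thread only enqueues the blocks to delete and returns
> immediately, while a background thread performs the slow disk I/O, so
> heartbeats keep flowing even when the invalidate list is large:
>
> import java.io.File;
> import java.util.List;
> import java.util.concurrent.BlockingQueue;
> import java.util.concurrent.LinkedBlockingQueue;
>
> // Sketch only: decouple block deletion from the heartbeat loop so a
> // large invalidate list cannot starve heartbeats. All names here are
> // hypothetical, not the actual Datanode classes.
> public class AsyncBlockDeleter {
>
>   // Block files handed over by the heartbeat thread.
>   private final BlockingQueue<File> pendingDeletes =
>       new LinkedBlockingQueue<File>();
>
>   public AsyncBlockDeleter() {
>     // A dedicated daemon thread drains the queue and does the disk I/O,
>     // leaving the heartbeat thread free to keep pinging the Namenode.
>     Thread worker = new Thread(new Runnable() {
>       public void run() { drainQueue(); }
>     }, "block-deleter");
>     worker.setDaemon(true);
>     worker.start();
>   }
>
>   // Called from the heartbeat thread with the blocks the Namenode asked
>   // us to invalidate; returns immediately without touching the disk.
>   public void scheduleDeletion(List<File> blockFiles) {
>     pendingDeletes.addAll(blockFiles);
>   }
>
>   private void drainQueue() {
>     while (true) {
>       try {
>         File blockFile = pendingDeletes.take(); // waits for work
>         if (!blockFile.delete()) {
>           System.err.println("could not delete " + blockFile);
>         }
>       } catch (InterruptedException e) {
>         return; // shut down quietly on interrupt
>       }
>     }
>   }
> }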