[
https://issues.apache.org/jira/browse/HADOOP-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Marco Nicosia updated HADOOP-1297:
----------------------------------
Please include this in 0.12.4
> datanode sending block reports to namenode once every second
> ------------------------------------------------------------
>
> Key: HADOOP-1297
> URL: https://issues.apache.org/jira/browse/HADOOP-1297
> Project: Hadoop
> Issue Type: Bug
> Components: dfs
> Reporter: dhruba borthakur
> Assigned To: dhruba borthakur
> Fix For: 0.13.0
>
> Attachments: datanodeDeleteBlocks2.patch
>
>
> The namenode is requesting a block to be deleted. The datanode tries this
> operation and encounters an error because the block is not in the blockMap.
> The processCommand() method raises an exception. The code is such that the
> variable lastBlockReport is not set if processCommand() raises an exception.
> This means that the datanode immediately sends another block report to the
> namenode, which eats up quite a bit of CPU on the namenode.
> In short, the above condition causes the datanode to send blockReports almost
> once every second!
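> To see why, here is a minimal sketch of the timing logic in
> DataNode.offerService (simplified; blockReportInterval and the exact loop
> structure are assumptions, not the actual code):
>
>        // Runs on every iteration of the offerService main loop.
>        long now = System.currentTimeMillis();
>        if (now - lastBlockReport > blockReportInterval) {
>            DatanodeCommand cmd = namenode.blockReport(dnRegistration,
>                                                       data.getBlockReport());
>            processCommand(cmd);   // if this throws, the next line is skipped
>            lastBlockReport = now; // never runs, so the check above fires
>                                   // again on the very next iteration
>        }
>
> Because lastBlockReport keeps its old value, the elapsed-time check stays
> true and the datanode re-sends the full block report on every pass.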
> I propose that we do the following:
> 1. In DataNode.offerService, replace the following piece of code
>
>        DatanodeCommand cmd = namenode.blockReport(dnRegistration,
>                                                   data.getBlockReport());
>        processCommand(cmd);
>        lastBlockReport = now;
>
>    with
>
>        DatanodeCommand cmd = namenode.blockReport(dnRegistration,
>                                                   data.getBlockReport());
>        lastBlockReport = now;
>        processCommand(cmd);
>
>    so that lastBlockReport is updated even if processCommand() throws.
> 2. In FSDataSet.invalidate:
>    a) continue to process all blocks in invalidBlks[] even if one in the
>       middle encounters a problem.
>    b) if getFile() returns null, still invoke volumeMap.get() and print
>       whether the block was found in the volumes or not. The volumeMap is
>       used to generate the block report, so this might help in debugging.
> A sketch of both changes appears below.
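> The following is illustrative only, not the attached patch; the FSVolume
> type, the LOG.warn calls, and the file-deletion details are assumptions:
>
>        public void invalidate(Block[] invalidBlks) throws IOException {
>            boolean error = false;
>            for (int i = 0; i < invalidBlks.length; i++) {
>                File f = getFile(invalidBlks[i]);
>                if (f == null) {
>                    // (b) consult volumeMap even though the file lookup
>                    // failed, and record whether any volume knows the block.
>                    FSVolume v = volumeMap.get(invalidBlks[i]);
>                    DataNode.LOG.warn("Error deleting block " + invalidBlks[i]
>                        + ": file not found; volumeMap "
>                        + (v != null ? "contains" : "does not contain")
>                        + " the block");
>                    error = true;
>                    continue; // (a) keep going instead of aborting the batch
>                }
>                if (!f.delete()) {
>                    DataNode.LOG.warn("Error deleting block " + invalidBlks[i]
>                        + " at file " + f);
>                    error = true;
>                    continue;
>                }
>                volumeMap.remove(invalidBlks[i]);
>            }
>            if (error) {
>                throw new IOException("Error in deleting blocks.");
>            }
>        }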
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.