[ https://issues.apache.org/jira/browse/HADOOP-1349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hairong Kuang updated HADOOP-1349:
----------------------------------

    Attachment: blockInvalidate.patch

This patch makes sure that blocks are removed from the namespace only after they are removed from a datanode. It adds a new data structure to FSNamesystem, pendingDeleteSets, which keeps track of all the blocks that have been deleted on datanodes but have not yet been removed from the namespace. Functionally it makes 4 changes:
1. invalidateBlock does not remove a block from the namespace.
2. When processing a heartbeat, if the namenode instructs the datanode to remove blocks, all of these blocks are moved to pendingDeleteSets.
3. When ReplicationMonitor, the background computation thread, wakes up, it removes any blocks in pendingDeleteSets from the namespace.
4. This patch exposed a bug in the ChecksumException handling. Currently the code calls seekToNewSource to select a different replica, but a subsequent seek/read still tried to select a replica, and sometimes it happened to pick the problematic one. This patch makes sure that a seek/read following seekToNewSource does not select a new source.

> Corrupted blocks get deleted but not replicated
> -----------------------------------------------
>
>                 Key: HADOOP-1349
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1349
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: Hairong Kuang
>         Assigned To: Hairong Kuang
>             Fix For: 0.14.0
>
>         Attachments: blockInvalidate.patch
>
>
> When I test the patch to HADOOP-1345 on a two-node dfs cluster, I see that dfs correctly deletes the corrupted replica and successfully retries reading from the other correct replica, but the block does not get replicated. The block remains with only 1 replica until the next block report comes in.
> In my test case, since the dfs cluster has only 2 datanodes, the target of replication is the same as the target of block invalidation. After poking through the logs, I found out that the namenode sent the replication request before the block invalidation request.
> This is because the namenode does not handle block invalidation well. In FSNamesystem.invalidateBlock, it first puts the invalidate request in a queue and then immediately removes the replica from its state, which triggers choosing a replication target for the block. When requests are sent back to the target datanode as a reply to a heartbeat message, replication requests have higher priority than invalidate requests.
> This problem could be solved if the namenode removes an invalidated replica from its state only after the invalidate request is sent to the datanode.
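To make the intended ordering concrete, here is a minimal, self-contained Java sketch of the scheme described in the patch comment above. It is not the patch code: apart from pendingDeleteSets, every class, field, and method name below (PendingDeleteSketch, Block, blocksMap, invalidateSets, processHeartbeat, replicationMonitorPass) is a hypothetical stand-in, and the real FSNamesystem logic is far more involved.

{code:java}
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/**
 * Toy model of the ordering fix: a replica is dropped from the namespace only
 * after its delete command has been handed to the datanode on a heartbeat.
 */
public class PendingDeleteSketch {

    static final class Block {
        final long id;
        Block(long id) { this.id = id; }
        @Override public String toString() { return "blk_" + id; }
    }

    /** Namespace view: block -> datanodes believed to hold a replica. */
    private final Map<Block, Set<String>> blocksMap = new HashMap<>();

    /** Invalidation requests queued per datanode, delivered on heartbeat. */
    private final Map<String, List<Block>> invalidateSets = new HashMap<>();

    /** Blocks whose delete command has been sent to a datanode but whose
     *  replica has not yet been removed from the namespace. */
    private final Map<String, List<Block>> pendingDeleteSets = new HashMap<>();

    void addReplica(Block b, String datanode) {
        blocksMap.computeIfAbsent(b, k -> new HashSet<>()).add(datanode);
    }

    /** Change 1: queue the invalidation but leave the namespace untouched,
     *  so no premature re-replication is triggered. */
    void invalidateBlock(Block b, String datanode) {
        invalidateSets.computeIfAbsent(datanode, d -> new ArrayList<>()).add(b);
    }

    /** Change 2: a heartbeat reply carries the delete command; the affected
     *  blocks move from the invalidate queue to pendingDeleteSets. */
    List<Block> processHeartbeat(String datanode) {
        List<Block> toDelete = invalidateSets.remove(datanode);
        if (toDelete == null) return Collections.emptyList();
        pendingDeleteSets.computeIfAbsent(datanode, d -> new ArrayList<>()).addAll(toDelete);
        return toDelete; // sent back to the datanode for deletion
    }

    /** Change 3: the background monitor removes the replicas from the
     *  namespace only after the delete command has already gone out. */
    void replicationMonitorPass() {
        for (Map.Entry<String, List<Block>> e : pendingDeleteSets.entrySet()) {
            for (Block b : e.getValue()) {
                Set<String> holders = blocksMap.get(b);
                if (holders != null) holders.remove(e.getKey());
            }
            e.getValue().clear();
        }
    }

    public static void main(String[] args) {
        PendingDeleteSketch ns = new PendingDeleteSketch();
        Block b = new Block(42);
        ns.addReplica(b, "dn1");
        ns.addReplica(b, "dn2");
        ns.invalidateBlock(b, "dn2");            // corrupt replica on dn2
        // The namespace still lists both replicas, so nothing is scheduled
        // for re-replication onto dn2 ahead of the delete command.
        System.out.println("delete command for dn2: " + ns.processHeartbeat("dn2"));
        ns.replicationMonitorPass();             // dn2 dropped from the namespace now
        System.out.println("replicas left: " + ns.blocksMap.get(b));
    }
}
{code}

Running main shows the ordering the patch enforces: the delete command for dn2 goes out with its heartbeat reply first, and only afterwards does the namespace lose dn2's replica, so the invalidation target cannot be chosen as a replication target ahead of the delete.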