[ 
https://issues.apache.org/jira/browse/HDFS-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14177853#comment-14177853
 ] 

Yongjun Zhang commented on HDFS-7235:
-------------------------------------

Hi [~cmccabe],

Thanks again for the review. Please see my answer below.
{quote}
We shouldn't log a message saying that "the block file doesn't exist" if the 
block file exists, but is not finalized.
{quote}
We are not: we only log when the state is FINALIZED but the block file doesn't 
exist.
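In other words, the guard is roughly like this (a simplified sketch, not the actual patch code; the method name is made up for illustration):

{code}
// Simplified sketch of the logging guard (illustrative only, not the
// actual HDFS-7235 patch): log the "block file doesn't exist" message
// only when the replica state is FINALIZED and the file is missing.
public class LogGuardSketch {
  static boolean shouldLogMissingBlockFile(String state, boolean blockFileExists) {
    return "FINALIZED".equals(state) && !blockFileExists;
  }
}
{code}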

{quote}
I also don't see why we need to call FSDatasetSpi#getLength, if we already have 
access to the replica length here.
{quote}
The new fix introduced here handles the special case where {{isValidBlock()}} 
returns false, so I tried to confine the change to that special-handling 
branch. If we removed the pre-existing {{FSDatasetSpi#getLength}} call, we 
would need to move the {{getReplica()}} call out of the false branch. Since 
{{getReplica()}} is marked {{@Deprecated}}, calling it at all is already a bit 
of a hack; in addition, we would need to synchronize the whole block of code, 
so I hope we can limit the impact to within the false branch. I wonder if this 
explanation makes sense to you.
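To illustrate what I mean by confining the change (a toy sketch with invented class and method names, not the real FsDatasetImpl code):

{code}
// Toy sketch (class and method names are invented for illustration): the
// pre-existing getLength() call stays on the normal path, and the new
// handling is kept inside the branch where isValidBlock() returns false.
public class TransferCheckSketch {
  static class Dataset {
    private final boolean valid;
    private final boolean fileExists;
    Dataset(boolean valid, boolean fileExists) {
      this.valid = valid;
      this.fileExists = fileExists;
    }
    boolean isValidBlock() { return valid; }
    boolean blockFileExists() { return fileExists; }
    long getLength() { return 1024L; }  // stands in for FSDatasetSpi#getLength
  }

  /** Returns a problem description, or null when the block can be transferred. */
  static String preTransferCheck(Dataset ds) {
    if (!ds.isValidBlock()) {
      // Special handling limited to this branch: distinguish a missing
      // block file (bad disk) from other invalid states.
      synchronized (ds) {
        if (!ds.blockFileExists()) {
          return "block file does not exist";
        }
      }
      return "replica is not in a valid state";
    }
    return ds.getLength() >= 0 ? null : "negative length";  // unchanged path
  }
}
{code}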

{quote}
I would suggest having your synchronized section set a string named 
replicaProblem. Then if the string is null at the end, there is no problem-- 
otherwise, the problem is contained in replicaProblem. Then you can check 
existence, replica state, and length all at once.
{quote}
I am not sure I follow what you said; will check with you in person.
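If I read the suggestion right, it might look roughly like this (a hedged sketch with illustrative names and states, not the actual patch):

{code}
// Sketch of the suggested replicaProblem pattern (names are illustrative):
// inside one synchronized section, record the first problem found in a
// single String; a null result at the end means the replica is fine.
public class ReplicaProblemSketch {
  static class Replica {
    final String state;        // e.g. "FINALIZED"
    final boolean fileExists;
    final long length;
    Replica(String state, boolean fileExists, long length) {
      this.state = state;
      this.fileExists = fileExists;
      this.length = length;
    }
  }

  private final Object datasetLock = new Object();

  /** Checks existence, state, and length in one pass; null means no problem. */
  String checkReplica(Replica replica, long expectedLength) {
    String replicaProblem = null;
    synchronized (datasetLock) {
      if (replica == null) {
        replicaProblem = "replica not found";
      } else if (!"FINALIZED".equals(replica.state)) {
        replicaProblem = "replica is not finalized: " + replica.state;
      } else if (!replica.fileExists) {
        replicaProblem = "block file does not exist";
      } else if (replica.length < expectedLength) {
        replicaProblem = "replica is too short: " + replica.length;
      }
    }
    return replicaProblem;
  }
}
{code}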
{quote}
We don't even need to call isValidBlock. getReplica gives you all the info you 
need. Please take out this call, since it's unnecessary.
{quote}
{{isValidBlock()}} is an interface method defined in {{FsDatasetSpi}} and 
implemented in derived classes such as {{FsDatasetImpl}} and 
{{SimulatedFSDataset}}, which may implement it differently. It'd be nice to 
stick to the {{FsDatasetSpi}} interface.

Will discuss with you more.

Thanks again.



> Can not decommission DN which has invalid block due to bad disk
> ---------------------------------------------------------------
>
>                 Key: HDFS-7235
>                 URL: https://issues.apache.org/jira/browse/HDFS-7235
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.0
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-7235.001.patch, HDFS-7235.002.patch, 
> HDFS-7235.003.patch
>
>
> When decommissioning a DN, the process hangs. 
> What happens is: when the NN chooses a replica as a source to replicate data 
> on the to-be-decommissioned DN to other DNs, it favors choosing the 
> to-be-decommissioned DN itself as the source of the transfer (see 
> BlockManager.java).  
> However, because of the bad disk, the DN detects the source block to be 
> transferred as an invalid block with the following logic in FsDatasetImpl.java:
> {code}
> /** Does the block exist and have the given state? */
>   private boolean isValid(final ExtendedBlock b, final ReplicaState state) {
>     final ReplicaInfo replicaInfo = volumeMap.get(b.getBlockPoolId(), 
>         b.getLocalBlock());
>     return replicaInfo != null
>         && replicaInfo.getState() == state
>         && replicaInfo.getBlockFile().exists();
>   }
> {code}
> This method returns false (detecting an invalid block) because the block file 
> doesn't exist, due to the bad disk in this case. 
> The key issue we found is that after the DN detects an invalid block for the 
> above reason, it doesn't report the invalid block back to the NN, so the NN 
> doesn't know the block is corrupted and keeps sending the data transfer 
> request to the same to-be-decommissioned DN, again and again. This causes an 
> infinite loop, so the decommission process hangs.
> Thanks [~qwertymaniac] for reporting the issue and initial analysis.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
