[ https://issues.apache.org/jira/browse/HDFS-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13239737#comment-13239737 ]

Uma Maheswara Rao G commented on HDFS-3119:
-------------------------------------------

The actual problem is that we set the replication factor down from 2 to 1 and 
then close the file.

If the complete call succeeds with the minimum replication factor of 1, and the 
other DN's addStoredBlock request arrives only after that, then that call can 
process the over-replicated blocks, because the file will already have moved 
from INodeFileUnderConstruction to a finalized inode.

The other case is: if the complete call succeeds with 2 addStoredBlock requests 
arriving immediately before the INodeFileUnderConstruction is moved to a 
finalized inode, then no one is left to process the over-replicated blocks.

I feel the solution for this problem is that we have to add an over-replication 
check in the BlockManager#checkReplication method, which is invoked when a file 
is completed.

The current code checks only neededReplications:
{code}
 public void checkReplication(Block block, int numExpectedReplicas) {
    // filter out containingNodes that are marked for decommission.
    NumberReplicas number = countNodes(block);
    if (isNeededReplication(block, numExpectedReplicas, number.liveReplicas())) {
      neededReplications.add(block,
                             number.liveReplicas(),
                             number.decommissionedReplicas(),
                             numExpectedReplicas);
    }
  }
{code}
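
For illustration, the change might look roughly like the following. This is 
only a sketch, not a tested patch; it assumes the existing 
BlockManager#processOverReplicatedBlock can safely be invoked here with null 
node hints.

{code}
 public void checkReplication(Block block, int numExpectedReplicas) {
    // filter out containingNodes that are marked for decommission.
    NumberReplicas number = countNodes(block);
    if (isNeededReplication(block, numExpectedReplicas, number.liveReplicas())) {
      neededReplications.add(block,
                             number.liveReplicas(),
                             number.decommissionedReplicas(),
                             numExpectedReplicas);
    } else if (number.liveReplicas() > numExpectedReplicas) {
      // sketch: also schedule deletion of excess replicas, so that a file
      // completed after its replication factor was reduced gets trimmed
      processOverReplicatedBlock(block, (short) numExpectedReplicas, null, null);
    }
  }
{code}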
                
> Overreplicated block is not deleted even after the replication factor is 
> reduced after sync followed by closing that file
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3119
>                 URL: https://issues.apache.org/jira/browse/HDFS-3119
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.24.0
>            Reporter: J.Andreina
>            Priority: Minor
>             Fix For: 0.24.0, 0.23.2
>
>
> cluster setup:
> --------------
> 1 NN, 2 DN, replication factor 2, block report interval 3 sec, block size 256 MB
> step 1: write a file "filewrite.txt" of size 90 bytes with sync (not closed) 
> step 2: change the replication factor to 1 using the command: "./hdfs dfs 
> -setrep 1 /filewrite.txt"
> step 3: close the file (approximated by the test sketch after this 
> description)
> * At the NN side, the log "Decreasing replication from 2 to 1 for 
> /filewrite.txt" occurred, but the over-replicated blocks are not deleted 
> even after the block report is sent from the DN
> * While listing the file in the console using "./hdfs dfs -ls", the 
> replication factor for that file is shown as 1
> * The fsck report for that file displays that the file is replicated to 2 
> datanodes
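
For reference, the reproduction steps above can be approximated in a unit test 
along the following lines. This is only a sketch, not part of the original 
report: it assumes a MiniDFSCluster with default configuration and uses 
hflush() for the sync in step 1.

{code}
Configuration conf = new HdfsConfiguration();
MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf).numDataNodes(2).build();
try {
  FileSystem fs = cluster.getFileSystem();
  Path file = new Path("/filewrite.txt");

  // step 1: write 90 bytes with replication factor 2 and sync,
  // keeping the file open
  FSDataOutputStream out = fs.create(file, (short) 2);
  out.write(new byte[90]);
  out.hflush();

  // step 2: reduce the replication factor while the file is still
  // under construction
  fs.setReplication(file, (short) 1);

  // step 3: close the file; the excess replica should then be scheduled
  // for deletion, which is what this issue reports does not happen
  out.close();
} finally {
  cluster.shutdown();
}
{code}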

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
