[ 
https://issues.apache.org/jira/browse/HDFS-15875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shilun Fan updated HDFS-15875:
------------------------------
         Component/s: datanode
                      fs
                      namanode
        Hadoop Flags: Reviewed
    Target Version/s: 3.2.3, 3.3.1, 3.4.0

> Check whether file is being truncated before truncate
> -----------------------------------------------------
>
>                 Key: HDFS-15875
>                 URL: https://issues.apache.org/jira/browse/HDFS-15875
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, fs, namanode
>    Affects Versions: 3.3.0, 3.1.4, 3.2.2
>            Reporter: Hui Fei
>            Assignee: Hui Fei
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1, 3.4.0, 3.2.3
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> We ran into the following problem:
>  * Job A sends a truncate to the namenode, and block recovery starts.
>  * DataNode D times out (60s) while connecting to another datanode, so block 
> recovery takes more than 60s.
>  * Job A fails, then job B starts and sends a truncate to the namenode. A new 
> recoveryId is generated during lease recovery.
>  * DataNode D calls commitBlockSynchronization and gets the error "does not 
> match current recovery id".
> As a result, the truncate never completes. DataNode D has a replica with the 
> new length, while the two other datanodes have replicas with the old length.
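
The sequence above can be modelled with a small toy sketch. It is not HDFS code: the class `TruncateRaceSketch`, its fields, and the `rejectConcurrentTruncate` flag are hypothetical names invented for illustration, and the guard is only one plausible reading of the issue title ("check whether file is being truncated before truncate"). Only the method name `commitBlockSynchronization` and the "does not match current recovery id" message come from the report.

```java
import java.io.IOException;

/** Toy model of the recovery-id race described in the report; not HDFS code. */
class TruncateRaceSketch {
    private long currentRecoveryId = 0;     // id of the recovery in flight (hypothetical field)
    private boolean underRecovery = false;  // true while a truncate recovery is pending
    private long nextId = 1000;             // arbitrary starting id for the sketch
    private final boolean rejectConcurrentTruncate;

    TruncateRaceSketch(boolean rejectConcurrentTruncate) {
        this.rejectConcurrentTruncate = rejectConcurrentTruncate;
    }

    /** Starts a truncate and returns the recovery id handed to the datanode. */
    long truncate() throws IOException {
        // Assumed guard: refuse a second truncate while one is still recovering.
        if (rejectConcurrentTruncate && underRecovery) {
            throw new IOException("file is already being truncated");
        }
        underRecovery = true;
        currentRecoveryId = nextId++;
        return currentRecoveryId;
    }

    /** The datanode reports a finished recovery; a stale id is refused. */
    void commitBlockSynchronization(long recoveryId) throws IOException {
        if (recoveryId != currentRecoveryId) {
            throw new IOException("recovery id " + recoveryId
                + " does not match current recovery id " + currentRecoveryId);
        }
        underRecovery = false;
    }

    boolean isUnderRecovery() {
        return underRecovery;
    }

    public static void main(String[] args) throws IOException {
        // Without the guard: job B's truncate replaces the id while D is still recovering.
        TruncateRaceSketch unguarded = new TruncateRaceSketch(false);
        long idFromJobA = unguarded.truncate();
        unguarded.truncate(); // job B
        try {
            unguarded.commitBlockSynchronization(idFromJobA);
        } catch (IOException e) {
            System.out.println("unguarded: " + e.getMessage());
        }

        // With the guard: job B is rejected, so D's commit still matches.
        TruncateRaceSketch guarded = new TruncateRaceSketch(true);
        long id = guarded.truncate();
        try {
            guarded.truncate(); // job B
        } catch (IOException e) {
            System.out.println("guarded: " + e.getMessage());
        }
        guarded.commitBlockSynchronization(id);
        System.out.println("guarded under recovery: " + guarded.isUnderRecovery());
    }
}
```

In the unguarded run the late commit from DataNode D is refused and the file stays under recovery, which mirrors the stuck truncate reported here. In the guarded run the second truncate fails fast and the first recovery can still commit.
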
> The DN logs the error "Inconsistent size of finalized replicas".
> The related code is in BlockRecoveryWorker.java:
> {code}
> for (BlockRecord r : syncList) {
>   assert r.rInfo.getNumBytes() > 0 : "zero length replica";
>   ReplicaState rState = r.rInfo.getOriginalReplicaState();
>   if (rState.getValue() < bestState.getValue()) {
>     bestState = rState;
>   }
>   if (rState == ReplicaState.FINALIZED) {
>     if (finalizedLength > 0 && finalizedLength != r.rInfo.getNumBytes()) {
>       throw new IOException("Inconsistent size of finalized replicas. " +
>           "Replica " + r.rInfo + " expected size: " + finalizedLength);
>     }
>     finalizedLength = r.rInfo.getNumBytes();
>   }
> }
> {code}
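
To see why the check in the quoted loop fires in this scenario, here is a minimal standalone sketch of the same comparison. The `Replica` class, the datanode names, the byte counts, and the initial value of `finalizedLength` are assumptions made for illustration; only the comparison itself and the error text follow the snippet quoted above.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

/** Standalone sketch of the finalized-length check; not the Hadoop class. */
class FinalizedLengthCheckSketch {

    /** Hypothetical, simplified stand-in for a replica record. */
    static final class Replica {
        final String datanode;
        final boolean finalized;
        final long numBytes;

        Replica(String datanode, boolean finalized, long numBytes) {
            this.datanode = datanode;
            this.finalized = finalized;
            this.numBytes = numBytes;
        }

        @Override
        public String toString() {
            return datanode + "(" + numBytes + " bytes)";
        }
    }

    /** Returns the common length of all finalized replicas, or throws if they differ. */
    static long checkFinalizedLength(List<Replica> syncList) throws IOException {
        long finalizedLength = -1; // assumed initial value: no finalized replica seen yet
        for (Replica r : syncList) {
            if (!r.finalized) {
                continue; // only finalized replicas take part in the comparison
            }
            if (finalizedLength > 0 && finalizedLength != r.numBytes) {
                throw new IOException("Inconsistent size of finalized replicas. "
                    + "Replica " + r + " expected size: " + finalizedLength);
            }
            finalizedLength = r.numBytes;
        }
        return finalizedLength;
    }

    public static void main(String[] args) {
        // D already holds the truncated (new) length; the two others hold the old length.
        List<Replica> syncList = Arrays.asList(
            new Replica("D", true, 512L),
            new Replica("E", true, 1024L),
            new Replica("F", true, 1024L));
        try {
            checkFinalizedLength(syncList);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Because one finalized replica already has the new length while the others still have the old one, every later recovery attempt over this replica set hits the same exception, which is consistent with the truncate never completing.
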



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
