[ 
https://issues.apache.org/jira/browse/HDFS-6527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee updated HDFS-6527:
-----------------------------

    Status: Open  (was: Patch Available)

> Edit log corruption due to deferred INode removal
> -------------------------------------------------
>
>                 Key: HDFS-6527
>                 URL: https://issues.apache.org/jira/browse/HDFS-6527
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.4.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Blocker
>         Attachments: HDFS-6527.branch-2.4.patch, HDFS-6527.trunk.patch, 
> HDFS-6527.v2.patch
>
>
> We have seen an SBN crash with the following error:
> {panel}
> \[Edit log tailer\] ERROR namenode.FSEditLogLoader:
> Encountered exception on operation AddBlockOp
> [path=/xxx,
> penultimateBlock=NULL, lastBlock=blk_111_111, RpcClientId=,
> RpcCallId=-2]
> java.io.FileNotFoundException: File does not exist: /xxx
> {panel}
> This was caused by the deferred removal of deleted inodes from the inode map.
> getAdditionalBlock() takes the FSN read lock and then, separately, the write
> lock, so a deletion can happen in between. Because the deleted inode is
> removed from the inode map later, outside the FSN write lock,
> getAdditionalBlock() can still get the deleted inode from the inode map while
> holding the FSN write lock. This allows a block to be added to a deleted file.
> As a result, the edit log will contain OP_ADD, OP_DELETE, followed by
> OP_ADD_BLOCK. The NN cannot replay this sequence, so the NN fails to start up
> or the SBN crashes.
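
The interleaving described above can be reproduced in a small single-threaded
sketch. This is not HDFS code: DeferredRemovalSketch, Inode, namespace,
pendingRemovals and the recheck flag are hypothetical names chosen for
illustration, and the guard shown is only one plausible way to close the
window, not necessarily what the attached patches do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of a namespace whose inode map is cleaned up lazily.
class DeferredRemovalSketch {
    static final class Inode {
        final long id;
        final String path;
        Inode(long id, String path) { this.id = id; this.path = path; }
    }

    final ReentrantReadWriteLock fsnLock = new ReentrantReadWriteLock();
    final Map<String, Inode> namespace = new HashMap<>();  // path -> inode
    final Map<Long, Inode> inodeMap = new HashMap<>();     // id -> inode
    final List<Long> pendingRemovals = new ArrayList<>();  // deferred cleanup
    final List<String> editLog = new ArrayList<>();

    void create(String path, long id) {
        fsnLock.writeLock().lock();
        try {
            Inode inode = new Inode(id, path);
            namespace.put(path, inode);
            inodeMap.put(id, inode);
            editLog.add("OP_ADD " + path);
        } finally {
            fsnLock.writeLock().unlock();
        }
    }

    void delete(String path) {
        fsnLock.writeLock().lock();
        try {
            Inode inode = namespace.remove(path);
            if (inode == null) {
                return;
            }
            // The inode map entry is NOT removed here; it is only queued.
            pendingRemovals.add(inode.id);
            editLog.add("OP_DELETE " + path);
        } finally {
            fsnLock.writeLock().unlock();
        }
    }

    // Runs later, outside the write lock in the scenario being modelled.
    void flushDeferredRemovals() {
        for (Long id : pendingRemovals) {
            inodeMap.remove(id);
        }
        pendingRemovals.clear();
    }

    // Second half of a two-phase "add block": the caller resolved the inode id
    // earlier under the read lock, then re-enters under the write lock.
    boolean addBlock(long inodeId, boolean recheck) {
        fsnLock.writeLock().lock();
        try {
            Inode inode = inodeMap.get(inodeId);
            if (inode == null) {
                return false;
            }
            // Guard: the inode must still be reachable from the namespace.
            if (recheck && namespace.get(inode.path) != inode) {
                return false;
            }
            editLog.add("OP_ADD_BLOCK " + inode.path);
            return true;
        } finally {
            fsnLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        DeferredRemovalSketch fsn = new DeferredRemovalSketch();
        fsn.create("/xxx", 1L);
        fsn.delete("/xxx");          // happens between the two lock phases
        fsn.addBlock(1L, false);     // stale inode is still in the inode map
        fsn.flushDeferredRemovals(); // cleanup arrives too late
        System.out.println(fsn.editLog);
    }
}
```

Without the recheck, the stale inode map entry lets the block addition be
logged after the delete. With recheck set to true, the same interleaving is
rejected because the inode is no longer reachable from the namespace.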



--
This message was sent by Atlassian JIRA
(v6.2#6252)
