[ 
https://issues.apache.org/jira/browse/HDFS-5283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13795500#comment-13795500
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-5283:
----------------------------------------------

- In FSNamesystem.isInSnapshot(..), we can safely assume that blockUC is 
non-null and store blockUC.getBlockCollection() in a local variable at the very 
beginning.  Also, the assert should be hasReadLock() instead of hasWriteLock(), 
since the method does not write anything.
{code}
//FSNamesystem
  public boolean isInSnapshot(BlockInfoUnderConstruction blockUC) {
    assert hasReadLock();
    final BlockCollection bc = blockUC.getBlockCollection();
    // instanceof is already false for null, so no separate null check is needed
    if (!(bc instanceof INodeFileUnderConstruction)) {
      return false;
    }

    final INodeFileUnderConstruction inodeUC = (INodeFileUnderConstruction) bc;
    ...
  }
{code}
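
To illustrate the locking point, here is a minimal, self-contained sketch of why a read-only method can assert the read lock. It assumes the read-lock check is also satisfied by a thread holding the write lock; LockCheckSketch and readOnlyQuery() are hypothetical names used only for illustration, not the actual FSNamesystem code.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a namesystem guarded by a read/write lock.
class LockCheckSketch {
  private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock(true);

  void readLock() { fsLock.readLock().lock(); }
  void readUnlock() { fsLock.readLock().unlock(); }
  void writeLock() { fsLock.writeLock().lock(); }
  void writeUnlock() { fsLock.writeLock().unlock(); }

  // True only if the current thread holds the write lock.
  boolean hasWriteLock() {
    return fsLock.isWriteLockedByCurrentThread();
  }

  // Assumption: a write-lock holder also passes the read-lock check.
  boolean hasReadLock() {
    return fsLock.getReadHoldCount() > 0 || hasWriteLock();
  }

  // A read-only query only needs the weaker precondition.
  boolean readOnlyQuery() {
    if (!hasReadLock()) {
      throw new IllegalStateException("caller must hold the read or write lock");
    }
    return true;
  }

  public static void main(String[] args) {
    LockCheckSketch ns = new LockCheckSketch();

    ns.readLock();
    try {
      // Passes under the read lock; a hasWriteLock() check would fail here.
      System.out.println("read lock:  query ok = " + ns.readOnlyQuery()
          + ", hasWriteLock = " + ns.hasWriteLock());
    } finally {
      ns.readUnlock();
    }

    ns.writeLock();
    try {
      // Also passes for callers that already hold the write lock.
      System.out.println("write lock: query ok = " + ns.readOnlyQuery());
    } finally {
      ns.writeUnlock();
    }
  }
}
```

Under this assumption, asserting the read lock accepts both read-lock and write-lock callers, whereas asserting the write lock would reject callers that only read.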

- For DFSOutputStream.abort(), it is better to add DFSTestUtil.abortStream(..) 
than to make the method public.  Sorry that I did not catch this earlier.
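
A rough sketch of the test-helper pattern suggested above, under the assumption that the helper lives in the same package as the stream class so it can call the non-public method. StreamSketch and TestUtilSketch are hypothetical stand-ins, not the real DFSOutputStream or DFSTestUtil; they are nested in one class here only to keep the example self-contained.

```java
// Illustrative only: in a real layout the two nested classes would be separate
// top-level classes in the same package (one in main code, one in test code).
class AbortStreamSketch {

  // Hypothetical stand-in for the production stream class.
  static class StreamSketch {
    private boolean aborted;

    // Deliberately NOT public: production callers outside the package
    // cannot abort the stream.
    void abort() {
      aborted = true;
    }

    public boolean isAborted() {
      return aborted;
    }
  }

  // Hypothetical stand-in for the test utility class.
  static final class TestUtilSketch {
    private TestUtilSketch() {
    }

    // Public bridge for tests in other packages; production visibility
    // of abort() stays unchanged.
    public static void abortStream(StreamSketch out) {
      out.abort();
    }
  }

  public static void main(String[] args) {
    StreamSketch out = new StreamSketch();
    TestUtilSketch.abortStream(out);
    System.out.println("aborted = " + out.isAborted());
  }
}
```

The design choice is that only test code gains access to abort(), so the production API surface does not grow.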


> NN not coming out of startup safemode due to under construction blocks only 
> inside snapshots also counted in safemode threshold
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5283
>                 URL: https://issues.apache.org/jira/browse/HDFS-5283
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 3.0.0, 2.1.1-beta
>            Reporter: Vinay
>            Assignee: Vinay
>            Priority: Blocker
>         Attachments: HDFS-5283.000.patch, HDFS-5283.patch, HDFS-5283.patch, 
> HDFS-5283.patch, HDFS-5283.patch
>
>
> This was observed in one of our environments:
> 1. An MR job was running; it had created some temporary files and was 
> writing to them.
> 2. A snapshot was taken.
> 3. The job was killed and the temporary files were deleted.
> 4. The Namenode was restarted.
> 5. After the restart, the Namenode stayed in safemode, waiting for blocks.
> Analysis
> ---------
> 1. The snapshot included the temporary files that were open at the time, 
> and the original files were deleted later.
> 2. The UnderConstruction block count was taken from the leases; UC blocks 
> that exist only inside snapshots were not considered.
> 3. So the safemode threshold count was too high, and the NN did not come out 
> of safemode.
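
The analysis above can be illustrated with a small arithmetic sketch. It assumes the threshold is computed as the expected block total multiplied by a threshold percentage, and that under-construction blocks are never reported as safe; the class name, method name, and all numbers are hypothetical and are not taken from the issue or from the NameNode code.

```java
// Hypothetical model of the safemode exit check; not the actual NameNode code.
class SafeModeThresholdSketch {

  // Assumption: safemode can end once the number of safely reported blocks
  // reaches (expected block total * threshold percentage), truncated.
  static boolean canLeaveSafeMode(long blockSafe, long blockTotal, double thresholdPct) {
    long blockThreshold = (long) (blockTotal * thresholdPct);
    return blockSafe >= blockThreshold;
  }

  public static void main(String[] args) {
    // Made-up numbers for illustration.
    long allBlocks = 1000;          // every block known to the namespace
    long ucOnlyInSnapshots = 5;     // UC blocks reachable only through snapshots
    long reportedSafe = allBlocks - ucOnlyInSnapshots; // UC blocks never become safe
    double thresholdPct = 0.999;

    // UC blocks counted only from leases: the snapshot-only ones stay in the total.
    boolean before = canLeaveSafeMode(reportedSafe, allBlocks, thresholdPct);

    // Snapshot-only UC blocks also excluded from the expected total.
    boolean after = canLeaveSafeMode(
        reportedSafe, allBlocks - ucOnlyInSnapshots, thresholdPct);

    System.out.println("leaves safemode with snapshot-only UC blocks counted:  " + before);
    System.out.println("leaves safemode with snapshot-only UC blocks excluded: " + after);
  }
}
```

With these made-up numbers, the expected total can never be reached while the snapshot-only UC blocks are still counted, which matches the behaviour described in step 3.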



--
This message was sent by Atlassian JIRA
(v6.1#6144)
