[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-02-07 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-12985:
-
Fix Version/s: 3.0.1

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with NPE at the startup when trying to find the 
> total number of under construction blocks. This crash happens after an open 
> file, which was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-08 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
  Resolution: Fixed
   Fix Version/s: 2.10.0
  3.1.0
Target Version/s: 3.1.0, 2.10.0  (was: 3.1.0)
  Status: Resolved  (was: Patch Available)

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Fix For: 3.1.0, 2.10.0
>
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with NPE at the startup when trying to find the 
> total number of under construction blocks. This crash happens after an open 
> file, which was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Status: Patch Available  (was: Open)

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with NPE at the startup when trying to find the 
> total number of under construction blocks. This crash happens after an open 
> file, which was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-04 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Attachment: HDFS-12985.01.patch

Attached v01 to address the following:
1. {{INodeFile#cleanSubtree()}} updates {{ReclaimContext#removedUCFiles}} after 
deleting the snapshot file.
2. {{FSDirDeleteOp#deleteInternal}} already take care of removing the leases 
for removedUCFiles and removedINodes.
3. New unit test {{TestOpenFilesWithSnapshot#testOpenFileDeletionAndNNRestart}} 
added to show the problem and the fix solving the same.

> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
> Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with NPE at the startup when trying to find the 
> total number of under construction blocks. This crash happens after an open 
> file, which was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted

2018-01-03 Thread Manoj Govindassamy (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-12985:
--
Description: 
NameNode crashes repeatedly with NPE at the startup when trying to find the 
total number of under construction blocks. This crash happens after an open 
file, which was also part of a snapshot gets deleted along with the snapshot.

{noformat}
Failed to start namenode.
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:844)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:823)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
{noformat}




  was:
NameNode crashes repeatedly with NPE at the startup when trying to find the 
total number of under construction blocks. This crash happens after an open 
file, which was also part of a snapshot gets deleted along with the snapshot.

{noformat}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:144)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4456)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1158)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:825)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:751)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:968)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.(NameNode.java:947)
at 
org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1674)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2110)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2075)
at 
org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testSnapshotsForOpenFilesAndDeletion3(TestOpenFilesWithSnapshot.java:747)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}





> NameNode crashes during restart after an OpenForWrite file present in the 
> Snapshot got deleted
> --
>
> Key: HDFS-12985
> URL: https://issues.apache.org/jira/browse/HDFS-12985
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 2.8.0
>Reporter: Manoj Govindassamy
>Assignee: Manoj Govindassamy
>
> NameNode crashes repeatedly with NPE at the startup when trying to find the 
> total number of under construction blocks. This crash happens after an open 
> file, which was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>   at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>