[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
[ https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yongjun Zhang updated HDFS-12985:
---------------------------------
    Fix Version/s: 3.0.1

> NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12985
>                 URL: https://issues.apache.org/jira/browse/HDFS-12985
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.8.0
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>            Priority: Major
>             Fix For: 3.1.0, 2.10.0, 3.0.1
>
>         Attachments: HDFS-12985.01.patch
>
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the
> total number of under-construction blocks. The crash happens after an open
> file that was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
[ https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-12985:
--------------------------------------
          Resolution: Fixed
       Fix Version/s: 2.10.0
                      3.1.0
    Target Version/s: 3.1.0, 2.10.0  (was: 3.1.0)
              Status: Resolved  (was: Patch Available)
[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
[ https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-12985:
--------------------------------------
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
[ https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-12985:
--------------------------------------
    Attachment: HDFS-12985.01.patch

Attached v01 to address the following:
1. {{INodeFile#cleanSubtree()}} now updates {{ReclaimContext#removedUCFiles}} after deleting the snapshot file.
2. {{FSDirDeleteOp#deleteInternal}} already takes care of removing the leases for removedUCFiles and removedINodes.
3. A new unit test, {{TestOpenFilesWithSnapshot#testOpenFileDeletionAndNNRestart}}, reproduces the problem and verifies the fix.
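
The interaction described in the v01 patch notes can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: it assumes the NPE arises because a lease still refers to an open file whose inode has already been removed, and every name in it (LeaseCleanupSketch, ToyNamespace, deleteLeavingLease, deleteAndReleaseLease, countUnderConstructionBlocks) is hypothetical and not part of Hadoop. The local set removedUcFiles merely mirrors the role the notes give to {{ReclaimContext#removedUCFiles}}.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these types are Hadoop classes.
final class LeaseCleanupSketch {

    /** Hypothetical stand-in for the namespace plus its lease bookkeeping. */
    static final class ToyNamespace {
        // inode id -> number of under-construction blocks of that open file
        private final Map<Long, Integer> ucBlocksByInodeId = new HashMap<>();
        // inode ids that currently hold a write lease
        private final Set<Long> leasedInodeIds = new HashSet<>();

        void createOpenFile(long inodeId, int ucBlocks) {
            ucBlocksByInodeId.put(inodeId, ucBlocks);
            leasedInodeIds.add(inodeId);
        }

        /** Assumed pre-fix behaviour: the inode goes away, the lease stays behind. */
        void deleteLeavingLease(long inodeId) {
            ucBlocksByInodeId.remove(inodeId);
        }

        /** Assumed post-fix behaviour: record the removed open file, then drop its lease. */
        void deleteAndReleaseLease(long inodeId) {
            Set<Long> removedUcFiles = new HashSet<>();
            if (ucBlocksByInodeId.remove(inodeId) != null) {
                removedUcFiles.add(inodeId);
            }
            leasedInodeIds.removeAll(removedUcFiles);
        }

        /** Startup-style pass over all leases; throws NullPointerException on a dangling lease. */
        long countUnderConstructionBlocks() {
            long total = 0;
            for (Long inodeId : leasedInodeIds) {
                Integer blocks = ucBlocksByInodeId.get(inodeId);
                total += blocks.intValue(); // blocks is null when the inode was deleted
            }
            return total;
        }
    }

    public static void main(String[] args) {
        ToyNamespace fixed = new ToyNamespace();
        fixed.createOpenFile(1L, 2);
        fixed.createOpenFile(2L, 3);
        fixed.deleteAndReleaseLease(1L);
        System.out.println(fixed.countUnderConstructionBlocks());

        ToyNamespace buggy = new ToyNamespace();
        buggy.createOpenFile(1L, 2);
        buggy.deleteLeavingLease(1L);
        try {
            buggy.countUnderConstructionBlocks();
        } catch (NullPointerException e) {
            System.out.println("dangling lease caused NPE");
        }
    }
}
```

In this model the count succeeds once the delete path reports the removed open file so that its lease is released, which is the division of labour the patch notes describe between the subtree cleanup and the delete operation.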
[jira] [Updated] (HDFS-12985) NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
[ https://issues.apache.org/jira/browse/HDFS-12985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manoj Govindassamy updated HDFS-12985:
--------------------------------------
    Description: 
NameNode crashes repeatedly with an NPE at startup when trying to find the total number of under-construction blocks. The crash happens after an open file that was also part of a snapshot gets deleted along with the snapshot.

{noformat}
Failed to start namenode.
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:844)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:823)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1547)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1615)
{noformat}

  was:
NameNode crashes repeatedly with an NPE at startup when trying to find the total number of under-construction blocks. The crash happens after an open file that was also part of a snapshot gets deleted along with the snapshot.

{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:144)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:4456)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1158)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:825)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:751)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:968)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:947)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1674)
    at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2110)
    at org.apache.hadoop.hdfs.MiniDFSCluster.restartNameNode(MiniDFSCluster.java:2075)
    at org.apache.hadoop.hdfs.server.namenode.snapshot.TestOpenFilesWithSnapshot.testSnapshotsForOpenFilesAndDeletion3(TestOpenFilesWithSnapshot.java:747)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
{noformat}

> NameNode crashes during restart after an OpenForWrite file present in the Snapshot got deleted
> ----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-12985
>                 URL: https://issues.apache.org/jira/browse/HDFS-12985
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 2.8.0
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>
> NameNode crashes repeatedly with an NPE at startup when trying to find the
> total number of under-construction blocks. The crash happens after an open
> file that was also part of a snapshot gets deleted along with the snapshot.
> {noformat}
> Failed to start namenode.
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.server.namenode.LeaseManager.getNumUnderConstructionBlocks(LeaseManager.java:146)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getCompleteBlocksTotal(FSNamesystem.java:6537)
>     at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startCommonServices(FSNamesystem.java:1232)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:706)
>     at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:692)
>