[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-10-01 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-3597:
--

Component/s: name-node

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Labels: upgrade
> Fix For: 0.23.3, 2.0.2-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt
>
>
> When upgrading from 1.x to 2.0.0, the SecondaryNameNode can fail to start up:
> {code}
> 2012-06-16 09:52:33,812 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
> java.io.IOException: Inconsistent checkpoint fields.
> LV = -40 namespaceID = 64415959 cTime = 1339813974990 ; clusterId = CID-07a82b97-8d04-4fdd-b3a1-f40650163245 ; blockpoolId = BP-1792677198-172.29.121.67-1339813967723.
> Expecting respectively: -19; 64415959; 0; ; .
>     at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:120)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:454)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:334)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$2.run(SecondaryNameNode.java:301)
>     at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:438)
>     at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:297)
>     at java.lang.Thread.run(Thread.java:662)
> {code}
> The error check we're hitting came from HDFS-1073, and it's intended to 
> verify that we're connecting to the correct NN.  But the check is too strict 
> and treats "different metadata version" the same as "different clusterID".
> I believe the check in {{doCheckpoint}} simply needs to explicitly check for 
> and handle the upgrade case.
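
For illustration, a minimal sketch of the distinction described above. This is not the attached patch: the names {{CheckpointCheckSketch}}, {{StorageIds}}, {{Outcome}} and {{compare}} are hypothetical and only model the fields printed in the log message. It assumes that pre-federation (1.x) storage carries an empty clusterId and blockpoolId, as the "Expecting respectively" line suggests, and it ignores cTime.

```java
public class CheckpointCheckSketch {

    /** Hypothetical holder for the storage fields printed in the log message. */
    public static final class StorageIds {
        final int layoutVersion;
        final int namespaceId;
        final String clusterId;    // assumed empty for pre-federation (1.x) storage
        final String blockpoolId;  // assumed empty for pre-federation (1.x) storage

        public StorageIds(int layoutVersion, int namespaceId,
                          String clusterId, String blockpoolId) {
            this.layoutVersion = layoutVersion;
            this.namespaceId = namespaceId;
            this.clusterId = clusterId;
            this.blockpoolId = blockpoolId;
        }
    }

    public enum Outcome { CONSISTENT, NEEDS_UPGRADE, DIFFERENT_CLUSTER }

    /** Separates "different metadata version" from "different cluster". */
    public static Outcome compare(StorageIds nn, StorageIds snn) {
        // A namespaceID mismatch always means we are talking to the wrong NN.
        if (nn.namespaceId != snn.namespaceId) {
            return Outcome.DIFFERENT_CLUSTER;
        }
        // Only compare the federation IDs when the 2NN storage actually has them.
        boolean snnHasFederationIds = !snn.clusterId.isEmpty();
        if (snnHasFederationIds
                && (!snn.clusterId.equals(nn.clusterId)
                    || !snn.blockpoolId.equals(nn.blockpoolId))) {
            return Outcome.DIFFERENT_CLUSTER;
        }
        // Same cluster: a layout version difference alone is the upgrade case.
        return nn.layoutVersion == snn.layoutVersion
                ? Outcome.CONSISTENT
                : Outcome.NEEDS_UPGRADE;
    }

    public static void main(String[] args) {
        StorageIds nn = new StorageIds(-40, 64415959,
                "CID-07a82b97-8d04-4fdd-b3a1-f40650163245",
                "BP-1792677198-172.29.121.67-1339813967723");
        StorageIds oldSnn = new StorageIds(-19, 64415959, "", "");
        StorageIds otherSnn = new StorageIds(-19, 12345, "", "");

        System.out.println(compare(nn, oldSnn));   // same namespace, older layout
        System.out.println(compare(nn, otherSnn)); // different namespace
    }
}
```

With the values from the log, the first comparison is classified as an upgrade rather than a mismatch, which is the behaviour the description asks for; how the 2NN then adopts the new storage info is left out of the sketch.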

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-10-01 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HDFS-3597:
--

Labels: upgrade  (was: )



[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-08-14 Thread Daryn Sharp (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp updated HDFS-3597:
--

Fix Version/s: 0.23.3

I've committed this to branch-0.23.

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Fix For: 0.23.3, 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-20 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HDFS-3597:
--

Resolution: Fixed
Fix Version/s: 2.2.0-alpha, 3.0.0
Hadoop Flags: Reviewed
Status: Resolved  (was: Patch Available)

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Fix For: 3.0.0, 2.2.0-alpha
>
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-18 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3597:


Attachment: hdfs-3597-4.txt

Attaching hdfs-3597-4.txt, addressing review feedback.

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597-4.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-09 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-3597:
-

Target Version/s: 2.0.1-alpha
Status: Patch Available  (was: Open)

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-06 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3597:


Attachment: hdfs-3597-3.txt

Address review feedback and adjust the test to exercise the upgrade scenario 
more accurately:
# We now corrupt all 2NN directories.
# We now test upgrade from layout version -39, which fixes some unexplained 
test failures.
# Clean up the test.
# Drop the datanodes and use mkdir instead of writefile for quicker test 
startup.
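
One way the "upgrade from -39" setup in the list above could be simulated is by rewriting the layout version stored in each 2NN storage directory before the 2NN starts. The sketch below is an assumption about how such a helper might look, not the code in hdfs-3597-3.txt: {{VersionFileSketch}} and its methods are hypothetical, and it assumes the storage directory keeps a Java-properties-style VERSION file with a {{layoutVersion}} key.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class VersionFileSketch {

    /** Rewrites the layoutVersion key of a properties-style VERSION file. */
    public static void setLayoutVersion(Path versionFile, int layoutVersion)
            throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(versionFile)) {
            props.load(in);
        }
        props.setProperty("layoutVersion", Integer.toString(layoutVersion));
        // The default open options truncate the existing file before writing.
        try (OutputStream out = Files.newOutputStream(versionFile)) {
            props.store(out, null);
        }
    }

    /** Reads the layoutVersion key back, for checking the rewrite. */
    public static int readLayoutVersion(Path versionFile) throws IOException {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(versionFile)) {
            props.load(in);
        }
        return Integer.parseInt(props.getProperty("layoutVersion"));
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for one 2NN storage directory; a real test would loop over all of them.
        Path dir = Files.createTempDirectory("2nn-sketch");
        Path version = dir.resolve("VERSION");
        Files.write(version,
                "layoutVersion=-40\nnamespaceID=64415959\n"
                        .getBytes(StandardCharsets.UTF_8));

        setLayoutVersion(version, -39);
        System.out.println(readLayoutVersion(version));
    }
}
```

Applying the rewrite to every 2NN directory, rather than just one, matches the first item in the list; other keys in the file are left untouched.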

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Attachments: hdfs-3597-2.txt, hdfs-3597-3.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-05 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3597:


Attachment: hdfs-3597-2.txt

Attaching a new version of the patch that addresses review comments.  Please 
check the {{doCheckpoint}} logic specifically; I'm happy with this refactoring 
but am open to better suggestions.

Running a full set of tests locally to verify no breakage.

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Attachments: hdfs-3597-2.txt, hdfs-3597.txt




[jira] [Updated] (HDFS-3597) SNN can fail to start on upgrade

2012-07-03 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-3597:


Attachment: hdfs-3597.txt

Attaching proposed fix, including positive and negative test cases showing that 
the check functions as expected.

> SNN can fail to start on upgrade
> 
>
> Key: HDFS-3597
> URL: https://issues.apache.org/jira/browse/HDFS-3597
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha
> Reporter: Andy Isaacson
> Assignee: Andy Isaacson
> Priority: Minor
> Attachments: hdfs-3597.txt