[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2014-03-20 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13942638#comment-13942638
 ] 

Jing Zhao edited comment on HDFS-5138 at 3/21/14 1:36 AM:
--

So I did a simple test for HDFS upgrade with HA, and hit the following 
exception while doing rollback (with layoutversion change in the upgrade):
{code}
14/03/21 01:01:53 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not check if roll back possible for one or more JournalNodes. 1 exceptions thrown:
Unexpected version of storage directory /grid/1/tmp/journal/mycluster. Reported: -56. Expecting = -55.
    at org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178)
    at org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131)
    at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeStorage(JNStorage.java:202)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.init(JNStorage.java:73)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.init(Journal.java:142)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:87)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.canRollBack(JournalNode.java:309)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.canRollBack(JournalNodeRpcServer.java:228)
{code}

In my HA upgrade test, the new software bumped the layoutversion from -55 to 
-56. I stopped all the services and restarted the JNs with the old software. I then 
ran namenode -rollback and hit the above exception. It looks like, during rollback, 
a JN running the old software cannot handle the newer layoutversion written by the 
new software.
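
To make the failure mode concrete, here is a minimal sketch of the kind of check that produces this message. It is not the actual StorageInfo code: the class, method, and exception names are hypothetical, and it only assumes what the test above shows, namely that layout versions are negative integers and that a newer layout has a smaller (more negative) value.

```java
// Hypothetical, simplified model of a layout-version check.
// This is NOT the real org.apache.hadoop.hdfs.server.common.StorageInfo code.
public class LayoutVersionCheckSketch {

    /** Thrown when the on-disk layout is newer than the running software supports. */
    public static class FutureLayoutException extends Exception {
        public FutureLayoutException(int supported, int reported, String dir) {
            super("Unexpected version of storage directory " + dir
                + ". Reported: " + reported + ". Expecting = " + supported + ".");
        }
    }

    /** Layout versions are negative; a smaller value means a newer layout. */
    public static boolean isFutureLayout(int supported, int reported) {
        return reported < supported;
    }

    /** Returns the on-disk version if the software can read it, otherwise throws. */
    public static int checkLayoutVersion(int supported, int reported, String dir)
            throws FutureLayoutException {
        if (isFutureLayout(supported, reported)) {
            throw new FutureLayoutException(supported, reported, dir);
        }
        return reported;
    }

    public static void main(String[] args) {
        try {
            // Old software supports -55, but the upgrade already wrote -56 to disk.
            checkLayoutVersion(-55, -56, "/grid/1/tmp/journal/mycluster");
        } catch (FutureLayoutException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this model, any pre-rollback step that reads the current storage directory with the old software fails before the rollback itself can be attempted, which matches the behaviour seen in the test.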


was (Author: jingzhao):
So I did a simple test for HDFS upgrade with HA, and hit the following 
exception while doing rollback (with layoutversion change in the upgrade):
{code}
14/03/21 01:01:53 FATAL namenode.NameNode: Exception in namenode join
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Could not check if roll back possible for one or more JournalNodes. 1 exceptions thrown:
Unexpected version of storage directory /grid/1/tmp/journal/mycluster. Reported: -56. Expecting = -55.
    at org.apache.hadoop.hdfs.server.common.StorageInfo.setLayoutVersion(StorageInfo.java:178)
    at org.apache.hadoop.hdfs.server.common.StorageInfo.setFieldsFromProperties(StorageInfo.java:131)
    at org.apache.hadoop.hdfs.server.common.StorageInfo.readProperties(StorageInfo.java:228)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.analyzeStorage(JNStorage.java:202)
    at org.apache.hadoop.hdfs.qjournal.server.JNStorage.init(JNStorage.java:73)
    at org.apache.hadoop.hdfs.qjournal.server.Journal.init(Journal.java:142)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:87)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNode.canRollBack(JournalNode.java:309)
    at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.canRollBack(JournalNodeRpcServer.java:228)
{code}

In my HA upgrade test, the new software bumped the layoutversion from -55 to 
-56. Then I stopped all the services and restarted the JNs with the old software. Then 
I ran namenode -rollback and hit the above exception. It looks like, during rollback, 
a JN running the old software cannot handle the newer layoutversion written by the 
new software.

 Support HDFS upgrade in HA
 --

 Key: HDFS-5138
 URL: https://issues.apache.org/jira/browse/HDFS-5138
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Kihwal Lee
Assignee: Aaron T. Myers
Priority: Blocker
 Fix For: 3.0.0

 Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 hdfs-5138-branch-2.txt


 With HA enabled, the NN won't start with -upgrade. Since there has been a layout 
 version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was 
 necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
 to get around this was to disable HA and upgrade. 
 The NN and the cluster cannot be flipped back to HA until the upgrade is 
 finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned 
 back on without involving DNs, things will work, but finalizeUpgrade won't 
 work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade 
 snapshots won't get removed.
 We will need a different way of doing layout upgrade and upgrade snapshot.

[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2014-01-27 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883415#comment-13883415
 ] 

Suresh Srinivas edited comment on HDFS-5138 at 1/27/14 10:09 PM:
-

bq. The concern is about losing edit logs by overwriting a renamed directory 
with some contents, so by definition there will be some files in the directory 
being renamed to.
That makes sense. Thanks.

bq. The preupgrade and upgrade failure scenarios should both be handled either 
manually or by the storage recovery process
I do not think JN performs recovery, based on the following code from 
JNStorage.java
{code}
  void analyzeStorage() throws IOException {
    this.state = sd.analyzeStorage(StartupOption.REGULAR, this);
    if (state == StorageState.NORMAL) {
      readProperties(sd);
    }
  }
{code}

For JournalNode, StorageDirectory#doRecover() is not called. Is that correct? 
From my understanding, once it gets into this state, the JournalNode should not 
start up?
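
To illustrate the distinction being asked about, here is a rough sketch that contrasts "only load properties when the state is NORMAL" with a startup path that also handles interrupted transitions. The enums and method names are hypothetical stand-ins, not the real Storage classes, and the set of states is deliberately reduced; it only models the decision, not any actual recovery work.

```java
// Hypothetical model of a storage-state decision at JournalNode startup.
// The enums below are simplified stand-ins, not Hadoop's own types.
public class JournalStorageRecoverySketch {

    /** Reduced set of states a storage directory might be found in. */
    public enum State { NORMAL, RECOVER_UPGRADE, COMPLETE_UPGRADE, RECOVER_ROLLBACK, NOT_FORMATTED }

    /** What a startup path could decide to do for a given state. */
    public enum Action { READ_PROPERTIES, RECOVER_THEN_READ, REFUSE_TO_START }

    /** Mirrors the quoted snippet: properties are read only in the NORMAL state. */
    public static boolean readsPropertiesWithoutRecovery(State state) {
        return state == State.NORMAL;
    }

    /** A recovery-aware alternative: interrupted transitions are repaired first. */
    public static Action recoveryAwareAction(State state) {
        switch (state) {
            case NORMAL:
                return Action.READ_PROPERTIES;
            case RECOVER_UPGRADE:
            case COMPLETE_UPGRADE:
            case RECOVER_ROLLBACK:
                // A real implementation would finish or undo the directory renames here.
                return Action.RECOVER_THEN_READ;
            default:
                return Action.REFUSE_TO_START;
        }
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            System.out.println(s + ": readsWithoutRecovery=" + readsPropertiesWithoutRecovery(s)
                + ", recoveryAware=" + recoveryAwareAction(s));
        }
    }
}
```

In the first variant, a directory left in an intermediate state is never repaired automatically, which is the situation the question above is probing.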


was (Author: sureshms):
bq. The concern is about losing edit logs by overwriting a renamed directory 
with some contents, so by definition there will be some files in the directory 
being renamed to.
That makes sense. Thanks.

bq. The preupgrade and upgrade failure scenarios should both be handled either 
manually or by the storage recovery process
I do not think JN performs recovery, based on the following code from 
JNStorage.java
{code}
  void analyzeStorage() throws IOException {
    this.state = sd.analyzeStorage(StartupOption.REGULAR, this);
    if (state == StorageState.NORMAL) {
      readProperties(sd);
    }
  }
{code}

For JournalNode, there is no call to StorageDirectory#doRecover(). Is that correct?

 Support HDFS upgrade in HA
 --

 Key: HDFS-5138
 URL: https://issues.apache.org/jira/browse/HDFS-5138
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Kihwal Lee
Assignee: Aaron T. Myers
Priority: Blocker
 Fix For: 3.0.0

 Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
 hdfs-5138-branch-2.txt


 With HA enabled, the NN won't start with -upgrade. Since there has been a layout 
 version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was 
 necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
 to get around this was to disable HA and upgrade. 
 The NN and the cluster cannot be flipped back to HA until the upgrade is 
 finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned 
 back on without involving DNs, things will work, but finalizeUpgrade won't 
 work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade 
 snapshots won't get removed.
 We will need a different way of doing layout upgrade and upgrade snapshot.
 I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
 there is a reasonable workaround that does not increase maintenance window 
 greatly, we can lower its priority from blocker to critical.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2014-01-27 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13883415#comment-13883415
 ] 

Suresh Srinivas edited comment on HDFS-5138 at 1/27/14 10:51 PM:
-

bq. The concern is about losing edit logs by overwriting a renamed directory 
with some contents, so by definition there will be some files in the directory 
being renamed to.
That makes sense. Thanks.

bq. The preupgrade and upgrade failure scenarios should both be handled either 
manually or by the storage recovery process
I do not think JN performs recovery, based on the following code from 
JNStorage.java
{code}
  void analyzeStorage() throws IOException {
    this.state = sd.analyzeStorage(StartupOption.REGULAR, this);
    if (state == StorageState.NORMAL) {
      readProperties(sd);
    }
  }
{code}

For JournalNode, StorageDirectory#doRecover() is not called. Is that correct? 
From my understanding, once it gets into this state, JournalNode restart will 
not work?


was (Author: sureshms):
bq. The concern is about losing edit logs by overwriting a renamed directory 
with some contents, so by definition there will be some files in the directory 
being renamed to.
That makes sense. Thanks.

bq. The preupgrade and upgrade failure scenarios should both be handled either 
manually or by the storage recovery process
I do not think JN performs recovery, based on the following code from 
JNStorage.java
{code}
  void analyzeStorage() throws IOException {
    this.state = sd.analyzeStorage(StartupOption.REGULAR, this);
    if (state == StorageState.NORMAL) {
      readProperties(sd);
    }
  }
{code}

For JournalNode, StorageDirectory#doRecover() is not called. Is that correct? 
From my understanding, once it gets into this state, the JournalNode should not 
start up?



[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2014-01-06 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13863927#comment-13863927
 ] 

Suresh Srinivas edited comment on HDFS-5138 at 1/7/14 5:46 AM:
---

This jira can use a design document. The current description in the comment 
covers what is being done, but it is not clear why it is being done that 
way. The subtle issues may be understood better with a design document. I 
would love to see a separate summary section that covers how commands worked 
before and how they work now and what commands are no longer supported. 

Some early comments:
bq. I've removed the -finalize startup option, and instead running with this 
will direct users to use the `hdfs dfsadmin -finalizeUpgrade' command. 
Supporting both styles of finalization seems unnecessary, and makes HA 
finalization more difficult.
Can you describe the difference between this and the older version of 
finalize? The command -finalize is fairly well known and this change will be 
backward compatible.

bq. Starting the NN with the '-rollback' flag will perform the rollback just as 
before, but it will not then proceed to start the NN daemon. Supporting this 
mode also makes HA rollback more difficult, and doesn't seem to be necessary or 
helpful, since to perform a rollback we don't need to load the fsimage/edit 
log, and thus performing the actual rollback should be quick. Operators can 
then start the NN as normal after rolling back the FS.
Sorry I am not sure I understand this. Why does HA rollback become more 
difficult?

bq. On start, each one of the NNs will first try to create a special lock file, 
either in the shared edits dir in the NFS case or on each of the JNs in the QJM 
case. This lock file will contain the CTime that that NN would like to upgrade 
the...
Why is the lock file required? Why can't the NN just write an editlog entry about 
the upgrade intent, including the new layout version? During rollback we can 
discard the editlog starting from the upgrade intent entry. In fact, we could also 
consider requiring users to save the namespace with empty editlogs. With this, 
perhaps we can avoid the following:
bq. At the time when either NN is transitioned to the active state, that NN 
will perform an upgrade of the shared log, either on NFS or on the JNs.
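
The upgrade-intent idea raised above can be sketched roughly as follows. This is only an illustration of the proposal, not anything in the patch: edit log entries are modelled as plain strings, and the marker name OP_UPGRADE_INTENT is a made-up placeholder rather than an existing edit log opcode.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "discard edits from the upgrade-intent marker onward".
public class UpgradeIntentSketch {

    /** Placeholder marker name; not an existing HDFS edit log opcode. */
    public static final String UPGRADE_INTENT = "OP_UPGRADE_INTENT";

    /**
     * Returns the edits that would still be replayed after a rollback:
     * everything strictly before the first upgrade-intent marker.
     * If no marker is present, all edits are kept.
     */
    public static List<String> editsToReplayOnRollback(List<String> edits) {
        List<String> kept = new ArrayList<>();
        for (String op : edits) {
            if (op.startsWith(UPGRADE_INTENT)) {
                break; // everything from here on was written by the new software
            }
            kept.add(op);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> edits = new ArrayList<>();
        edits.add("OP_ADD /a");
        edits.add("OP_CLOSE /a");
        edits.add(UPGRADE_INTENT + " layoutVersion=-56");
        edits.add("OP_MKDIR /b");
        System.out.println(editsToReplayOnRollback(edits));
    }
}
```

Whether a marker like this could replace the lock file depends on how the shared log itself is upgraded, which is the question being asked of the design.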

bq. To finalize an HA upgrade, an operator will just use hdfsadmin as described 
before. The active NN at the time this happens will perform the upgrade of the 
shared log. Finalization will also remove the shared log lock file previously 
described.
Do you mean finalization of the shared log in the above?


was (Author: sureshms):
This jira can use a design document. The current description in the comment 
covers what is being done, but it is not clear why it is being done that 
way. The subtle issues may be understood better with a design document. I 
would love to see a separate summary section that covers how commands worked 
before and how they work now and what commands are no longer supported. 

Some early comments:
bq. I've removed the -finalize startup option, and instead running with this 
will direct users to use the `hdfs dfsadmin -finalizeUpgrade' command. 
Supporting both styles of finalization seems unnecessary, and makes HA 
finalization more difficult.
Can you describe the difference between this and the older version of 
finalize? The command -finalize is fairly well known and this change will be 
backward compatible.

bq. Starting the NN with the '-rollback' flag will perform the rollback just as 
before, but it will not then proceed to start the NN daemon. Supporting this 
mode also makes HA rollback more difficult, and doesn't seem to be necessary or 
helpful, since to perform a rollback we don't need to load the fsimage/edit 
log, and thus performing the actual rollback should be quick. Operators can 
then start the NN as normal after rolling back the FS.
Sorry I am not sure I understand this. Why does HA rollback become more 
difficult?

bq. On start, each one of the NNs will first try to create a special lock file, 
either in the shared edits dir in the NFS case or on each of the JNs in the QJM 
case. This lock file will contain the CTime that that NN would like to upgrade 
the...
Why is the lock file required? Why can't the NN just write an editlog entry about 
the upgrade intent, including the new layout version? During rollback we can 
discard the editlog starting from the upgrade intent entry. In fact, we could also 
consider requiring users to save the namespace with empty editlogs. With this, 
perhaps we can avoid the following:
bq. At the time when either NN is transitioned to the active state, that NN 
will perform an upgrade of the shared log, either on NFS or on the JNs.

bq. To finalize an HA upgrade, an operator will just use hdfsadmin as described 
before. The active NN at the time this happens will perform the upgrade of the 
shared log. 

[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2013-09-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13766268#comment-13766268
 ] 

Konstantin Shvachko edited comment on HDFS-5138 at 9/13/13 6:08 AM:


Hey, guys. Indeed, the scope of this jira should probably be #1. 
Not to diminish in any way the importance of rolling upgrades.

The NN upgrade happens in loadNamesystem() before the RPCServer is started, so the 
SBN won't even see this.
Then the DNs are asked to upgrade before they are allowed to register. That is, the 
Active NN is in SafeMode and there is nothing for the SBN to worry about yet, as the 
journal is not changing.

With NFS-mounted shared storage the upgrade should be pretty straightforward. 
We should modify the code to allow it, and then do lots of testing, of course.

For QJM I am not sure.
Would it be easier to let the SBN checkpoint from the upgraded NN and start reading 
the journal from that image?
With -finalize the SBN should probably do the same thing.

  was (Author: shv):
Hey, guys. Indeed, the scope of this jira should probably be #1. 
Not to diminish in any way the importance of rolling upgrades.

The NN upgrade happens in loadNamesystem() before the RPCServer is started, so the 
SBN won't even see this.
Then the DNs are asked to upgrade before they are allowed to register. That is, the 
Active NN is in SafeMode and there is nothing for the SBN to worry about yet, as the 
journal is not changing.

With NFS-mounted shared storage the upgrade should be pretty straightforward. 
We should modify the code to allow it, and then do lots of testing, of course.

For QJM I am not sure.
Would it be easier to let the SBN checkpoint from the upgraded NN and start reading 
the journal from that image?
With finalize the SBN should probably do the same thing.
  
 Support HDFS upgrade in HA
 --

 Key: HDFS-5138
 URL: https://issues.apache.org/jira/browse/HDFS-5138
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.1.1-beta
Reporter: Kihwal Lee
Priority: Blocker

 With HA enabled, the NN won't start with -upgrade. Since there has been a layout 
 version change between 2.0.x and 2.1.x, starting the NN in upgrade mode was 
 necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
 to get around this was to disable HA and upgrade. 
 The NN and the cluster cannot be flipped back to HA until the upgrade is 
 finalized. If HA is disabled only on the NN for the layout upgrade and HA is turned 
 back on without involving DNs, things will work, but finalizeUpgrade won't 
 work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade 
 snapshots won't get removed.
 We will need a different way of doing layout upgrade and upgrade snapshot.
 I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
 there is a reasonable workaround that does not increase maintenance window 
 greatly, we can lower its priority from blocker to critical.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (HDFS-5138) Support HDFS upgrade in HA

2013-09-13 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13766268#comment-13766268
 ] 

Konstantin Shvachko edited comment on HDFS-5138 at 9/13/13 6:12 AM:


Hey, guys. Indeed, the scope of this jira should probably be #1. 
Not to diminish in any way the importance of rolling upgrades.

The NN upgrade happens in loadNamesystem() before the RPCServer is started, so the 
SBN won't even see this.
Then the DNs are asked to upgrade before they are allowed to register. That is, the 
Active NN is in SafeMode and there is nothing for the SBN to worry about yet, as the 
journal is not changing.

With NFS-mounted shared storage the upgrade should be pretty straightforward. 
We should modify the code to allow it, and then do lots of testing, of course.

For QJM I am not sure.
Would it be easier to let the SBN checkpoint from the upgraded NN and start reading 
the journal from that image?
With -rollback the SBN should probably do the same thing.
Edited: I meant rollback rather than finalize, as it originally read.

  was (Author: shv):
Hey, guys. Indeed, the scope of this jira should probably be #1. 
Not to diminish in any way the importance of rolling upgrades.

The NN upgrade happens in loadNamesystem() before the RPCServer is started, so the 
SBN won't even see this.
Then the DNs are asked to upgrade before they are allowed to register. That is, the 
Active NN is in SafeMode and there is nothing for the SBN to worry about yet, as the 
journal is not changing.

With NFS-mounted shared storage the upgrade should be pretty straightforward. 
We should modify the code to allow it, and then do lots of testing, of course.

For QJM I am not sure.
Would it be easier to let the SBN checkpoint from the upgraded NN and start reading 
the journal from that image?
With -finalize the SBN should probably do the same thing.
  