[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2012-01-05 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180423#comment-13180423
 ] 

Hudson commented on HDFS-2291:
--

Integrated in Hadoop-Hdfs-HAbranch-build #38 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-HAbranch-build/38/])
HDFS-2291. Allow the StandbyNode to make checkpoints in an HA setup. 
Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1227411
Files : 
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/CHANGES.HDFS-1623.txt
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CheckpointConf.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/GetImageServlet.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNStorage.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SaveNamespaceCancelledException.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/SecondaryNameNode.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/TransferFsImage.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/HAContext.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/HAState.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyState.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSCluster.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/MiniDFSNNTopology.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/NameNodeAdapter.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckpoint.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestEditLogsDuringFailover.java
* /hadoop/common/branches/HDFS-1623/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java


 HA: Checkpointing in an HA setup
 

 Key: HDFS-2291
 URL: https://issues.apache.org/jira/browse/HDFS-2291
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Affects Versions: HA branch (HDFS-1623)
Reporter: Aaron T. Myers
Assignee: Todd Lipcon
 Fix For: HA branch (HDFS-1623)

 Attachments: hdfs-2291.txt, hdfs-2291.txt, hdfs-2291.txt, 
 hdfs-2291.txt


 We obviously need to create checkpoints when HA is enabled. One thought is to 
 use a third, dedicated checkpointing node in addition to the active and 
 standby nodes. Another option would be to make the standby capable of also 
 performing the function of checkpointing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 

[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2012-01-04 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179850#comment-13179850
 ] 

Todd Lipcon commented on HDFS-2291:
---

bq. dfs.namenode.standby.checkpoints - perhaps include .ha in there to make 
it clear that this option is only applicable in an HA setup
renamed to dfs.ha.standby.checkpoints and DFS_HA_STANDBY_CHECKPOINTS_KEY
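
For illustration, the renamed key might be declared and read roughly as follows. This is a minimal sketch: the key string and constant name come from the reply above, but the default value, the `StandbyCheckpointConfSketch` class, and the plain `Map` standing in for Hadoop's `Configuration` are assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the relevant part of DFSConfigKeys.
final class StandbyCheckpointConfSketch {
    // Key and constant name as given in the reply above.
    static final String DFS_HA_STANDBY_CHECKPOINTS_KEY = "dfs.ha.standby.checkpoints";
    // Assumed default; the thread does not state one.
    static final boolean DFS_HA_STANDBY_CHECKPOINTS_DEFAULT = true;

    private StandbyCheckpointConfSketch() {}

    // A plain Map stands in for Hadoop's Configuration object here.
    static boolean standbyCheckpointsEnabled(Map<String, String> conf) {
        String raw = conf.get(DFS_HA_STANDBY_CHECKPOINTS_KEY);
        return raw == null
            ? DFS_HA_STANDBY_CHECKPOINTS_DEFAULT
            : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(DFS_HA_STANDBY_CHECKPOINTS_KEY, "false");
        System.out.println(standbyCheckpointsEnabled(conf));
    }
}
```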

{quote}
Might as well make the members of CheckpointConf final.

LOG.info("Counted txns in " + file + ": " + val.getNumTransactions()); - Either 
should be removed or should not be info level.

prepareStopStandbyServices is kind of a weird name. Perhaps 
prepareToStopStandbyServices ?

// TODO interface audience in TransferFsImage

TODO: need to cancel the savenamespace operation if it's in flight - I think 
this comment is no longer applicable to this patch, right?

LOG.info("Time for a checkpoint !"); - while strictly accurate, this doesn't 
seem to be the most helpful log message.

e.printStackTrace(); in CheckpointerThread should probably be tossed.

Nit: in CheckpointerThread#doWork: 
if(UserGroupInformation.isSecurityEnabled()) - space between if and (, 
and curly braces around body of if.

You use System.currentTimeMillis in a bunch of places. How about replacing 
with o.a.h.hdfs.server.common.Util#now ?
{quote}
fixed the above

bq. Does it not seem strange to you that the order of operations when setting a 
state is prepareExit -> prepareEnter -> exit -> enter, instead of 
prepareExit -> exit -> prepareEnter -> enter
The point of the {{prepare*}} methods is that they have to happen before the 
lock is taken. So, {{prepareEnter}} can't happen after {{exit}}, because the 
lock already is held there. I clarified the javadoc a bit.
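
A rough sketch of that ordering, assuming a single `ReentrantLock` guards the state change. The `StateTransitionSketch` class, its nested `State`, and the trace strings are hypothetical and are not the HAState/HAContext code from the patch; the sketch only shows why both prepare steps come before the lock is taken.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of the transition order described above.
final class StateTransitionSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<String> trace = new ArrayList<>();

    final class State {
        private final String name;

        State(String name) { this.name = name; }

        void prepareExit()  { record("prepareExit"); }
        void prepareEnter() { record("prepareEnter"); }
        void exit()         { record("exit"); }
        void enter()        { record("enter"); }

        // Records each step together with whether the lock was held at that point.
        private void record(String step) {
            String held = lock.isHeldByCurrentThread() ? " [locked]" : " [unlocked]";
            trace.add(name + "." + step + held);
        }
    }

    State state(String name) { return new State(name); }

    List<String> trace() { return trace; }

    void transition(State from, State to) {
        // Both prepare steps must run before the lock is taken.
        from.prepareExit();
        to.prepareEnter();
        lock.lock();
        try {
            // exit and enter run with the lock held.
            from.exit();
            to.enter();
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        StateTransitionSketch sketch = new StateTransitionSketch();
        sketch.transition(sketch.state("standby"), sketch.state("active"));
        sketch.trace().forEach(System.out::println);
    }
}
```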

bq. What's the point of the changes in EditLogTailer?
In order for the test to spy on saveNamespace, I had to move the {{getFSImage}} 
call down. Otherwise, the spy wasn't getting picked up properly and the test 
was failing.

bq. Can we make CheckpointerThread a static inner class?
Currently it calls {{doCheckpoint}} in the outer class. I suppose it could be 
static, but it isn't really easy to test in isolation anyway, so I'm going to 
punt on this.

bq. Does it make sense to explicitly disallow the SBN from allowing checkpoints 
to be uploaded to it? 

Yes and no... I sort of see your point. But, people have also discussed an 
external tool which would perform checkpoints for many clusters and then upload 
them 





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2012-01-04 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179877#comment-13179877
 ] 

Aaron T. Myers commented on HDFS-2291:
--

bq. Yes and no... I sort of see your point. But, people have also discussed an 
external tool which would perform checkpoints for many clusters and then upload 
them

I'm still a little leery of this behavior, but I don't feel strongly about it, 
so let's just roll with it.

I should have said this earlier, but I'd also recommend changing 
prepareEnterState to prepareToEnterState, and likewise for exit.

Otherwise the patch looks good to me. +1 once that's addressed.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2012-01-03 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179188#comment-13179188
 ] 

Aaron T. Myers commented on HDFS-2291:
--

Thanks a lot for providing this patch, Todd. What's below are mostly nits. I 
agree that there could be a few more comments for the new public methods, so I 
didn't include that in my feedback.

# {{dfs.namenode.standby.checkpoints}} - perhaps include .ha in there to make 
it clear that this option is only applicable in an HA setup?
# Might as well make the members of {{CheckpointConf}} final.
# {{LOG.info("Counted txns in " + file + ": " + val.getNumTransactions());}} - 
Either should be removed or should not be info level.
# {{prepareStopStandbyServices}} is kind of a weird name. Perhaps 
prepareToStopStandbyServices ?
# // TODO interface audience in {{TransferFsImage}}
# Does it not seem strange to you that the order of operations when setting a 
state is prepareExit -> prepareEnter -> exit -> enter, instead of 
prepareExit -> exit -> prepareEnter -> enter ? i.e. I don't think there's a 
correctness issue here, but if I were designing a system where this set of 
events is triggered, I'd go with the latter.
# What's the point of the changes in {{EditLogTailer}}?
# TODO: need to cancel the savenamespace operation if it's in flight - I 
think this comment is no longer applicable to this patch, right?
# {{LOG.info("Time for a checkpoint !");}} - while strictly accurate, this 
doesn't seem to be the most helpful log message.
# Can we make {{CheckpointerThread}} a static inner class?
# {{e.printStackTrace();}} in {{CheckpointerThread}} should probably be tossed.
# Nit: in {{CheckpointerThread#doWork}}: 
if(UserGroupInformation.isSecurityEnabled()) - space between if and (, 
and curly braces around body of if.
# You use System.currentTimeMillis in a bunch of places. How about replacing 
with o.a.h.hdfs.server.common.Util#now ?
# Does it make sense to explicitly disallow the SBN from allowing checkpoints 
to be uploaded to it? I realize the case when both nodes are in standby is 
already handled by this patch, since you don't allow checkpoints if the node 
already has a checkpoint for a given txid, but I mean from a principled 
perspective. It seems kind of odd to me that two nodes both sitting in standby 
would be doing checkpoint transfers at all.
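
The duplicate-txid rule mentioned in the last item could look roughly like this. It is a sketch only: the class and method names are hypothetical and do not reflect the actual GetImageServlet or NNStorage code in the patch.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical guard: refuse an uploaded checkpoint if an image for the
// same transaction ID is already stored on this node.
final class CheckpointUploadGuardSketch {
    private final Set<Long> storedImageTxIds = new TreeSet<>();

    synchronized boolean hasImageFor(long txid) {
        return storedImageTxIds.contains(txid);
    }

    // Returns true if the upload is accepted and recorded,
    // false if an image for this txid already exists.
    synchronized boolean acceptUpload(long txid) {
        return storedImageTxIds.add(txid);
    }

    public static void main(String[] args) {
        CheckpointUploadGuardSketch guard = new CheckpointUploadGuardSketch();
        System.out.println(guard.acceptUpload(100L));
        System.out.println(guard.acceptUpload(100L));
    }
}
```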






[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-12-21 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13174328#comment-13174328
 ] 

Eli Collins commented on HDFS-2291:
---

Ditto, option (b) seems preferable. I think we should minimize the difference 
between the 2NN and the SBN checkpointing since we'll have to support both.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-12-20 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173895#comment-13173895
 ] 

Todd Lipcon commented on HDFS-2291:
---

I plan to start working on this tomorrow. My thinking is to have a checkpoint 
thread which wakes up on the checkpoint interval, stops the edit log tailer 
thread, enters safe mode, creates a checkpoint, and comes back out of safemode. 
If at any point the SB needs to process a failover, it will cancel the 
checkpoint (using the HDFS-2507 feature) and proceed as usual.
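
The cycle described above can be sketched as follows. This is an illustration under stated assumptions, not the eventual StandbyCheckpointer: the class, the step strings, and the `midSaveHook` parameter (which stands in for the time during which a failover may arrive) are all hypothetical, and the periodic wake-up on the checkpoint interval is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of one standby checkpoint cycle.
final class StandbyCheckpointCycleSketch {
    private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
    private final List<String> steps = new ArrayList<>();

    // Called from the failover path to abandon an in-flight checkpoint.
    void cancelForFailover() {
        cancelRequested.set(true);
    }

    List<String> steps() {
        return steps;
    }

    // Runs one cycle and returns true if the checkpoint completed.
    boolean checkpointOnce(Runnable midSaveHook) {
        steps.add("stop edit log tailer");
        steps.add("enter safe mode");
        boolean completed = false;
        try {
            steps.add("begin saving checkpoint");
            midSaveHook.run();
            if (cancelRequested.get()) {
                steps.add("checkpoint cancelled");
            } else {
                steps.add("checkpoint saved");
                completed = true;
            }
        } finally {
            // Safe mode is left whether or not the checkpoint finished.
            steps.add("leave safe mode");
        }
        if (completed) {
            // Assumption: tailing resumes only after a normal cycle.
            steps.add("restart edit log tailer");
        }
        return completed;
    }

    public static void main(String[] args) {
        StandbyCheckpointCycleSketch cycle = new StandbyCheckpointCycleSketch();
        cycle.checkpointOnce(() -> { });
        cycle.steps().forEach(System.out::println);
    }
}
```

Passing `cycle::cancelForFailover` as the hook simulates a failover request arriving while the checkpoint is being saved.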

The remaining question I've yet to figure out is whether it should (a) save the 
checkpoints into the shared edits directory, or (b) save in its own and then 
upload the checkpoints to the primary via HTTP just like the 2NN does today.

(b) is probably preferable since the shared edits directory may in fact be BK 
or some other journal plugin in the future, whereas (a) would break the 
abstraction.

If anyone has any strong opinions please shout now :)





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-12-20 Thread Aaron T. Myers (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173903#comment-13173903
 ] 

Aaron T. Myers commented on HDFS-2291:
--

I support option (b), not only for the reason stated above. Option (b) also 
implicitly solves the problem of what to do about fsimages in the standby, as 
well as just seeming overall safer. I'm leery of any plan which involves the 
standby temporarily writing to the shared edits dir.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-11-27 Thread Eli Collins (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13158060#comment-13158060
 ] 

Eli Collins commented on HDFS-2291:
---

Agree that the SBN should be able to do checkpoints - someone running a typical 
20x configuration with two hosts (NN and 2NN) should be able to keep the same 
hardware config (NN and SBN).





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-31 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094863#comment-13094863
 ] 

Todd Lipcon commented on HDFS-2291:
---

Ravi: the docs are right -- the 2NN needs as much memory as the NN. But the 
same is true of the SBN. But it's the same memory - a copy of the namespace, 
etc.

So, I agree that the SBN should be able to do checkpoints. We just need to 
implement a checkpoint abort functionality. I will look into this.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-29 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093223#comment-13093223
 ] 

Suresh Srinivas commented on HDFS-2291:
---

My preference is to do checkpointing in the standby. If the standby is in the 
middle of checkpointing, it should abandon checkpointing and become active.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-29 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093253#comment-13093253
 ] 

Ravi Prakash commented on HDFS-2291:


Will the standby+checkpointing node have to have twice the memory? I thought 
the main reason for running a secondary namenode on a different machine was 
because checkpointing needed just as much memory as the namenode needed to 
maintain metadata.





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-29 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093285#comment-13093285
 ] 

Suresh Srinivas commented on HDFS-2291:
---

bq. Will the standby+checkpointing node have to have twice the memory?
No

bq. I thought the main reason for running a secondary namenode on a different 
machine was because checkpointing needed just as much memory as the namenode 
needed to maintain metadata.
The reason we do not do it in the primary is that checkpointing blocks 
operations. 





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-29 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093287#comment-13093287
 ] 

Ravi Prakash commented on HDFS-2291:


@Suresh: Thanks! Blame not the dev for what he read in docs outdated 
http://hadoop.apache.org/common/docs/current/hdfs_user_guide.html#Secondary+NameNode
 :)
bq. It is usually run on a different machine than the primary NameNode since 
its memory requirements are on the same order as the primary NameNode





[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

2011-08-27 Thread Uma Maheswara Rao G (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092225#comment-13092225
 ] 

Uma Maheswara Rao G commented on HDFS-2291:
---

{quote}
One thought is to use a third, dedicated checkpointing node in addition to the 
active and standby nodes.
{quote}
Introducing new nodes may add overhead when setting up clusters. We should 
always look to reduce the complexity of cluster setup.

{quote}
Another option would be to make the standby capable of also performing the 
function of checkpointing.
{quote}
IMO, the standby can do the checkpointing job.

What do you say, Aaron?
