Re: HDFS Backup nodes
On 12/13/11 11:00 PM, M. C. Srivas mcsri...@gmail.com wrote:

> Suresh,
>
> As of today, there is no option except to use NFS. And as you yourself
> mention, the first HA prototype when it comes out will require NFS.

How will it 'require' NFS? Won't any 'remote, high availability storage' work?

NFS is, in my experience, unreliable unless:

* It's a Netapp
* It's based on Solaris

(Caveat: I have only used 5 NFS solution types over the last decade, and the issues are not data integrity, but rather availability from a client perspective.)

A solution with a brief 'stall' in service while a SAN mount switches over, or something similar with DRBD, should be possible and data safe. If this is being built to truly 'require' NFS, that is no better for me than the current situation, which we manage using OS-level tools for failover that will temporarily break clients but resume availability quickly thereafter.

Where I would like the most help from Hadoop is in making the failover transparent to clients, not in solving the reliable storage problem or the failover scenarios that storage and OS vendors already handle.
Re: HDFS Backup nodes
On Wed, Dec 14, 2011 at 10:09 AM, Scott Carey wrote:

> On 12/13/11 11:28 PM, Konstantin Boudnik c...@apache.org wrote:
>
>> On Tue, Dec 13, 2011 at 11:00 PM, M. C. Srivas wrote:
>>> Suresh,
>>> As of today, there is no option except to use NFS. And as you yourself
>>> mention, the first HA prototype when it comes out will require NFS.
>>
>> NFS just happens to be readily available in any data center and doesn't
>> require much extra investment on top of what already exists.
>
> That is a false assumption. I'm not buying a netapp filer just for this.
> We have no NFS, nor want any. If we ever use it, it won't be in the data
> center with Hadoop!

It isn't a false assumption; it is a reasonable one, based on experience. You don't need a Netapp for NFS - you can have a Thumper or whatever. I am not saying NFS is the only option or the best one; all I said is that it is pretty common ;) I would opt for a BK or Jini Spaces-like solution any day, though.

Cos
Re: HDFS Backup nodes
On Wed, Dec 14, 2011 at 10:00 AM, Scott Carey sc...@richrelevance.com wrote:

>> As of today, there is no option except to use NFS. And as you yourself
>> mention, the first HA prototype when it comes out will require NFS.
>
> How will it 'require' NFS? Won't any 'remote, high availability storage'
> work? NFS is, in my experience, unreliable unless: ...
>
> A solution with a brief 'stall' in service while a SAN mount switches
> over, or something similar with DRBD, should be possible and data safe.
> If this is being built to truly 'require' NFS, that is no better for me
> than the current situation, which we manage using OS-level tools for
> failover that will temporarily break clients but resume availability
> quickly thereafter. Where I would like the most help from Hadoop is in
> making the failover transparent to clients, not in solving the reliable
> storage problem or failover scenarios that storage and OS vendors do.

Currently our requirement is that we can have two client machines mount the storage, though only one needs to have it mounted read-write at a time. This is certainly doable with DRBD in conjunction with a clustered filesystem like GFS2. I believe Dhruba was doing some experimentation with an approach like this.

It's not currently provided for, but it wouldn't be very difficult to extend the design so that the standby didn't even need read access until the failover event. It would just cause a longer failover period, since the standby would have more edits to catch up with, etc. I don't think anyone's currently working on this, but if you wanted to contribute I can point you in the right direction. If you happen to be at the SF HUG tonight, grab me and I'll give you the rundown on what would be needed.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
Re: HDFS Backup nodes
Srivas,

As you may know already, NFS is just being used in the first prototype for HA. Two options for the editlog store are:

1. Using BookKeeper. Work has already been completed on trunk towards this. This will replace the need for NFS to store the editlogs and is highly available. This solution will also be used for HA.
2. We also have a short-term goal to enable editlogs going to HDFS itself. The work is in progress.

Regards,
Suresh

-- Forwarded message --
From: M. C. Srivas mcsri...@gmail.com
Date: Sun, Dec 11, 2011 at 10:47 PM
Subject: Re: HDFS Backup nodes
To: common-user@hadoop.apache.org

You are out of luck if you don't want to use NFS, and yet want redundancy for the NN. Even the new NN HA work being done by the community will require NFS ... and the NFS itself needs to be HA.

But if you use a Netapp, then the likelihood of the Netapp crashing is lower than the likelihood of a garbage-collection-of-death happening in the NN.

[ disclaimer: I don't work for Netapp, I work for MapR ]
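To make option 1 above concrete, here is a minimal client-level sketch of appending edit records to a BookKeeper ledger. The ZooKeeper quorum string, digest password, and record bytes are placeholders, and the real HDFS integration wraps far more logic (segment management, fencing, recovery) around these calls.

    import org.apache.bookkeeper.client.BookKeeper;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class EditLogToBookKeeperSketch {
        public static void main(String[] args) throws Exception {
            // Connect to the ZooKeeper ensemble coordinating the bookies
            // (placeholder quorum string).
            BookKeeper bk = new BookKeeper("zk1:2181,zk2:2181,zk3:2181");

            // A ledger's entries are replicated across bookies and
            // checksummed, which is what makes the editlog store
            // highly available.
            LedgerHandle ledger = bk.createLedger(
                    BookKeeper.DigestType.MAC, "secret".getBytes());

            // One serialized NN edit record per entry; addEntry returns
            // only after a quorum of bookies has made the entry durable.
            ledger.addEntry("OP_MKDIR /user/example".getBytes());

            ledger.close();
            bk.close();
        }
    }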
Re: HDFS Backup nodes
On Sun, Dec 11, 2011 at 10:47 PM, M. C. Srivas mcsri...@gmail.com wrote:

> But if you use a Netapp, then the likelihood of the Netapp crashing is
> lower than the likelihood of a garbage-collection-of-death happening in
> the NN.

This is pure FUD. I've never seen a garbage collection of death in any NN with a heap smaller than 40GB, and only a small handful of times on larger heaps. So, unless you're running a 4000-node cluster, you shouldn't be concerned with this. And the existence of many 4000-node clusters running fine on HDFS indicates that a properly tuned NN does just fine.

[Disclaimer: I don't spread FUD regardless of vendor affiliation.]

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
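For context, "properly tuned" at this time usually meant running the NN with the CMS collector instead of the default throughput collector. A hedged illustration of what that might look like in hadoop-env.sh; the heap size and the exact flag set are assumptions for the example, not settings taken from this thread:

    # hadoop-env.sh -- illustrative NN JVM settings; heap size and flags
    # are example values, not recommendations from this thread
    export HADOOP_NAMENODE_OPTS="-Xms8g -Xmx8g \
      -XX:+UseConcMarkSweepGC \
      -XX:+CMSParallelRemarkEnabled \
      -XX:CMSInitiatingOccupancyFraction=70 \
      -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
      $HADOOP_NAMENODE_OPTS"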
Re: HDFS Backup nodes
Suresh,

As of today, there is no option except to use NFS. And as you yourself mention, the first HA prototype when it comes out will require NFS.

(a) I wasn't aware that BookKeeper had progressed that far. I wonder whether it would be able to keep up with the data rates that are required in order to hold the NN log without falling behind.

(b) I do know Karthik Ranga at FB just started a design to put the NN data in HDFS itself, but that is in very preliminary design stages with no real code there.

The problem is that the HA code written with NFS in mind is very different from the HA code written with HDFS in mind, and both are quite different from the code written with BookKeeper in mind. Essentially the three options will form three different implementations, since the failure modes of each of the back-ends are different.

Am I totally off base?

thanks, Srivas.

On Tue, Dec 13, 2011 at 11:00 AM, Suresh Srinivas sur...@hortonworks.com wrote:

> Srivas,
>
> As you may know already, NFS is just being used in the first prototype
> for HA. Two options for the editlog store are:
> 1. Using BookKeeper. Work has already been completed on trunk towards
> this. This will replace the need for NFS to store the editlogs and is
> highly available. This solution will also be used for HA.
> 2. We also have a short-term goal to enable editlogs going to HDFS
> itself. The work is in progress.
>
> Regards,
> Suresh
Re: HDFS Backup nodes
On Tue, Dec 13, 2011 at 10:42 PM, M. C. Srivas mcsri...@gmail.com wrote:

> Any simple file meta-data test will cause the NN to spiral to death with
> infinite GC. For example, try creating many, many files. Or even simply
> stat a bunch of files continuously.

Sure. If I run dd if=/dev/zero of=foo, my laptop will spiral to death also. I think this is what you're referring to -- continuously write files until it is out of RAM. This is a well-understood design choice of HDFS. It is not designed as general-purpose storage for small files, and if you run tests against it assuming it is, you'll get bad results. I agree there.

> The real FUD going on is refusing to acknowledge that there is indeed a
> real problem.

Yes, if you use HDFS for workloads for which it was never designed, you'll have a problem. If you stick to commonly accepted best practices, I think you'll find the same thing that hundreds of other companies have found: HDFS is stable and reliable and has no such GC-of-death problems when used as intended.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
Re: HDFS Backup nodes
On Tue, Dec 13, 2011 at 11:00 PM, M. C. Srivas wrote:

> Suresh,
>
> As of today, there is no option except to use NFS. And as you yourself
> mention, the first HA prototype when it comes out will require NFS.

Well, in the interest of full disclosure, NFS is just one of the options, not the only one. Any auxiliary storage will do. Distributed in-memory redundant storage for sub-second failover? Sure, GigaSpaces has been doing this for years using the very mature Jini. NFS just happens to be readily available in any data center and doesn't require much extra investment on top of what already exists.

NFS comes with its own set of problems, of course. First and foremost is No-File-Security, which requires the use of something like Kerberos for third-party user management. And when paired with something like LinuxTaskController it can produce some very interesting effects.

Cos
Re: HDFS Backup nodes
On Tue, Dec 13, 2011 at 11:00 PM, M. C. Srivas mcsri...@gmail.com wrote:

> (a) I wasn't aware that BookKeeper had progressed that far. I wonder
> whether it would be able to keep up with the data rates that are
> required in order to hold the NN log without falling behind.

It's a good question - but one for which data is readily available. Reading from Flavio Junqueira's slides from the Hadoop in China conference a few weeks ago, he can maintain ~50k TPS with 20ms latency using 128-byte transactions. Given that HDFS does batch multiple transactions per commit (standard group-commit techniques), we might imagine 4KB transactions, where his numbers show about 5k TPS, equating to around 20MB/sec of throughput. These transaction rates should be plenty for the edit-logging use case, in my experience.

> (b) I do know Karthik Ranga at FB just started a design to put the NN
> data in HDFS itself, but that is in very preliminary design stages with
> no real code there.

Agreed. But it's not particularly complex either... things can move from preliminary design to working code on short timelines.

> The problem is that the HA code written with NFS in mind is very
> different from the HA code written with HDFS in mind, which are both
> quite different from the code that is written with BookKeeper in mind.
> Essentially the three options will form three different implementations,
> since the failure modes of each of the back-ends are different.
>
> Am I totally off base?

Actually, since the beginning of the HA project we have been keeping in mind that NFS is only a step along the way. The shared edits storage only has to support the following very basic operations:
- write and append to files (log segments)
- read from closed files
- fence another writer (which can also be implemented with STONITH)

As I understand it, BK supports all of the above, and in fact the BK team has a working prototype of journal storage in BK. The interface was already made pluggable as of last month. So this is not far-off brainstorming but rather a very real implementation that's coming very soon to stable releases.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
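To make that surface area concrete, here is a minimal sketch of what such a pluggable journal interface might look like. The names below are hypothetical - the actual interface made pluggable in HDFS is more involved - but it mirrors the three operations Todd lists:

    import java.io.Closeable;
    import java.io.IOException;

    // Hypothetical sketch of a pluggable edit-journal back-end (NFS dir,
    // HDFS, or BookKeeper); the real HDFS interface differs in detail.
    interface EditJournal extends Closeable {

        // Open a new log segment starting at the given transaction id and
        // return a sink that appends serialized edit records to it.
        EditSegmentWriter startLogSegment(long firstTxId) throws IOException;

        // Read back a closed (finalized) segment, e.g. for the standby
        // to replay during catch-up.
        EditSegmentReader openClosedSegment(long firstTxId, long lastTxId)
                throws IOException;

        // Fence any other writer so a failed-over active NN cannot keep
        // writing (could alternatively be done externally via STONITH).
        void fenceOtherWriters() throws IOException;
    }

    interface EditSegmentWriter extends Closeable {
        void append(byte[] editRecord) throws IOException;
        void flushAndSync() throws IOException;   // durable before ack
    }

    interface EditSegmentReader extends Closeable {
        byte[] nextRecord() throws IOException;   // null at end of segment
    }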
Re: HDFS Backup nodes
You are out of luck if you don't want to use NFS, and yet want redundancy for the NN. Even the new NN HA work being done by the community will require NFS ... and the NFS itself needs to be HA.

But if you use a Netapp, then the likelihood of the Netapp crashing is lower than the likelihood of a garbage-collection-of-death happening in the NN.

[ disclaimer: I don't work for Netapp, I work for MapR ]

On Wed, Dec 7, 2011 at 4:30 PM, randy randy...@comcast.net wrote:

> Thanks Joey. We've had enough problems with nfs (mainly under very high
> load) that we thought it might be riskier to use it for the NN.
RE: HDFS Backup nodes
Hi Koji,

This was on CDH3u1. For the record, I also had the dfs.name.dir.restore option which Harsh mentioned enabled.

Jorn

-Original Message-
From: Koji Noguchi [mailto:knogu...@yahoo-inc.com]
Sent: Wednesday, December 7, 2011 17:59
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

Hi Jorn,

Which hadoop version were you using when you hit that issue?

Koji
RE: HDFS Backup nodes
AFAIK the backup node was introduced from the 0.21 version onwards.

From: praveenesh kumar [praveen...@gmail.com]
Sent: Wednesday, December 07, 2011 12:40 PM
To: common-user@hadoop.apache.org
Subject: HDFS Backup nodes

Does hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh
Re: HDFS Backup nodes
This means we are still relying on the Secondary NameNode ideology for the Namenode's backup. Is OS-mirroring of the Namenode a good alternative to keep it alive all the time?

Thanks,
Praveenesh

On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G mahesw...@huawei.com wrote:

> AFAIK the backup node was introduced from the 0.21 version onwards.
RE: HDFS Backup nodes
Yes ... if you are looking for high uptime, then keeping the Namenode OS-mirror always running would be the best way to go. We might need to explore further the capabilities of the HDFS backup node to see how it can be utilized.

Thanks,
Sagar

-Original Message-
From: praveenesh kumar [mailto:praveen...@gmail.com]
Sent: Wednesday, December 07, 2011 1:47 PM
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

This means we are still relying on the Secondary NameNode ideology for the Namenode's backup. Is OS-mirroring of the Namenode a good alternative to keep it alive all the time?

Thanks,
Praveenesh
Re: HDFS Backup nodes
You should also configure the Namenode to use an NFS mount for one of its storage directories. That will give the most up-to-date backup of the metadata in case of total node failure.

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar praveen...@gmail.com wrote:

> This means we are still relying on the Secondary NameNode ideology for
> the Namenode's backup. Is OS-mirroring of the Namenode a good
> alternative to keep it alive all the time?

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
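Concretely, dfs.name.dir takes a comma-separated list of directories, and the NN writes its image and edits to every one of them. A minimal hdfs-site.xml sketch of what Joey describes, with placeholder paths:

    <!-- hdfs-site.xml: one local directory plus one NFS-mounted
         directory; paths are illustrative placeholders -->
    <property>
      <name>dfs.name.dir</name>
      <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
    </property>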
RE: HDFS Backup nodes
Just to add to that note - we've run into an issue where the NFS share was out of sync (the namenode storage failed even though the NFS share was working), but the other, local metadata was fine. At the restart of the namenode, it picked the NFS share's fsimage even though it was out of sync. This had the effect that loads of blocks were marked as invalid and deleted by the datanodes, and the namenode never came out of safe mode because it was missing blocks. The Hadoop documentation says it always picks the most recent version of the fsimage, but in my case this doesn't seem to have happened. Maybe a bug?

With that said, I've had issues with NFS before (the NFS namenode storage always failed every hour, even when the cluster was idle). Since this was just test data it wasn't all that important ... but if that happened on your production cluster, you'd have a problem. I've moved away from NFS and I'm using DRBD instead. Not having any problems anymore whatsoever. YMMV.

Jorn

-Original Message-
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Wednesday, December 7, 2011 12:08
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of its storage directories. That will give the most up-to-date backup of the metadata in case of total node failure.

-Joey
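For illustration, a hedged sketch of the DRBD setup Jorn alludes to: a synchronously replicated block device backing the NN metadata directory, so a standby host holds a byte-identical copy without NFS in the path. Hostnames, devices, and addresses are placeholders:

    # /etc/drbd.d/nn-meta.res -- illustrative only; hosts, disks, and
    # addresses are placeholders
    resource nn-meta {
      protocol C;                  # synchronous replication: a write is
                                   # acked only once both nodes have it
      on nn-primary {
        device    /dev/drbd0;
        disk      /dev/sdb1;       # backing device holding dfs.name.dir
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on nn-standby {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }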
Re: HDFS Backup nodes
What happens then if the nfs server fails or isn't reachable? Does hdfs lock up? Does it gracefully ignore the nfs copy?

Thanks,
randy

- Original Message -
From: Joey Echeverria j...@cloudera.com
To: common-user@hadoop.apache.org
Sent: Wednesday, December 7, 2011 6:07:58 AM
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of its storage directories. That will give the most up-to-date backup of the metadata in case of total node failure.

-Joey
Re: HDFS Backup nodes
Hey Rand,

It will mark that storage directory as failed and ignore it from then on. In order for this to work correctly, you need a couple of options enabled on the NFS mount to make sure that it doesn't retry infinitely. I usually run with the tcp,soft,intr,timeo=10,retrans=10 options set.

-Joey

On Wed, Dec 7, 2011 at 12:37 PM, randy...@comcast.net wrote:

> What happens then if the nfs server fails or isn't reachable? Does hdfs
> lock up? Does it gracefully ignore the nfs copy?

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
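A hedged example of mounting with those options (the server name and paths are placeholders). The soft option lets a stuck NFS write fail back to the NN, which then marks the directory as failed, instead of hanging the process; timeo is expressed in tenths of a second:

    # illustrative mount; server and paths are placeholders
    mount -t nfs -o tcp,soft,intr,timeo=10,retrans=10 \
      filer:/export/nn-meta /mnt/nn-meta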
Re: HDFS Backup nodes
Thanks Joey. We've had enough problems with nfs (mainly under very high load) that we thought it might be riskier to use it for the NN.

randy

On 12/07/2011 06:46 PM, Joey Echeverria wrote:

> Hey Rand,
>
> It will mark that storage directory as failed and ignore it from then
> on. In order for this to work correctly, you need a couple of options
> enabled on the NFS mount to make sure that it doesn't retry infinitely.
> I usually run with the tcp,soft,intr,timeo=10,retrans=10 options set.
>
> -Joey
Re: HDFS Backup nodes
Randy,

On recent releases (CDH3u2 here, for example), you also have dfs.name.dir.restore, a boolean flag that will automatically try to re-enable previously failed name directories upon every checkpoint, if possible.

Hence if you have a SNN running, and your NFS failed at some point and got marked as FAILED on your NN web UI, and the NFS is back up again before the next checkpoint interval, it will be auto-restored once the NN deems it is in a writable state again.

On Thu, Dec 8, 2011 at 6:00 AM, randy randy...@comcast.net wrote:

> Thanks Joey. We've had enough problems with nfs (mainly under very high
> load) that we thought it might be riskier to use it for the NN.

--
Harsh J
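A minimal hdfs-site.xml sketch enabling the flag Harsh describes, using the CDH3-era property name he gives:

    <!-- hdfs-site.xml: retry previously failed name dirs at each
         checkpoint -->
    <property>
      <name>dfs.name.dir.restore</name>
      <value>true</value>
    </property>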
HDFS Backup nodes
Does hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh