RE: HDFS Backup nodes
Hi Koji,

This was on CDH3U1. For the record, I also had dfs.name.dir.restore enabled, which Harsh mentioned.

Jorn

-----Original Message-----
From: Koji Noguchi [mailto:knogu...@yahoo-inc.com]
Sent: Wednesday, 7 December 2011 17:59
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

Hi Jorn,

Which Hadoop version were you using when you hit that issue?

Koji

On 12/7/11 5:25 AM, Jorn Argelo - Ephorus jorn.arg...@ephorus.com wrote:

Just to add to that note: we've run into an issue where the NFS share was out of sync (the namenode had marked that storage directory as failed even though the NFS share itself was working), while the other, local metadata was fine. When the namenode was restarted, it picked the NFS share's fsimage even though it was out of sync. As a result, loads of blocks were marked as invalid and deleted by the datanodes, and the namenode never came out of safe mode because it was missing blocks. The Hadoop documentation says the namenode always picks the most recent version of the fsimage, but in my case that doesn't seem to have happened. Maybe a bug? That said, I've had issues with NFS before (the NFS namenode storage failed every hour, even when the cluster was idle).

Since this was just test data it wasn't all that important, but if that happened on your production cluster you'd have a real problem. I've moved away from NFS and I'm using DRBD instead. I haven't had any problems since. YMMV.

Jorn

-----Original Message-----
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Wednesday, 7 December 2011 12:08
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of its storage directories. That will give you the most up-to-date backup of the metadata in case of total node failure.

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar praveen...@gmail.com wrote:

This means we are still relying on the Secondary NameNode approach for the Namenode's backup. Would OS-level mirroring of the Namenode be a good alternative to keep it alive all the time?

Thanks,
Praveenesh

On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G mahesw...@huawei.com wrote:

AFAIK the backup node is only available from version 0.21 onwards.

From: praveenesh kumar [praveen...@gmail.com]
Sent: Wednesday, December 07, 2011 12:40 PM
To: common-user@hadoop.apache.org
Subject: HDFS Backup nodes

Does Hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh
RE: HDFS Backup nodes
Just to add to that note: we've run into an issue where the NFS share was out of sync (the namenode had marked that storage directory as failed even though the NFS share itself was working), while the other, local metadata was fine. When the namenode was restarted, it picked the NFS share's fsimage even though it was out of sync. As a result, loads of blocks were marked as invalid and deleted by the datanodes, and the namenode never came out of safe mode because it was missing blocks. The Hadoop documentation says the namenode always picks the most recent version of the fsimage, but in my case that doesn't seem to have happened. Maybe a bug? That said, I've had issues with NFS before (the NFS namenode storage failed every hour, even when the cluster was idle).

Since this was just test data it wasn't all that important, but if that happened on your production cluster you'd have a real problem. I've moved away from NFS and I'm using DRBD instead. I haven't had any problems since. YMMV.

Jorn

-----Original Message-----
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Wednesday, 7 December 2011 12:08
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of its storage directories. That will give you the most up-to-date backup of the metadata in case of total node failure.

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar praveen...@gmail.com wrote:

This means we are still relying on the Secondary NameNode approach for the Namenode's backup. Would OS-level mirroring of the Namenode be a good alternative to keep it alive all the time?

Thanks,
Praveenesh

On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G mahesw...@huawei.com wrote:

AFAIK the backup node is only available from version 0.21 onwards.

From: praveenesh kumar [praveen...@gmail.com]
Sent: Wednesday, December 07, 2011 12:40 PM
To: common-user@hadoop.apache.org
Subject: HDFS Backup nodes

Does Hadoop 0.20.205 support configuring HDFS backup nodes?

Thanks,
Praveenesh

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
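
One way to catch the out-of-sync situation described in this thread before restarting a namenode is to compare the checkpoint time recorded in each storage directory. The following is only a rough sketch, not something taken from the thread. It assumes the 0.20-era on-disk layout, in which each storage directory holds a `current/fstime` file containing a single 8-byte big-endian long. The function names are hypothetical, and the layout should be verified against your own Hadoop version before relying on it.

```python
import struct
from pathlib import Path


def read_checkpoint_time(name_dir):
    """Return the checkpoint time stored in <name_dir>/current/fstime, or None."""
    fstime = Path(name_dir) / "current" / "fstime"
    try:
        raw = fstime.read_bytes()
    except OSError:
        return None  # directory or file is missing or unreadable
    if len(raw) < 8:
        return None  # truncated file
    # Assumption: one big-endian signed 64-bit value, as written by Java's writeLong.
    return struct.unpack(">q", raw[:8])[0]


def find_stale_dirs(name_dirs):
    """Return the directories whose checkpoint time is missing or older than the newest."""
    times = {d: read_checkpoint_time(d) for d in name_dirs}
    known = [t for t in times.values() if t is not None]
    if not known:
        return list(name_dirs)  # nothing readable, so treat every directory as suspect
    newest = max(known)
    return [d for d, t in times.items() if t is None or t < newest]
```

Calling `find_stale_dirs` with the local and the NFS-backed storage directories lists the ones that lag behind. Matching checkpoint times do not prove that the edit logs match as well, so treat this as a first check only.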
Running more than one secondary namenode
Hi all,

I was wondering if there are any (technical) issues with running two secondary namenodes on two separate servers rather than just one. Since basically everything stands or falls with a consistent snapshot of the namenode fsimage, I was considering running two secondary namenodes for additional resilience. Has this been done before, or am I being too paranoid? Are there any caveats with doing this?

Thanks,
Jorn
RE: Running more than one secondary namenode
Hi Chris,

I am doing exactly what you described, apart from the regular backups (which are still on the to-do list). Unfortunately my Java knowledge is poor at best, so I'm not sure I would actually understand the Namenode internals. I'm going to give it a try nevertheless! I guess you're right that if we have regular backups of the namenode fsimage and edit logs we're quite safe. Thanks for your feedback.

Jorn

-----Original Message-----
From: Chris Smith [mailto:csmi...@gmail.com]
Sent: Wednesday, 12 October 2011 12:03
To: common-user@hadoop.apache.org
Subject: Re: Running more than one secondary namenode

Jorn,

If you've configured the Name Node fsimage and edit log replication to both NFS and the Secondary Name Node, and you regularly back up the fsimage and edit logs, you would do better to invest time in understanding exactly how the Name Node builds up its internal database and how it applies its edit logs; 'read the code, Luke'.

Then, if you really want to be prepared, you can produce some test scenarios by applying a corruption (one that the Name Node can't handle automatically) to the fsimage or edit logs on a sacrificial system (a VM?) and see if you can recover from it. That way, if you ever get hit by a Name Node corruption, you'll be in a much better position to recover most or all of your data. Even with the best setup it can happen if you hit a 'corner case' scenario.

Chris

On 12 October 2011 08:50, Jorn Argelo - Ephorus jorn.arg...@ephorus.com wrote:

Hi all,

I was wondering if there are any (technical) issues with running two secondary namenodes on two separate servers rather than just one. Since basically everything stands or falls with a consistent snapshot of the namenode fsimage, I was considering running two secondary namenodes for additional resilience. Has this been done before, or am I being too paranoid? Are there any caveats with doing this?

Thanks,
Jorn
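
The regular backup of the fsimage and edit logs that Chris recommends could look roughly like the sketch below. It is an illustration only and was not posted by anyone in the thread. It assumes you copy from a directory that is not being written to at that moment, for example the secondary namenode's checkpoint directory (`fs.checkpoint.dir` in 0.20-era configurations) between checkpoints. The function names and the timestamped directory layout are hypothetical.

```python
import hashlib
import shutil
import time
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_checkpoint(checkpoint_dir, backup_root, stamp=None):
    """Copy every file under checkpoint_dir into backup_root/<stamp>/ and verify each copy."""
    src = Path(checkpoint_dir)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root) / stamp
    copied = []
    for file in sorted(p for p in src.rglob("*") if p.is_file()):
        target = dest / file.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file, target)  # copy2 also preserves modification times
        if sha256_of(file) != sha256_of(target):
            raise OSError(f"checksum mismatch for {file}")
        copied.append(target)
    return dest, copied
```

Run from cron, this keeps one verified copy per run. The same copies can double as the sacrificial data for the corruption tests Chris suggests, since they can be damaged and restored without touching the live metadata.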