RE: HDFS Backup nodes

2011-12-08 Thread Jorn Argelo - Ephorus
Hi Koji,

This was on CDH3u1. For the record, I also had the dfs.name.dir.restore
option which Harsh mentioned enabled.
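For reference, that option is just a boolean in hdfs-site.xml, roughly
like the sketch below (a minimal illustration, not copied from my actual
config):

  <property>
    <name>dfs.name.dir.restore</name>
    <value>true</value>
    <description>Try to bring previously failed dfs.name.dir storage
    directories back into service at checkpoint time.</description>
  </property>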

Jorn

-Original Message-
From: Koji Noguchi [mailto:knogu...@yahoo-inc.com]
Sent: Wednesday, December 7, 2011 17:59
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

Hi Jorn, 

Which hadoop version were you using when you hit that issue?

Koji



RE: HDFS Backup nodes

2011-12-07 Thread Jorn Argelo - Ephorus
Just to add to that note - we've run into an issue where the NFS share
was out of sync (the namenode storage failed even though the NFS share
was working), but the other local metadata was fine. When the namenode
was restarted it picked the NFS share's fsimage even though it was out
of sync. As a result, loads of blocks were marked as invalid and deleted
by the datanodes, and the namenode never came out of safe mode because
it was missing blocks. The Hadoop documentation says it always picks the
most recent version of the fsimage, but in my case that doesn't seem to
have happened. Maybe a bug? With that said, I've had issues with NFS
before (the NFS namenode storage failed every hour even when the cluster
was idle).

Now since this was just test data it wasn't all that important ... but
if that were to happen on your production cluster you would have a
problem. I've moved away from NFS and I'm using DRBD instead, and I
haven't had any problems since.

YMMV.

Jorn

-Original Message-
From: Joey Echeverria [mailto:j...@cloudera.com]
Sent: Wednesday, December 7, 2011 12:08
To: common-user@hadoop.apache.org
Subject: Re: HDFS Backup nodes

You should also configure the Namenode to use an NFS mount for one of
its storage directories. That will give you the most up-to-date backup
of the metadata in case of total node failure.
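For example, a sketch of what that could look like in hdfs-site.xml (the
paths are made-up placeholders):

  <property>
    <name>dfs.name.dir</name>
    <!-- example paths: one local disk plus one NFS mount -->
    <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
    <description>Comma-separated list of directories; the NameNode
    writes its fsimage and edit log to every directory listed, so the
    NFS entry gives an off-node copy of the metadata.</description>
  </property>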

-Joey

On Wed, Dec 7, 2011 at 3:17 AM, praveenesh kumar praveen...@gmail.com
wrote:
 This means we are still relying on the Secondary NameNode approach for
 the Namenode's backup.
 Is OS-mirroring of the Namenode a good alternative to keep it alive all
 the time?

 Thanks,
 Praveenesh

 On Wed, Dec 7, 2011 at 1:35 PM, Uma Maheswara Rao G
mahesw...@huawei.com wrote:

 AFAIK the backup node is available from version 0.21 onwards.
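 On 0.21+, if I remember correctly, the backup node is started with
 'hdfs namenode -backup' and its addresses are configured roughly along
 these lines (the hostname below is made up):

  <property>
    <name>dfs.namenode.backup.address</name>
    <value>backupnode.example.com:50100</value>
  </property>
  <property>
    <name>dfs.namenode.backup.http-address</name>
    <value>backupnode.example.com:50105</value>
  </property>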
 
 From: praveenesh kumar [praveen...@gmail.com]
 Sent: Wednesday, December 07, 2011 12:40 PM
 To: common-user@hadoop.apache.org
 Subject: HDFS Backup nodes

 Does hadoop 0.20.205 support configuring HDFS backup nodes?

 Thanks,
 Praveenesh




-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434


Running more than one secondary namenode

2011-10-12 Thread Jorn Argelo - Ephorus
Hi all,

 

I was wondering if there are any (technical) issues with running two
secondary namenodes on two separate servers rather than running just
one. Since basically everything stands or falls with a consistent
snapshot of the namenode fsimage, I was considering running two
secondary namenodes for additional resilience. Has this been done before
or am I being too paranoid? Are there any caveats with doing this?
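For reference, each secondary would get its own checkpoint settings,
something like the sketch below (paths and values are illustrative only,
and IIRC the hosts listed in conf/masters are where start-dfs.sh
launches the secondary namenodes):

  <property>
    <name>fs.checkpoint.dir</name>
    <value>/data/1/dfs/snn</value>   <!-- example path -->
  </property>
  <property>
    <name>fs.checkpoint.period</name>
    <value>3600</value>   <!-- seconds between checkpoints; the default -->
  </property>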

 

Thanks,

 

Jorn



RE: Running more than one secondary namenode

2011-10-12 Thread Jorn Argelo - Ephorus
Hi Chris,

I am doing exactly what you described there, apart from the regular
backup part (which is still on the todo list). Unfortunately my Java
knowledge is poor at best, so I'm not sure if I would actually
understand the Namenode internals. I'm going to give it a try
nevertheless!

I guess you're quite right that if we have regular backups of the
namenode fsimage and edit logs we're quite safe.

Thanks for your feedback.

Jorn

-Original Message-
From: Chris Smith [mailto:csmi...@gmail.com]
Sent: Wednesday, October 12, 2011 12:03
To: common-user@hadoop.apache.org
Subject: Re: Running more than one secondary namenode

Jorn,

If you've configured the Name Node to replicate the fsimage and edit
log to both NFS and the Secondary Name Node, and you regularly back up
the fsimage and edit logs, you would do better investing time in
understanding exactly how the Name Node builds up its internal database
and how it applies its edit logs; 'read the code, Luke'.

Then, if you really want to be prepared, you can produce some test
scenarios by applying a corruption (one that the Name Node can't handle
automatically) to the fsimage or edit logs on a sacrificial system (VM?)
and see if you can recover from it. That way, if you ever get hit with a
Name Node corruption you'll be in a much better place to recover
most/all of your data.

Even with the best setup it can happen if you hit a 'corner case'
scenario.

Chris
