Related (though it doesn't help with the immediate question): China Telecom 
developed something they call HyperDFS.  They modified Hadoop to make it 
possible to run a cluster of NNs, thus eliminating the SPOF.

I don't have the details; the presenter at Hadoop World (last round of 
sessions, 2nd floor) mentioned it, but didn't give a clear answer when asked 
about contributing it back.

 Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



----- Original Message ----
> From: Steve Loughran <ste...@apache.org>
> To: common-user@hadoop.apache.org
> Sent: Friday, October 2, 2009 7:22:45 AM
> Subject: Re: NameNode high availability
> 
> Stas Oskin wrote:
> > Hi.
> > 
> > The HA service (heartbeat) is running on Dom0, and when the primary
> > node is down, it basically just starts the VM on the other node. So
> > there are not supposed to be any time issues.
> > 
> > Can you explain a bit more about your approach, how to automate it for
> > example?
> 
> * You need to have something ("a resource manager") keeping an eye on the NN
> from somewhere. Needless to say, that needs to be fairly HA too.
> 
> * your NN image has to be ready to go
> 
> * when the deployed NN goes away, bring up a new machine with the same image,
> hostname *and IP Address*. You can't always pull the latter off; it depends on
> the infrastructure. Without that, you'd need to bring up all the nodes with DNS
> caching set to a short time and update a DNS entry.
> 
> This isn't real HA, it's recovery.
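
The recovery steps quoted above can be sketched as a small watchdog. This is only an illustration, not anything from Steve's setup or from Hadoop itself: the names namenode_reachable and watch, the thresholds, and the recover callback are all assumptions, and the actual "bring up a new machine with the same image" step depends entirely on your infrastructure.

```python
import socket
import time
from typing import Callable, Optional


def namenode_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A crude liveness probe: it only shows that something accepts connections
    on the NN's port, not that the NameNode is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(probe: Callable[[], bool],
          recover: Callable[[], None],
          max_failures: int = 3,
          interval: float = 5.0,
          max_checks: Optional[int] = None,
          sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll probe(); after max_failures consecutive misses, call recover().

    recover() is a placeholder for whatever boots the standby NN image with
    the same hostname and IP (or updates DNS). Returns how many times
    recover() was invoked. max_checks=None means run forever.
    """
    failures = 0
    recoveries = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe():
            failures = 0          # NN answered, reset the counter
        else:
            failures += 1
            if failures >= max_failures:
                recover()         # hypothetical hook: start the replacement NN
                recoveries += 1
                failures = 0      # give the new instance time to come up
        sleep(interval)
    return recoveries
```

In a real deployment the probe might be something like lambda: namenode_reachable("nn.example.com", 8020), where both the hostname and the port are placeholders for your own NN address. As noted in the quoted message, the watchdog itself has to run somewhere fairly HA, otherwise it just moves the SPOF.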
