NFS is certainly problematic. So what if the machine where the secondary
namenode process runs were itself used as a backup for the edits log,
kept up to date with some synchronisation tool? Then we would have a
backup in case the primary namenode goes down, and the namenode could be
restarted there on the secondary machine.
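For what it's worth, the namenode can already be pointed at more than one
storage directory: dfs.name.dir takes a comma-separated list, and the
namenode writes the fsimage and edits log to every listed directory on each
transaction. A sketch of the relevant site configuration entry
(hadoop-site.xml on older releases, hdfs-site.xml on newer ones); the mount
path /mnt/backup-nn is just a placeholder for whatever replicated storage
you trust:

```xml
<!-- Redundant namenode metadata directories. The namenode writes
     the fsimage and edits log to each directory in this list, so
     losing one copy does not lose the filesystem metadata. -->
<property>
  <name>dfs.name.dir</name>
  <!-- /mnt/backup-nn is a placeholder: any remote or replicated
       mount you trust more than a plain NFS export -->
  <value>/data/dfs/name,/mnt/backup-nn/name</value>
</property>
```

Note this only gives you a second copy of the metadata; it does not give
you automatic failover of the namenode process itself.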


Steve Loughran wrote:
> 
> Himanshu Sharma wrote:
>> The NFS seems to be having problems, as NFS locking causes namenode
>> hangups. Can't there be any other way? Say, if the namenode wrote
>> synchronously to the secondary namenode as well as to its local
>> directories, then on a namenode failover we could start the primary
>> namenode process on the secondary namenode, where the latest
>> checkpointed fsimage would already be present.
> 
> NFS shouldn't be used in production datacentres, at least not as the 
> main way that the nodes talk to a common filesystem.
> 
> That doesn't mean it doesn't get used that way, but when the network 
> plays up, all 1000+ servers suddenly halt on file IO with their logs 
> filling up with NFS warnings. The problem here is that the OS assumes 
> that file IO is local and fast, and NFS is trying "transparently" to 
> recover by blocking for a while, so bringing your apps to a halt. It is 
> way better to have the failures visible at the app level and let it 
> apply whatever policy you want, which is exactly what the DFS clients do 
> when talking to name- or datanodes.
> 
> 
> say no to NFS.
> 
> Alternatives
> 
> * Some HA databases have two servers sharing access to the same disk 
> array at the physical layer, so when the 1ary node goes down, the 
> secondary can take over. But that assumes that it is never the raid-5 
> disk array itself that fails. If something very bad happens to the 
> RAID controller, that assumption may prove to be false.
> 
> * SAN storage arrays to route RAID-backed storage to specific nodes in 
> the cluster. Again, you are hoping that nothing goes wrong behind the 
> scenes.
> 
> * Product placement warning: HP extreme storage with CPUs in the rack
> http://h71028.www7.hp.com/enterprise/cache/592778-0-0-0-121.html
> 
> I haven't tried bringing up hadoop on one of these -but it would be 
> interesting to see how well it works. Maybe Apache could start having an 
> "approved by hadoop" sticker with a yellow elephant on it to attach to 
> hardware that is known to work.
> 
>> This also raises a fundamental question: can we run the secondary
>> namenode process on the same node as the primary namenode process
>> without any out-of-memory / heap exceptions? Also, ideally, what should
>> the memory size of the primary namenode be when running alone, and when
>> running alongside the secondary namenode process?
> 
> 
> What failures are you planning to deal with? Running the secondary node 
> process on the same machine means that you could cope with a process 
> failure, but not machine failure or network outage. You'd also need the 
> 2ary process listening on a second port, so clients would still need to 
> do some kind of handover.
> 
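To make the checkpoint-based recovery concrete: the secondary namenode's
checkpoint location and interval are configurable, and recent releases let
you start a namenode with -importCheckpoint to load the latest checkpoint
from fs.checkpoint.dir onto a fresh machine. A sketch of the relevant site
configuration (paths here are placeholders, not recommendations):

```xml
<!-- Site configuration on the secondary namenode host. -->
<property>
  <name>fs.checkpoint.dir</name>
  <!-- placeholder path: where the 2ary stores its checkpoints -->
  <value>/data/dfs/namesecondary</value>
</property>
<property>
  <name>fs.checkpoint.period</name>
  <!-- seconds between checkpoints; 3600 = hourly -->
  <value>3600</value>
</property>
```

With that in place, recovery on the secondary machine is roughly: point
dfs.name.dir at an empty directory and run `bin/hadoop namenode
-importCheckpoint`, which imports the most recent checkpoint. You still
lose any edits made after that checkpoint, and clients still need some
handover mechanism to find the new namenode.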

-- 
View this message in context: 
http://www.nabble.com/Re%3A-NameNode-failover-procedure-tp18740218p18770460.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.
