I've been wondering about DRBD. When I looked at it many (5+?) years ago, it required too much low-level tinkering and hardware I did not have. What does it take to set it up now, and were there any Hadoop-specific things you needed to do? Overall, are you happy with DRBD? (You are limited to 2 nodes, right?)
Thanks,
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: paul <[EMAIL PROTECTED]>
> To: core-user@hadoop.apache.org
> Sent: Tuesday, July 29, 2008 2:56:44 PM
> Subject: Re: Multiple master nodes
>
> I'm currently running with your option B setup and it seems to be reliable
> for me (so far). I use a combination of drbd and various heartbeat/LinuxHA
> scripts that handle the failover process, including a virtual IP for the
> namenode. I haven't had any real-world unexpected failures to deal with,
> yet, but all manual testing has had consistent and reliable results.
>
> -paul
>
> On Tue, Jul 29, 2008 at 1:54 PM, Ryan Shih wrote:
>
> > Dear Hadoop Community --
> >
> > I am wondering if it is already possible or in the plans to add capability
> > for multiple master nodes. I'm in a situation where I have a master node
> > that may potentially be in a less than ideal execution and networking
> > environment. For this reason, it's possible that the master node could die
> > at any time. On the other hand, the application must always be available. I
> > have accessible to me other machines but I'm still unclear on the best
> > method to add reliability.
> >
> > Here are a few options that I'm exploring:
> > a) To create a completely secondary Hadoop cluster that we can flip to when
> > we detect that the master node has died. This will double hardware costs, so
> > if we originally have a 5 node cluster, then we would need to pull 5 more
> > machines out of somewhere for this decision. This is not the preferable
> > choice.
> > b) Just mirror the master node via other always available software, such as
> > DRBD for real time synchronization. Upon detection we could swap to the
> > alternate node.
> > c) Or if Hadoop had some functionality already in place, it would be
> > fantastic to be able to take advantage of that. I don't know if anything
> > like this is available but I could not find anything as of yet. It seems to
> > me, however, that having multiple master nodes would be the direction Hadoop
> > needs to go if it is to be useful in high availability applications. I was
> > told there are some papers on Amazon's Elastic Computing that I'm about to
> > look for that follow this approach.
> >
> > In any case, could someone with experience in solving this type of problem
> > share how they approached this issue?
> >
> > Thanks!