There is a "secondary" NameNode which performs periodic checkpoints:
http://wiki.apache.org/hadoop/FAQ?highlight=(secondary)#7

Are there any instructions out there on how to copy the FS image and edits
log from the secondary NameNode to a new machine when the original NameNode
fails?

Bill

On Fri, Nov 14, 2008 at 12:50 PM, Alex Loddengaard <[EMAIL PROTECTED]> wrote:

> HDFS does have a single point of failure, and there is no way around this
> in its current implementation. The NameNode keeps track of an FS image and
> an edits log. It's common for these to be stored both on the local disk
> and on an NFS mount. When the NameNode fails, a new machine can be
> provisioned as the NameNode by loading the backed-up image and edits
> files.
>
> Can you say more about how you'll use HDFS? It's not a low-latency file
> system, so it shouldn't be used to serve images, videos, etc. in a web
> environment. Its most common use is as the basis of batch Map/Reduce
> jobs.
>
> Alex
>
> On Thu, Nov 13, 2008 at 5:18 PM, S. L. <[EMAIL PROTECTED]> wrote:
>
> > Hi list,
> >
> > I am kind of new to Hadoop but have some good background. I am
> > seriously considering adopting Hadoop, and especially HDFS, first to be
> > able to store various files (in the low hundreds of thousands at first)
> > on a few nodes in a manner where I don't need a RAID system or a SAN.
> > HDFS seems a perfect fit for the job...
> >
> > BUT
> >
> > From what I learned in the past couple of days, it seems that the
> > single point of failure in HDFS is the NameNode. So I was wondering,
> > for anyone on the list who has deployed HDFS in a production
> > environment, what is their strategy for High Availability of the
> > system... Having the NameNode unavailable basically brings the whole
> > HDFS system offline. So what are the scripts or other techniques
> > recommended to add H.A. to HDFS?
> >
> > Thanks!
> >
> > -- S.
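The backup-and-restore approach Alex describes can be sketched roughly as follows. This is a hedged sketch for the Hadoop-0.18-era configuration layout (where these properties live in hadoop-site.xml); the hostnames (`nfs-host`) and directory paths are illustrative assumptions, not details from the thread:

```shell
# Sketch of the NFS-backed NameNode metadata setup described above.
# All hostnames and paths are illustrative assumptions.

# In hadoop-site.xml on the NameNode, dfs.name.dir can be a
# comma-separated list of directories; the NameNode writes its FS image
# and edits log to all of them, so pointing one entry at an NFS mount
# keeps an off-machine copy:
#
#   <property>
#     <name>dfs.name.dir</name>
#     <value>/hadoop/dfs/name,/mnt/nfs/dfs/name</value>
#   </property>

# If the NameNode machine dies, provision a replacement and copy the
# backed-up metadata into its local dfs.name.dir...
scp -r admin@nfs-host:/mnt/nfs/dfs/name/* /hadoop/dfs/name/

# ...point fs.default.name at the new machine (or reuse the old
# hostname/IP), then start the NameNode, which replays the edits log
# against the FS image on startup:
bin/hadoop-daemon.sh start namenode
```

The secondary NameNode's checkpoints (under its `fs.checkpoint.dir`) can serve the same role if the NFS copy is unavailable, at the cost of losing any edits made since the last checkpoint.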
