HDFS does have a single point of failure, and there is no way around this in
its current implementation.  The namenode keeps track of an FS image and an
edits log.  It's common to store both of these on the local disk and on
an NFS mount.  If the namenode fails, a new machine can be provisioned as
the namenode by loading the backed-up image and edits files.
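For what it's worth, that redundancy is usually set up by listing more than one
directory in dfs.name.dir; the namenode writes its image and edits to every
path listed.  A rough sketch (the paths here are just placeholders, not a
recommendation):

```xml
<!-- hadoop-site.xml (sketch): with a comma-separated dfs.name.dir, the
     namenode keeps a full copy of the FS image and edits log in each
     directory, so an NFS mount gives you an off-machine backup. -->
<property>
  <name>dfs.name.dir</name>
  <value>/var/hadoop/dfs/name,/mnt/nfs/hadoop/dfs/name</value>
</property>
```

If the namenode box dies, you'd point a fresh machine's dfs.name.dir at the
NFS copy (or copy it over) and start the namenode there.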
Can you say more about how you'll use HDFS?  It's not a low-latency file
system, so it shouldn't be used to serve images, videos, etc. in a web
environment.  Its most common use is as the basis for batch Map/Reduce
jobs.

Alex

On Thu, Nov 13, 2008 at 5:18 PM, S. L. <[EMAIL PROTECTED]> wrote:

> Hi list
> I am kind of new to Hadoop but have some good background. I am seriously
> considering adopting Hadoop, and especially HDFS, first to be able to store
> various files (in the low hundreds of thousands at first) on a few nodes in
> a manner where I don't need a RAID system or a SAN. HDFS seems a perfect fit
> for the job...
>
> BUT
>
> from what I learned in the past couple of days, it seems that the single
> point of failure in HDFS is the NameNode. So I was wondering if anyone on
> the list who has deployed HDFS in a production environment could share their
> strategy for High Availability of the system... Having the NameNode
> unavailable basically brings the whole HDFS system offline. So what scripts
> or other techniques are recommended to add H.A. to HDFS?
>
> Thanks!
>
> -- S.
>
