Hello experts,

I'm new to HDFS/Hadoop.  After reading the HDFS documentation, I'm confused
about the difference between a namenode and a master server.  My
understanding is that the namenode is responsible for managing metadata,
while the master-replica group (which is made up of a number of datanodes)
stores the actual data blocks.  Within the master-replica group, the master
server accepts read/write requests and load-balances (or routes) read
requests to the appropriate replica.  If so, the namenode and the master
server should be configured on two different physical machines in a
production environment.  Is this a correct assumption?

One other question about HDFS cluster setup:

- requirements:  one namenode, replication factor = 3, in a production
environment.

What would the topology look like?  Can I configure it as follows?


in conf/core-site.xml:
    fs.default.name = hdfs://machineAAA:54321/

in conf/masters:
    machineBBB

in conf/slaves:
    machineCCC
    machineDDD
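
To make the question concrete, here is how I assume these settings would
look in the actual XML files.  The fs.default.name entry is the same one
listed above; the dfs.replication entry in conf/hdfs-site.xml is my guess at
where the replication factor of 3 from the requirements would be set.  I
have not tested either file.

in conf/core-site.xml (assumed form):
    <configuration>
      <!-- URI of the namenode, as listed above -->
      <property>
        <name>fs.default.name</name>
        <value>hdfs://machineAAA:54321/</value>
      </property>
    </configuration>

in conf/hdfs-site.xml (assumed, not part of my current setup):
    <configuration>
      <!-- replication factor from the requirements -->
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
    </configuration>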


Can someone please confirm and/or comment?

Sorry for my newbie questions. Thanks for the help.

Chao
