Stas,

There is no HA solution currently for Hadoop.
You can do things like Cloudera describes.
Their solution works with 2 real name-nodes.
No Backup node involved.

As for Backup node, I don't really understand Todd's comment
but the fact is that Backup node (BN) is not a standby
node. The failover procedure is not implemented for BN,
so neither clients nor data-nodes fail over anywhere
when the main name-node (NN) dies; they don't have a clue.

The purpose of the BN is
1) to keep an up-to-date image of the namespace in memory.
This does not include block locations.
BN does not know where file blocks are.
2) to make periodic checkpoints, like SecondaryNameNode did,
but more efficiently, since BN does not need to load image
and edits from NN, its namespace is already up-to-date.
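
To make the two points above concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the class ToyBackupNode, its methods, the transaction ids and the edit record format are all made up for illustration. It only shows the idea that a node which applies each edit as it arrives can write a checkpoint straight from memory, and that the namespace it holds maps files to block IDs without saying which data-nodes store those blocks.

```python
# Toy model only: every name here is hypothetical, none of it is a Hadoop API.
from dataclasses import dataclass, field


@dataclass
class ToyBackupNode:
    # path -> list of block IDs; there is deliberately no block -> data-node map
    namespace: dict = field(default_factory=dict)
    last_applied_txid: int = 0

    def apply_edit(self, txid, op, path, blocks=()):
        """Apply one edit record streamed from the active name-node."""
        expected = self.last_applied_txid + 1
        if txid != expected:
            raise ValueError(f"gap in edit stream: expected {expected}, got {txid}")
        if op == "create":
            self.namespace[path] = list(blocks)
        elif op == "delete":
            self.namespace.pop(path, None)
        else:
            raise ValueError(f"unknown op: {op}")
        self.last_applied_txid = txid

    def checkpoint(self):
        """Snapshot the in-memory namespace; nothing is fetched from the NN."""
        return {
            "txid": self.last_applied_txid,
            "image": {path: list(blocks) for path, blocks in self.namespace.items()},
        }


bn = ToyBackupNode()
bn.apply_edit(1, "create", "/a", blocks=[101, 102])
bn.apply_edit(2, "create", "/b", blocks=[103])
bn.apply_edit(3, "delete", "/a")
image = bn.checkpoint()
```

In this sketch the checkpoint is just a copy of state the node already has, which is the efficiency gain over a SecondaryNameNode-style checkpoint that first has to load the image and edits.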

There is a provision to turn the BN into a real standby node,
with failover, but it has not been implemented yet.

Hope this clarifies things.

Thanks,
--Konstantin


Todd Lipcon wrote:
On Wed, Aug 12, 2009 at 3:42 AM, Stas Oskin <stas.os...@gmail.com> wrote:

Hi.


You can also use a utility like Linux-HA (aka heartbeat) to handle IP
address failover. It will even send gratuitous ARPs to make sure to get
the new MAC address registered after a failover. Check out this blog for
info about a setup like this:

http://www.cloudera.com/blog/2009/07/22/hadoop-ha-configuration/

Hope that helps

Thanks, exactly what I looked for :).

I presume that with the coming Backup node, there won't be a need for DRBD,
am I correct?


I haven't followed that development closely, but I believe that's the case.
The BackupNode will stream the FSEditLog writes as they occur while
replaying them into its own FSNamesystem. Then during a failover a real
NameNode starts on that FSNamesystem "ready to go". As for how the
BackupNode keeps track of block locations, I'm not sure - is there a
replication stream between BlockManagers too? Or is the cluster in a broken
state until all of the DNs have processed new block reports?
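
A rough Python sketch of the second scenario in the question above, where a freshly started name-node has the namespace but must learn block locations from data-node block reports. This is only a toy under stated assumptions: the function names and the shape of a "block report" (data-node name mapped to the block IDs it holds) are invented for illustration and are not Hadoop's actual structures.

```python
# Toy model only: function names and data shapes are hypothetical.
from collections import defaultdict


def rebuild_block_locations(block_reports):
    """Build block ID -> set of data-nodes from per-data-node block reports."""
    locations = defaultdict(set)
    for datanode, block_ids in block_reports.items():
        for block_id in block_ids:
            locations[block_id].add(datanode)
    return dict(locations)


def unlocated_blocks(namespace, locations):
    """Blocks the namespace refers to that no data-node has reported yet."""
    return sorted(
        block_id
        for blocks in namespace.values()
        for block_id in blocks
        if block_id not in locations
    )


namespace = {"/b": [103], "/c": [104, 105]}

# Only one data-node has reported so far, so block 105 cannot be served.
reports = {"dn1": [103, 104]}
locations = rebuild_block_locations(reports)
missing = unlocated_blocks(namespace, locations)

# After the second data-node reports, every block has a known location.
reports["dn2"] = [104, 105]
locations = rebuild_block_locations(reports)
still_missing = unlocated_blocks(namespace, locations)
```

Until every referenced block has at least one reported location, reads of the affected files cannot be served in this model, which is the "broken state" the question is about.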

-Todd
