On 3/19/10 4:32 AM, "Mag Gam" <magaw...@gmail.com> wrote:
> Thanks everyone. I think everyone can agree that this part of the
> documentation is lacking for hadoop.
>
> Can someone please provide me a use case, for example:
>
> #server 1
> Input > script.sh
> Output > rack01
>
> #server 2
> Input > script.sh
> Output > rack02
I think you have it in your head that the NameNode asks the DataNode what
rack it is in. This is completely backwards. The DataNode has *no* concept of
what a rack is. It is purely a storage process. There isn't much logic in
it at all.
The topology script is *ONLY* run by the NameNode and JobTracker processes.
That's it. It is not run on the compute nodes. The topology script setting is
completely *ignored* by the DataNode and TaskTracker processes.
So to rewrite your use case:
# NameNode
Input > server 1
Output > rack01
# NameNode
Input > server 2
Output > rack02
> Is this how it's supposed to work? I am bad with bash so I am trying to
> understand the logic so I can implement it with another language such
> as tcl
The program logic is:
Input -> IP address or Hostname
Output -> /racklocation
That's it. There is nothing fancy going on here.
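
Since you don't want to write it in bash, here is a rough sketch of that
logic in Python. Treat it as an illustration, not the official way to do
it. The host names, IP addresses and the RACK_BY_HOST table are made up
for this example, and the "/default-rack" fallback for unknown hosts is an
assumption on my part. As far as I know, Hadoop may pass several hosts in
one invocation, so the sketch prints one rack per argument, in the same
order.

```python
#!/usr/bin/env python3
"""Sketch of a topology script: map each host argument to a rack path."""
import sys

# Hypothetical mapping for illustration only; substitute your own
# hostnames / IP addresses and rack names.
RACK_BY_HOST = {
    "server1": "/rack01",
    "server2": "/rack02",
    "10.0.0.1": "/rack01",
    "10.0.0.2": "/rack02",
}

# Assumed fallback for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve_rack(host):
    """Return the rack path for a single IP address or hostname."""
    return RACK_BY_HOST.get(host, DEFAULT_RACK)


def main(argv):
    # Print one rack per argument, space separated, in argument order.
    print(" ".join(resolve_rack(host) for host in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Run by hand with the arguments "server1 server2", this prints
"/rack01 /rack02". The same lookup translates directly to tcl or any
other language, because all the caller cares about is what the program
writes to standard output.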