On Mon, Aug 24, 2009 at 3:40 AM, Sugandha Naolekar<sugandha....@gmail.com> wrote:
> Hello!
>
> **************************************************************************************
> Below is the Python script I have written:
>
> #!/usr/bin/env python
>
> '''
> This script is used by Hadoop to determine network/rack topology. It
> should be specified in hadoop-site.xml via the topology.script.file.name
> property.
>
> <property>
>   <name>topology.script.file.name</name>
>   <value>/home/hadoop/topology.py</value>
> </property>
> '''
>
> import sys
> from string import join
>
> DEFAULT_RACK = '/default/rack0'
>
> RACK_MAP = {
>     '10.20.220.35' : '/jobsec/rack1',
>     '10.20.220.78' : '/jobsec/rack1',
>     '10.20.220.71' : '/jobsec/rack1',
>
>     '10.20.220.74' : '/jobsec/rack2',
> }
>
> if len(sys.argv) == 1:
>     print DEFAULT_RACK
> else:
>     print join([RACK_MAP.get(i, DEFAULT_RACK) for i in sys.argv[1:]], " ")
>
> ***************************************************************************************
> I have to configure my 6-node cluster, in which 4 are the datanodes and the
> other 2 play the NN, JT, and secondary NN roles. The machine playing JT is
> also the secondary NN. Now, I want to configure 3 DNs in rack1 and the 4th
> DN in rack2.
> **************************************************************************************
>
> Also, the corresponding properties are set in the site XML file as:
>
> <property>
>   <name>topology.node.switch.mapping.impl</name>
>   <value>org.apache.hadoop.net.ScriptBasedMapping</value>
>   <description> The default implementation of the DNSToSwitchMapping. It
>   invokes a script specified in topology.script.file.name to resolve
>   node names. If the value for topology.script.file.name is not set, the
>   default value of DEFAULT_RACK is returned for all node names.
>   </description>
> </property>
>
> <property>
>   <name>topology.script.file.name</name>
>   <value>/home/hadoop/Softwares/hadoop-0.19.0/conf/test.py</value>
>   <description> The script name that should be invoked to resolve DNS names
>   to NetworkTopology names. Example: the script would take host.foo.bar as
>   an argument, and return /rack1 as the output.
>   </description>
> </property>
>
> <property>
>   <name>topology.script.number.args</name>
>   <value>10</value>
>   <description> The max number of args that the script configured with
>   topology.script.file.name should be run with. Each arg is an
>   IP address.
>   </description>
> </property>
>
> Here, test.py is the Python script file name.
>
> *****************************************************
>
> Now, the moment I start the namenode daemon, does this file get invoked
> automatically? What do I do to achieve my purpose? Are the above things
> (files, scripts, and tags) correct? How do I check the rack status of the
> machines? Is there a special command? Please do help me out!
>
> --
> Regards!
> Sugandha
>
You will know you have done this correctly if you look at the logs of the namenode and jobtracker: they log messages like adding node 'X' rack 'Y'. Also, running 'hadoop fsck /' shows you the number of racks and nodes. Is there anything that produces a pretty layout? I do not think so.
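It is also worth sanity-checking the script by hand before restarting the namenode, since Hadoop invokes it with a batch of IPs/hostnames as arguments and expects one rack path per argument, space-separated, on stdout. As a minimal sketch (a Python 3 rewrite of the script quoted above; the IPs and rack names are copied from the original mapping, and the `resolve` helper is just for illustration):

```python
#!/usr/bin/env python
# Sketch of the rack-topology script in Python 3 syntax.
# Hadoop runs it with one or more IPs/hostnames as arguments and reads
# back the corresponding rack paths, space-separated, on stdout.
import sys

DEFAULT_RACK = '/default/rack0'

RACK_MAP = {
    '10.20.220.35': '/jobsec/rack1',
    '10.20.220.78': '/jobsec/rack1',
    '10.20.220.71': '/jobsec/rack1',
    '10.20.220.74': '/jobsec/rack2',
}

def resolve(args):
    # No arguments: fall back to the default rack.
    if not args:
        return DEFAULT_RACK
    # Unknown hosts also map to the default rack.
    return ' '.join(RACK_MAP.get(ip, DEFAULT_RACK) for ip in args)

if __name__ == '__main__':
    print(resolve(sys.argv[1:]))
```

Running it by hand, e.g. `./test.py 10.20.220.35 10.20.220.74`, should print `/jobsec/rack1 /jobsec/rack2`; if the output does not look like that, the namenode will not place the nodes in the racks you intended.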