[ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554686 ]
Devaraj Das commented on HADOOP-1985:
-------------------------------------

I ran the Scan benchmark attached in the patch (the benchmark just scans inputs; no sort/shuffle/reduce). The input data was generated on a cluster of ~300 machines. Randomwriter was run with the following config: 40 maps per host, each map configured to generate 1 GB, DFS block size 256 MB. The input data set was thus around 11.6 TB.

Another cluster of ~900 nodes, with its DFS pointing to the earlier 300-node cluster, was used to run the Scan benchmark. The number of maps was equal to the number of DFS blocks in the input. The two clusters had common racks but no common nodes.

With the rack-aware patch, the scan took 25 minutes (with 90% rack-local tasks); without the patch, it took around 35 minutes, a ~30% improvement.

> Abstract node to switch mapping into a topology service class used by
> namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1985
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1985
>             Project: Hadoop
>          Issue Type: New Feature
>            Reporter: eric baldeschwieler
>            Assignee: Devaraj Das
>             Fix For: 0.16.0
>
>         Attachments: 1985.new.patch, 1985.v1.patch, 1985.v2.patch,
> 1985.v3.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch
> location in both the namenode and job tracker. Currently the namenode asks
> the data nodes for this info and they run a local script to answer this
> question. In our environment and others that I know of there is no reason to
> push this to each node. It is easier to maintain a centralized script that
> maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch
> mappings and invokes a loadable class or a configurable system call to
> resolve unknown DNS to switch mappings.
> We can then add this to the namenode
> to support the current block to switch mapping needs and simplify the data
> nodes. We can also add this same callout to the job tracker and then
> implement rack locality logic there without needing to change the filesystem
> API or the split planning API.
> Not only is this the least intrusive path to building rack-local MR I can
> identify, it is also forward compatible with future infrastructures that may
> derive topology on the fly, etc.
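
The caching class proposed in the description can be sketched roughly as follows. This is only an illustration of the idea, not the code in the attached patches: the names `SwitchResolver` and `CachedSwitchMapping` are hypothetical, and the pluggable resolver stands in for the "loadable class or configurable system call" mentioned in the description.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical plug-in point: resolves DNS names to switch (rack) strings. */
interface SwitchResolver {
    /** Returns one switch string per input name, in the same order. */
    List<String> resolve(List<String> dnsNames);
}

/** Hypothetical cache in front of a SwitchResolver (assumed names, not the patch's API). */
class CachedSwitchMapping {
    private final SwitchResolver resolver;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachedSwitchMapping(SwitchResolver resolver) {
        this.resolver = resolver;
    }

    /** Resolves names, calling the underlying resolver only for names not yet cached. */
    List<String> resolve(List<String> dnsNames) {
        // Collect the distinct names that have no cached mapping yet.
        List<String> unknown = new ArrayList<>();
        for (String name : dnsNames) {
            if (!cache.containsKey(name) && !unknown.contains(name)) {
                unknown.add(name);
            }
        }
        // One batched callout for all unknown names, then remember the answers.
        if (!unknown.isEmpty()) {
            List<String> switches = resolver.resolve(unknown);
            for (int i = 0; i < unknown.size(); i++) {
                cache.put(unknown.get(i), switches.get(i));
            }
        }
        // Answer from the cache, preserving the caller's order.
        List<String> result = new ArrayList<>();
        for (String name : dnsNames) {
            result.add(cache.get(name));
        }
        return result;
    }
}
```

In this sketch the namenode and the job tracker would each hold a `CachedSwitchMapping` backed by the same centrally maintained resolver, so the per-datanode script is no longer needed. The sketch assumes the resolver never returns null entries; a real implementation would also need a default switch for unresolvable names.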