[ https://issues.apache.org/jira/browse/HADOOP-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12566806#action_12566806 ]
dhruba borthakur commented on HADOOP-1985:
------------------------------------------
I like Sanjay's proposal. If the namenode receives a block report before the
datanode's hostname has been resolved, the NN sets a bit in the
DatanodeDescriptor indicating that the namenode should request a new block
report later. The NN then drops the contents of this block report on the floor
but returns success to the block-report RPC.
Finally, once the name of the datanode is resolved, if this new bit is set,
the NN sends a request-for-block-report as part of its response to the next
heartbeat from that datanode.
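
The handshake described above could be sketched roughly as follows. This is
only an illustration of the control flow, not code from any patch on this
issue: the class DeferredBlockReportSketch, the NodeState holder (a stand-in
for DatanodeDescriptor), the Command enum and all method names are
hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from the Hadoop source.
class DeferredBlockReportSketch {

    /** Stand-in for the per-datanode state kept by the NN (not the real DatanodeDescriptor). */
    static final class NodeState {
        boolean resolved;          // hostname has been mapped to a network location
        boolean needsBlockReport;  // the "new bit" described in the comment
    }

    /** Commands the NN could piggyback on a heartbeat response (hypothetical). */
    enum Command { NONE, SEND_BLOCK_REPORT }

    private final Map<String, NodeState> nodes = new HashMap<String, NodeState>();

    private NodeState state(String host) {
        NodeState s = nodes.get(host);
        if (s == null) {
            s = new NodeState();
            nodes.put(host, s);
        }
        return s;
    }

    /**
     * Block-report RPC. Returns true if the report was processed and false if
     * it was dropped; in both cases the RPC itself completes successfully.
     */
    boolean blockReport(String host, long[] blockIds) {
        NodeState s = state(host);
        if (!s.resolved) {
            s.needsBlockReport = true;  // remember to ask again later
            return false;               // contents are discarded
        }
        // A real namenode would update its block map here.
        return true;
    }

    /** Called when hostname resolution for this datanode completes. */
    void markResolved(String host) {
        state(host).resolved = true;
    }

    /** Heartbeat RPC: asks for a fresh block report once, after resolution. */
    Command heartbeat(String host) {
        NodeState s = state(host);
        if (s.resolved && s.needsBlockReport) {
            s.needsBlockReport = false;
            return Command.SEND_BLOCK_REPORT;
        }
        return Command.NONE;
    }
}
```

Clearing the bit when the request is sent keeps the NN from asking for the
report on every subsequent heartbeat.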
> Abstract node to switch mapping into a topology service class used by
> namenode and jobtracker
> ---------------------------------------------------------------------------------------------
>
> Key: HADOOP-1985
> URL: https://issues.apache.org/jira/browse/HADOOP-1985
> Project: Hadoop Core
> Issue Type: New Feature
> Components: dfs, mapred
> Reporter: eric baldeschwieler
> Assignee: Devaraj Das
> Fix For: 0.17.0
>
> Attachments: 1985.new.patch, 1985.v1.patch, 1985.v10.patch,
> 1985.v11.patch, 1985.v2.patch, 1985.v3.patch, 1985.v4.patch, 1985.v5.patch,
> 1985.v6.patch, 1985.v9.patch, jobinprogress.patch
>
>
> In order to implement switch locality in MapReduce, we need to have switch
> location in both the namenode and job tracker. Currently the namenode asks
> the data nodes for this info and they run a local script to answer this
> question. In our environment and others that I know of there is no reason to
> push this to each node. It is easier to maintain a centralized script that
> maps node DNS names to switch strings.
> I propose that we build a new class that caches known DNS name to switch
> mappings and invokes a loadable class or a configurable system call to
> resolve unknown DNS to switch mappings. We can then add this to the namenode
> to support the current block to switch mapping needs and simplify the data
> nodes. We can also add this same callout to the job tracker and then
> implement rack locality logic there without needing to change the filesystem
> API or the split planning API.
> Not only is this the least intrusive path to building rack-local MR that I
> can identify, it is also compatible with future infrastructures that may
> derive topology on the fly.
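
For illustration, the caching class proposed in the description might look
something like the sketch below. It is a minimal sketch under assumptions,
not the API of any attached patch: CachedSwitchMappingSketch, its nested
Resolver interface and the switchFor method are hypothetical names, and the
Resolver stands in for either a loadable class or a wrapper around a
configured script.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: names are illustrative, not taken from the patches.
class CachedSwitchMappingSketch {

    /** Pluggable lookup: a loadable class or a wrapper around an external script. */
    interface Resolver {
        /** Returns the switch string for a DNS name, or null if unknown. */
        String resolve(String dnsName);
    }

    private final Resolver resolver;
    private final Map<String, String> cache = new HashMap<String, String>();

    CachedSwitchMappingSketch(Resolver resolver) {
        this.resolver = resolver;
    }

    /** Returns the cached mapping, invoking the resolver only on a cache miss. */
    synchronized String switchFor(String dnsName) {
        String sw = cache.get(dnsName);
        if (sw == null) {
            sw = resolver.resolve(dnsName);
            if (sw != null) {
                cache.put(dnsName, sw);  // failed lookups are not cached
            }
        }
        return sw;
    }
}
```

Both the namenode and the job tracker could hold an instance of such a class,
so the datanodes would no longer need to run a local script.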