[ https://issues.apache.org/jira/browse/HBASE-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194482#comment-13194482 ]
Phabricator commented on HBASE-5259:
------------------------------------

Kannan has accepted the revision "[jira][HBASE-5259] Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.".

  Liyin -- looks good to me. One minor suggestion inlined.

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java:171 Logging here is unnecessary because of the logging in line 191. The "split" (TableSplit's toString() method) will already print the regionLocation along with the start/stop keys for each map task.

REVISION DETAIL
  https://reviews.facebook.net/D1413

> Normalize the RegionLocation in TableInputFormat by the reverse DNS lookup.
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-5259
>                 URL: https://issues.apache.org/jira/browse/HBASE-5259
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>         Attachments: D1413.1.patch, D1413.1.patch, D1413.1.patch,
> D1413.1.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch, D1413.2.patch,
> D1413.3.patch, D1413.3.patch, D1413.3.patch, D1413.3.patch
>
>
> Assuming HBase and MapReduce run in the same cluster, the
> TableInputFormat overrides the split function, which divides all the
> regions of one particular table into a series of mapper tasks, so each
> mapper task can process a region or one part of a region. Ideally, the mapper
> task should run on the same machine on which the region server hosts the
> corresponding region. That is why the TableInputFormat sets
> the RegionLocation: so that the MapReduce framework can respect node
> locality.
> The code simply sets the host name of the region server as the
> HRegionLocation. However, the host name of the region server may have
> a different format from the host name of the task tracker (mapper task). The
> task tracker always gets its host name by reverse DNS lookup, and the DNS
> service may return a different host name format.
> For example, the host name of
> the region server is correctly set as a.b.c.d while the reverse DNS lookup
> may return a.b.c.d. (with an additional dot at the end).
> So the solution is to set the RegionLocation by the reverse DNS lookup as
> well. No matter what host name format the DNS system uses, the
> TableInputFormat is responsible for keeping its host name
> format consistent with the MapReduce framework.
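
The normalization described in the issue could be sketched roughly as follows. This is a minimal illustration, not the code from D1413: the class name RegionLocationNormalizer, its methods, and the injectable resolver are all hypothetical. The default resolver relies on the JDK's InetAddress.getCanonicalHostName(), which is a best-effort reverse lookup and may not behave exactly like the lookup the task tracker performs.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper for illustration only; these names do not come from the patch.
final class RegionLocationNormalizer {
    // Cache so that each region server host is resolved at most once.
    private final Map<String, String> cache = new HashMap<>();
    // Maps a region server host name to the name that reverse DNS reports for it.
    private final Function<String, String> reverseLookup;

    RegionLocationNormalizer(Function<String, String> reverseLookup) {
        this.reverseLookup = reverseLookup;
    }

    // Assumed default resolver: resolve the name to an address, then ask the
    // JDK for the canonical name of that address (a best-effort reverse lookup).
    static String jdkReverseLookup(String host) {
        try {
            return InetAddress.getByName(host).getCanonicalHostName();
        } catch (UnknownHostException e) {
            return host; // Fall back to the name the region server reported.
        }
    }

    // Returns the host name in whatever format the resolver uses, so that it
    // matches the format the MapReduce framework sees for its own nodes.
    String normalize(String regionServerHost) {
        return cache.computeIfAbsent(regionServerHost, reverseLookup);
    }
}
```

Under these assumptions, the split function would pass normalize(regionServerHost) rather than the raw region server host name when it builds each TableSplit, for example with new RegionLocationNormalizer(RegionLocationNormalizer::jdkReverseLookup). Injecting the resolver keeps the sketch testable without a live DNS server.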