[
https://issues.apache.org/jira/browse/MAPREDUCE-1222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783962#action_12783962
]
Dick King commented on MAPREDUCE-1222:
--------------------------------------
After I wrote my comment of 24/Nov/09 07:59 PM, I looked at the Java API
because I wondered whether unescaping the input and using the Java API could be
made to work by itself. I did look for alternatives before I created my big
regular expression.
The big problem is that Java doesn't really offer any API that distinguishes
numeric IP addresses from symbolic addresses. Although
InetAddress.getByName(String) must have some means of parsing IPv4 and IPv6
numeric literals, this functionality is not exposed to java.net.* users.
InetAddress.getByName(String) will parse either a numeric address or a
symbolic name and produce indistinguishable results, so that piece of the API
gives us no means to tell the two apart. I was unable to find any other API
that makes the distinction.
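As a minimal illustration of the point above (a sketch, not code from any of
the attached patches; the class and method names are hypothetical), the
following resolves a string with InetAddress.getByName(String) and reports the
concrete class and the address text of the result. A numeric literal and a
symbolic name that resolves to the same address yield the same class and the
same address.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical demo class; not part of Rumen or Mumak.
class LiteralVsName {
    // Resolves the input and reports the concrete class and address text
    // of the resulting InetAddress.
    static String describe(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            return addr.getClass().getSimpleName() + " " + addr.getHostAddress();
        } catch (UnknownHostException e) {
            return "unresolved";
        }
    }

    public static void main(String[] args) {
        // A numeric literal is parsed without a name-service lookup.
        System.out.println(describe("127.0.0.1")); // prints "Inet4Address 127.0.0.1"
        // A symbolic name goes through the resolver; on a typical host the
        // result has the same class and address as the literal above.
        System.out.println(describe("localhost"));
    }
}
```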
The formats of numeric literal IPv4 and IPv6 addresses are fixed in RFCs and
are extremely unlikely to change in the foreseeable future, so hard-coding
them does not create a future-proofing problem. The only exposure we have is a
possible future IPv8, but ICANN is doing its best to make that unnecessary for
a very long time.
Since Apache already owns this regular expression, we should consider using
it.
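The regular expression from the patch is not reproduced here. Purely to
illustrate what matching a fixed literal format looks like, here is a sketch
for the much simpler IPv4 dotted-decimal case. The pattern and the names
NumericIpv4 and isDottedQuad are assumptions made for this example, not the
expression proposed for MAPREDUCE-1222.

```java
import java.util.regex.Pattern;

// Hypothetical helper; illustrates the IPv4 case only.
class NumericIpv4 {
    // One octet: 0-255, written without a leading zero.
    private static final String OCTET =
        "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

    // Exactly four octets separated by dots.
    private static final Pattern DOTTED_QUAD =
        Pattern.compile(OCTET + "(\\." + OCTET + "){3}");

    // True only if the whole string is a dotted-decimal IPv4 literal.
    static boolean isDottedQuad(String s) {
        return DOTTED_QUAD.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isDottedQuad("10.0.0.1"));           // true
        System.out.println(isDottedQuad("256.1.1.1"));          // false
        System.out.println(isDottedQuad("node17.example.com")); // false
    }
}
```

Because the check is purely syntactic, it never consults a name service, which
is what makes it usable for telling literals apart from host names.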
I considered the simpler approach of treating any address that contains a
colon character as a numeric IPv6 address, but colons are also used as other
punctuation, e.g., as the separator between an IP address and a port number.
That solution felt too brittle and accident-prone to me, and it doesn't solve
the IPv8 problem either. There is a continuum of IPv6 solutions ranging from
"look for a colon" to the correct regular expression you see here, and no
principled way to decide where to stop.
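To make the brittleness concrete, here is a sketch of the "look for a colon"
end of that continuum. The class name, method name, and sample host name are
hypothetical. It accepts a genuine IPv6 literal, but it also accepts a
host:port string, which is the kind of accident described above.

```java
// Hypothetical helper showing the naive heuristic; not proposed for use.
class ColonHeuristic {
    // Naive test: any string containing a colon is taken to be IPv6.
    static boolean looksLikeIpv6(String location) {
        return location.indexOf(':') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIpv6("::1"));                      // true
        System.out.println(looksLikeIpv6("node17.example.com"));       // false
        // False positive: a symbolic host name followed by a port number.
        System.out.println(looksLikeIpv6("node17.example.com:50010")); // true
    }
}
```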
> [Mumak] We should not include nodes with numeric ips in cluster topology.
> -------------------------------------------------------------------------
>
> Key: MAPREDUCE-1222
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1222
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/mumak
> Affects Versions: 0.21.0, 0.22.0
> Reporter: Hong Tang
> Assignee: Hong Tang
> Fix For: 0.21.0, 0.22.0
>
> Attachments: IPv6-predicate.patch, mapreduce-1222-20091119.patch,
> mapreduce-1222-20091121.patch
>
>
> Rumen infers cluster topology by parsing input split locations from job
> history logs. Due to HDFS-778, a cluster node may appear either as a numeric IP
> or as a host name in job history logs. We should exclude nodes that appear as
> numeric IPs from the cluster topology when we run mumak, until a solution is
> found so that numeric IPs never appear in input split locations.