[ https://issues.apache.org/jira/browse/HADOOP-7418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065002#comment-13065002 ]
Aaron T. Myers commented on HADOOP-7418:
----------------------------------------

Hey Andrew, I think the regex needs to be changed. In particular, I don't think it will actually cover the multiple backslash case, since the double backslash in your regex is just string-escaping one backslash, which is then regex-escaping the "+" character. If you want to include a literal backslash in the regex, you need to use 4 backslashes in the Java string literal. (Silly, I know.)

Furthermore, I think that doing the replace in two stages (first forward slashes, then backslashes) won't cover the case where forward slashes are separated by backslashes (e.g. "/foo/\/bar"). To cover that case, you have two options:

# Replace backslashes first, before forward slashes. The backslash replacement could even be a 1-for-1 replacement, leaving you with a bunch of consecutive forward slashes, which then get replaced by a single forward slash in the next regex.
# Use something like this regex: "{{.replaceAll("(/|\\\\)+", "/")}}", which replaces multiple consecutive "/" or "\" with a single "/".

It would also be worthwhile to add test cases to cover these cases.

> support for multiple slashes in the path separator
> --------------------------------------------------
>
>                 Key: HADOOP-7418
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7418
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.23.0
>         Environment: Linux running JDK 1.6
>            Reporter: Sudharsan Sampath
>            Assignee: Andrew Look
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.23.0
>
>         Attachments: HADOOP-7418.txt, HADOOP-7418.txt, HDFS-1460.txt,
> HDFS-1460.txt
>
>
> the parsing of the input path string to identify the uri authority conflicts
> with the file system paths. For instance the following is a valid path in
> both the linux file system and the hdfs.
> //user/directory1//directory2.
> While this works perfectly fine in the command line for manipulating hdfs,
> the same fails when specified as the input path for a mapper class with the
> following exception.
> Exception in thread "main" java.net.UnknownHostException: unknown host: user
> at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:195)
> as the org.apache.hadoop.fs.Path class assumes the string that follows the
> '//' to be an uri authority
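
To make the escaping point in Aaron's comment concrete, here is a minimal Java sketch. The class name SlashNormalizer and both helper methods are hypothetical, made up for illustration; this is not the code from the attached patch or from org.apache.hadoop.fs.Path. It only shows how many backslashes a Java string literal needs for the regex to see a literal backslash, and that both suggested options handle the "/foo/\/bar" case.

```java
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only; not part of Hadoop.
final class SlashNormalizer {

    // The Java literal "(/|\\\\)+" is the regex (/|\\)+ :
    // one or more characters, each either "/" or a literal "\".
    private static final Pattern SEPARATORS = Pattern.compile("(/|\\\\)+");

    // Option 2 from the comment: collapse any run of "/" or "\" into one "/".
    static String collapseSeparators(String path) {
        return SEPARATORS.matcher(path).replaceAll("/");
    }

    // Option 1 from the comment: turn each "\" into "/" first (1-for-1),
    // then collapse runs of "/" into a single "/".
    static String collapseTwoStage(String path) {
        return path.replace('\\', '/').replaceAll("/+", "/");
    }

    public static void main(String[] args) {
        // The Java literal "\\+" is the regex \+ , which matches a literal
        // plus sign and never touches backslashes.
        System.out.println("a+b".replaceAll("\\+", "-"));

        // The Java literal "/foo/\\/bar" is the string /foo/\/bar.
        System.out.println(collapseSeparators("/foo/\\/bar"));
        System.out.println(collapseTwoStage("/foo/\\/bar"));
    }
}
```

Both helpers give the same result on mixed runs of separators, so the choice between them is mostly about readability. Neither one addresses how a leading "//" should be treated when it might be a URI authority; that decision is left to the actual patch.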