[ 
https://issues.apache.org/jira/browse/HADOOP-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532489
 ] 

Doug Cutting commented on HADOOP-1995:
--------------------------------------

> A fix is to make path normalization file system dependent.

First, there's a technical problem: normalization is currently done in Path's 
constructor, where the FileSystem is unknown.  But even if that were fixed, 
I'm not sure it would solve the problem.

By this you mean that a local path that contains backslashes will have them 
converted to slashes by Path's constructor, so that "\[bar,baz]" will be 
parsed as "/[bar,baz]", while an HDFS path like "\[bar,baz]" will be parsed as 
"\[bar,baz]", so that the '[' is escaped and unavailable for globbing?  But 
then applications that run on both Unix and Windows and use both the local fs 
and HDFS will have to pass in different kinds of path strings, no?

Not all paths come from a FileSystem implementation; some come from 
environment variables, config files, constant strings in user code, etc.  Thus 
we must be able to handle Windows file names passed to the Path constructor 
that have not undergone special escaping, e.g., C:\foo\bar should be parsed as 
c:/foo/bar.  We've tried other approaches and they've not worked well.

This is a hard problem to handle well:

http://www.cygwin.com/ml/cygwin/1999-06/msg00213.html

Perhaps we need to expect some Path-related things to be broken on Windows, but 
make sure those are rarely used things.  Windows paths that contain '[' or ']' 
simply might not work correctly when passed to listPaths unless the user is 
careful to insert escapes: we will not attempt to insert such escapes 
automatically.  We would only translate '\' to '/' when running on Windows, 
and only then when it's not immediately followed by another backslash.  This 
will mean that a directory whose name starts with a glob character will not 
work correctly on Windows unless the developer manually inserts appropriate 
escapes, but that globs will work correctly on Windows.  My assumption is that 
directories beginning with glob characters are much rarer than uses of glob 
characters for globbing.  Could that work?
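
To make the proposed rule concrete, here is a minimal sketch of it in Java.  
It is not Path's actual code: the class WindowsPathSketch, the method 
translate, and the onWindows flag are hypothetical names used only for 
illustration.  It also assumes that a doubled backslash is copied through 
unchanged as a pair; what such a pair should mean afterwards is left open 
above.

```java
// Hypothetical illustration of the proposed rule; not Hadoop's Path class.
final class WindowsPathSketch {
    private WindowsPathSketch() {}

    /**
     * Translates '\' to '/' only when onWindows is true, and only when the
     * backslash is not immediately followed by another backslash.
     * Assumption: a doubled backslash is copied through unchanged as a pair.
     */
    static String translate(String path, boolean onWindows) {
        if (!onWindows) {
            return path; // elsewhere a backslash stays available as a glob escape
        }
        int n = path.length();
        StringBuilder out = new StringBuilder(n);
        int i = 0;
        while (i < n) {
            char c = path.charAt(i);
            if (c == '\\') {
                if (i + 1 < n && path.charAt(i + 1) == '\\') {
                    out.append("\\\\"); // keep the pair, skip both characters
                    i += 2;
                    continue;
                }
                out.append('/'); // lone backslash becomes a path separator
            } else {
                out.append(c);
            }
            i++;
        }
        return out.toString();
    }
}
```

Under these assumptions, translate applied to C:\foo\bar with onWindows set 
gives C:/foo/bar, and \[bar,baz] gives /[bar,baz], so the '[' still works as a 
glob character.  With onWindows unset the string is returned untouched.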


> Path can not handle a file name that contains a back slash
> ----------------------------------------------------------
>
>                 Key: HADOOP-1995
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1995
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 0.14.1
>            Reporter: Hairong Kuang
>             Fix For: 0.16.0
>
>
> When normalizing a path name, Path incorrectly converts a back slash to a 
> path separator even if the path name is of the Unix style. This prohibits a 
> glob from using a back slash to escape a special character. A fix is to make 
> path normalization file system dependent.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
