[ 
https://issues.apache.org/jira/browse/VFS-203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675797#action_12675797
 ] 

Joerg Schaible commented on VFS-203:
------------------------------------

Benjamin Bentman once gave a good summary of this issue on the Maven list.
Quoted from http://markmail.org/message/lbnhjsmzrc2ht2fa and the messages
following it:

{quote}
URLs and filesystem paths are really two different beasts and converting 
between them is not trivial. The main source of problems is that different 
encoding rules apply for the strings that make up a URL or filesystem path. 
For example, consider the following code snippet: 
 File file = new File( "foo bar+foo" );
 URL url = file.toURI().toURL();
 System.out.println( file.toURL() );
 System.out.println( url );
 System.out.println( url.getPath() );
 System.out.println( URLDecoder.decode( url.getPath(), "UTF-8" ) );

which outputs something like 
 file:/M:/scratch-pad/foo bar+foo
 file:/M:/scratch-pad/foo%20bar+foo
 /M:/scratch-pad/foo%20bar+foo
 /M:/scratch-pad/foo bar foo

First of all, please note that File.toURL() does not escape the space 
character. This yields an invalid URL, as per RFC 2396, section 2.4.3 "Excluded 
US-ASCII Characters". The class java.net.URL will silently accept such invalid 
URLs; in contrast, java.net.URI will not (see also URL.toURI()). For this 
reason, File.toURL() has already been deprecated and should be replaced with 
File.toURI().toURL(). 

Next, URL.getPath() does not in general return a string that can be used as a 
filesystem path. It returns a substring of the URL and as such can contain 
escape sequences. The prominent example is the space character, which will show 
up as "%20". People sometimes hack around this by means of replace("%20", " "), 
but that simply does not cover all cases. It is worth mentioning that, on the 
other hand, the related method URI.getPath() does decode escapes, but the 
result is still not a filesystem path (compare the source for the constructor 
File(URI)). 

To decode a URL, people sometimes also choose java.net.URLDecoder. The pitfall 
with this class is that it actually performs HTML form decoding, which is yet 
another encoding and not the same as the URL encoding (compare the last 
paragraph of the class javadoc of java.net.URL). For instance, URLDecoder will 
erroneously convert the character "+" into a space, as illustrated by the last 
sysout in the example above. 

Code targeting JRE 1.4+ should easily avoid these problems by using 
 new File( new URI( url.toString() ) )

when converting a URL to a filesystem path and with JDKs >= 1.5 using 
 file.toURI().toURL()

when converting back. 
Note that JRE 1.4 happily returns invalid/unescaped URLs from 
ClassLoader.getResource(), which makes the above suggestion fail with a 
URISyntaxException. 
For that case the suggestion is to use FileUtils.toFile(URL) from Commons IO.
{quote}
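
To make the round trip described in the quote concrete, here is a minimal sketch using only JDK classes. The class name UrlFileRoundTrip and its two helper methods are made up for illustration; they are not part of Commons VFS or Commons IO. The helpers wrap the checked exceptions so that the failure mode for unescaped URLs is easy to see.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;

// Illustrative sketch only; class and method names are hypothetical.
public class UrlFileRoundTrip {

    /** File -> URL by way of URI, so that characters such as ' ' get escaped. */
    public static URL toUrl(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** URL -> File by way of URI, so that escapes such as %20 are decoded. */
    public static File toFile(URL url) {
        try {
            return new File(new URI(url.toString()));
        } catch (URISyntaxException e) {
            // Happens for unescaped URLs, e.g. one containing a raw space.
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }

    public static void main(String[] args) throws Exception {
        File original = new File("foo bar+foo").getAbsoluteFile();
        URL url = toUrl(original);
        File restored = toFile(url);

        System.out.println(url);                        // space escaped as %20, '+' kept
        System.out.println(restored.equals(original));  // round trip keeps the path

        // Form decoding turns '+' into a space, so the name is no longer the same.
        String formDecoded = URLDecoder.decode(url.getPath(), "UTF-8");
        System.out.println(formDecoded);
    }
}
```

In this sketch a URL that still contains a raw space makes toFile() throw, which is the situation where the quoted text falls back to FileUtils.toFile(URL).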

> FileObject..getName().getURI() returns URIs with spaces
> -------------------------------------------------------
>
>                 Key: VFS-203
>                 URL: https://issues.apache.org/jira/browse/VFS-203
>             Project: Commons VFS
>          Issue Type: Bug
>    Affects Versions: 1.0
>            Reporter: Tim Lebedkov
>
> Windows supports file names with spaces and '#'. AFAIK spaces are not allowed 
> in URIs and # will be interpreted as an URI fragment.
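
The effect of ' ' and '#' in a file name can be checked with plain JDK classes. The following is only a sketch: it does not call any Commons VFS API, and the class name SpaceAndHashInFileNames, the helper escaped() and the file name "a b#c.txt" are assumptions chosen for illustration.

```java
import java.io.File;
import java.net.URI;

// Illustrative sketch only; it uses java.io.File and java.net.URI, not Commons VFS.
public class SpaceAndHashInFileNames {

    /** Lets the JDK build the URI, which escapes both ' ' and '#' in the path. */
    public static URI escaped(File file) {
        return file.toURI();
    }

    public static void main(String[] args) {
        File file = new File("a b#c.txt"); // hypothetical file name

        URI good = escaped(file);
        System.out.println(good.getRawPath());   // ends with a%20b%23c.txt
        System.out.println(good.getFragment());  // null, '#' is part of the path

        // Here only the space was escaped, so '#' starts a fragment.
        URI bad = URI.create("file:/tmp/a%20b#c.txt");
        System.out.println(bad.getPath());       // /tmp/a b
        System.out.println(bad.getFragment());   // c.txt
    }
}
```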

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.