Hi Chris,

On Mon, 2007-04-16 at 11:18 +0100, Chris Burdess wrote:
> I've found the problem. java.io.File.toURL() is returning an invalid  
> URL in DomLSParser.getInputSource.
> 
> I'm sure we've been over this a hundred times or more but maybe it's  
> time to revisit this yet again. Here is an example of a "URL"  
> returned in my local testcase:
> 
>    file:/home/dog/test/twisti-xml/xml/test.xml
> 
> According to RFC 1738:
> 
>     A file URL takes the form
> 
>        file://<host>/<path>
> 
>     where <host> is the fully qualified domain name of the system on
>     which the <path> is accessible, and <path> is a hierarchical
>     directory path of the form <directory>/<directory>/.../<name>.
> 
> ...
> 
>     As a special case, <host> can be the string "localhost" or the empty
>     string; this is interpreted as `the machine from which the URL is
>     being interpreted'.
> 
> Therefore File.toURL should be returning a URL of the form
> 
>    file:///home/dog/test/twisti-xml/xml/test.xml
> 
> Let the flamewar commence...

:)

OK, I'll bite. I think that according to RFC 3986, if the authority is
empty/undefined, it is preferred to drop the leading //. One could of
course debate whether URIs as defined by 3986 actually replace file
URLs as defined in 1738. I believe URL and URI in java.net mostly try to
follow 3986 in any case (with some weird Java-specific exceptions, it
seems).

A more practical observation might be that since we have to deal with
whatever interpretation URL/URI have in Java, when the URL scheme is
"file", instead of using toURL().toString() we might just want to use
"file://" + URL.getPath(), which unambiguously gives the form we are
after in DomLSParser.getInputSource().

Also File.toURL() seems deprecated in favor of File.toURI().toURL().
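
To make the difference concrete, here is a small standalone sketch (not
part of the patch; the class name FileUrlDemo is made up for
illustration) that prints both forms for the path from your testcase.
It assumes a java.net implementation whose URL.toString() omits an
empty authority, which is the behaviour you reported.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical demo class, only meant to illustrate the idea.
class FileUrlDemo {
    // Same logic as the urlToString helper in the patch: force the
    // empty-authority form "file://<path>" for file URLs and leave
    // every other protocol untouched.
    static String urlToString(URL u) {
        if ("file".equals(u.getProtocol()))
            return "file://" + u.getPath();
        return u.toString();
    }

    public static void main(String[] args) throws MalformedURLException {
        // Build the URL from a URI without an authority component.
        URL u = URI.create("file:/home/dog/test/twisti-xml/xml/test.xml").toURL();

        // What URL.toString() gives us (single-slash form assumed).
        System.out.println(u.toString());

        // What the helper gives us: always "file://" plus the path.
        System.out.println(urlToString(u));
    }
}
```

Note that the helper ignores any host in a file URL, which should be
fine here since the URL is always built from a local File.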

So what about the (untested) attached patch?

Cheers,

Mark
Index: gnu/xml/dom/ls/DomLSParser.java
===================================================================
RCS file: /cvsroot/classpath/classpath/gnu/xml/dom/ls/DomLSParser.java,v
retrieving revision 1.6
diff -u -r1.6 DomLSParser.java
--- gnu/xml/dom/ls/DomLSParser.java	8 Mar 2007 11:16:31 -0000	1.6
+++ gnu/xml/dom/ls/DomLSParser.java	16 Apr 2007 12:05:31 -0000
@@ -387,11 +387,11 @@
             catch (MalformedURLException e)
               {
                 File baseFile = (base == null) ? null : new File(base);
-                url = (baseFile == null) ? new File(systemId).toURL() :
-                  new File(baseFile, systemId).toURL();
+                url = (baseFile == null) ? new File(systemId).toURI().toURL() :
+                  new File(baseFile, systemId).toURI().toURL();
               }
             in = url.openStream();
-            systemId = url.toString();
+            systemId = urlToString(url);
             source = new InputSource(in);
             source.setSystemId(systemId);
           }
@@ -403,6 +403,19 @@
     return source;
   }
 
+  /**
+   * Helper to get a string representation of a URL that definitely
+   * contains an empty authority field if the protocol is file.
+   * Neither URL.toString() nor URI.toString() guarantees that.
+   */
+  static String urlToString(URL u)
+  {
+    if ("file".equals(u.getProtocol()))
+      return "file://" + u.getPath();
+    else
+      return u.toString();
+  }
+
   // -- DOMConfiguration --
 
   public void setParameter(String name, Object value)
