Hi Chris,
On Mon, 2007-04-16 at 11:18 +0100, Chris Burdess wrote:
> I've found the problem. java.io.File.toURL() is returning an invalid
> URL in DomLSParser.getInputSource.
>
> I'm sure we've been over this a hundred times or more but maybe it's
> time to revisit this yet again. Here is an example of a "URL"
> returned in my local testcase:
>
> file:/home/dog/test/twisti-xml/xml/test.xml
>
> According to RFC 1738:
>
> A file URL takes the form
>
> file://<host>/<path>
>
> where <host> is the fully qualified domain name of the system on
> which the <path> is accessible, and <path> is a hierarchical
> directory path of the form <directory>/<directory>/.../<name>.
>
> ...
>
> As a special case, <host> can be the string "localhost" or the empty
> string; this is interpreted as `the machine from which the URL is
> being interpreted'.
>
> Therefore File.toURL should be returning a URL of the form
>
> file:///home/dog/test/twisti-xml/xml/test.xml
>
> Let the flamewar commence...
:)
OK, I'll bite. I think that, according to RFC 3986, when the authority
is empty or undefined it is preferred to drop the leading //. One could
of course debate whether URIs as defined by 3986 actually replace file
URLs as defined in 1738. I believe URL and URI in java.net mostly try to
follow 3986 in any case (with some odd Java-specific exceptions, it
seems).
A more practical observation might be that, since we have to deal with
whatever interpretation URL/URI have in Java, when the URL scheme is
"file" we might just want to use "file://" + URL.getPath() instead of
toURL().toString(). That unambiguously gives us the form we are after
in DomLSParser.getInputSource().
Also, File.toURL() seems to be deprecated in favor of
File.toURI().toURL().
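To make the difference concrete, here is a minimal standalone sketch of
the idea, separate from the patch itself. The class name FileUrlDemo
and the systemIdFor() wrapper are made up for illustration; only
urlToString() mirrors the helper in the patch. It assumes that
URL.toString() omits an empty authority, which is what I see in
practice but which is not guaranteed.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class FileUrlDemo {

    // Same idea as the helper in the patch: for "file" URLs, always
    // emit an explicit empty authority ("file://" + path).
    static String urlToString(URL u) {
        if ("file".equals(u.getProtocol())) {
            return "file://" + u.getPath();
        }
        return u.toString();
    }

    // Hypothetical convenience wrapper (not in the patch): parse a URI
    // string and return the normalized system id.
    static String systemIdFor(String spec) {
        try {
            return urlToString(URI.create(spec).toURL());
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    public static void main(String[] args) throws MalformedURLException {
        URL url = URI.create("file:/home/dog/test/twisti-xml/xml/test.xml").toURL();
        // Plain toString(): the empty authority is usually dropped.
        System.out.println(url);
        // Helper: always has the "file://" prefix followed by the path.
        System.out.println(urlToString(url));
    }
}
```

Note that getPath() leaves out any host, query, or fragment, so this is
only suitable for local files such as the ones File produces.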
So what about the (untested) attached patch?
Cheers,
Mark
Index: gnu/xml/dom/ls/DomLSParser.java
===================================================================
RCS file: /cvsroot/classpath/classpath/gnu/xml/dom/ls/DomLSParser.java,v
retrieving revision 1.6
diff -u -r1.6 DomLSParser.java
--- gnu/xml/dom/ls/DomLSParser.java 8 Mar 2007 11:16:31 -0000 1.6
+++ gnu/xml/dom/ls/DomLSParser.java 16 Apr 2007 12:05:31 -0000
@@ -387,11 +387,11 @@
catch (MalformedURLException e)
{
File baseFile = (base == null) ? null : new File(base);
- url = (baseFile == null) ? new File(systemId).toURL() :
- new File(baseFile, systemId).toURL();
+ url = (baseFile == null) ? new File(systemId).toURI().toURL() :
+ new File(baseFile, systemId).toURI().toURL();
}
in = url.openStream();
- systemId = url.toString();
+ systemId = urlToString(url);
source = new InputSource(in);
source.setSystemId(systemId);
}
@@ -403,6 +403,19 @@
return source;
}
+ /**
+ * Helper to get a string representation of a URL that definitely
+ * contains an empty authority field if the protocol is file.
+ * Neither URL.toString() nor URI.toString() guarantees that.
+ */
+ static String urlToString(URL u)
+ {
+ if ("file".equals(u.getProtocol()))
+ return "file://" + u.getPath();
+ else
+ return u.toString();
+ }
+
// -- DOMConfiguration --
public void setParameter(String name, Object value)