Re: [cp-patches] Re: Absolute URL parsing bug

2005-07-07 Thread Andrew Haley
I think this is the right fix.  Tested with Mauve, no regressions.

Fixes the testcase 

context = "jar:file://www.example.com/test.jar!/foo/bar.txt", 
spec = "jar:file://www.example.com/test.jar!/foo/test.txt"

which previously caused a NullPointerException.

Andrew.




2005-07-07  Andrew Haley  <[EMAIL PROTECTED]>

* java/net/URL.java (URL): If the file part of a spec is absolute,
ignore the file part of its context.

Index: URL.java
===
RCS file: /cvs/gcc/gcc/libjava/java/net/URL.java,v
retrieving revision 1.52
diff -p -2 -u -r1.52 URL.java
--- URL.java30 Jun 2005 03:20:01 -  1.52
+++ URL.java7 Jul 2005 16:21:14 -
@@ -409,8 +409,11 @@ public final class URL implements Serial
host = context.host;
port = context.port;
-   file = context.file;
 userInfo = context.userInfo;
-   if (file == null || file.length() == 0)
- file = "/";
+   if (spec.indexOf(":/", 1) < 0)
+ {
+   file = context.file;
+   if (file == null || file.length() == 0)
+ file = "/";
+ }
authority = context.authority;
  }


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-05 Thread Per Bothner

Andrew Haley wrote:

The case that we get wrong is this one:

$ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl


None of these are valid URLs as documented in the javadoc.  It matches 
the URI specification if you view everything after the scheme (i.e. 
"file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl") as an 
'opaque_part'.  There is no concept of a "nested URL" and "jar:file:" is 
not a valid scheme.



 $ gij TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl


So, in that case the context should be ignored.
In fact, it seems like the context is ignored for all "jar:file" and "file:" 
URLs:


We should ignore the context for *all* absolute URLs - i.e. any URL that 
has a scheme - accordingto the URI spec.  But the Javadoc doc URL(URL, 
String) says "If the scheme component is defined in the given spec and 
does not match the scheme of the context, then the new URL is created as 
an absolute URL based on the spec alone."I assume the "and does not 
match" clause is documenting a hack (bug) in the implementation - which 
doesn't match the URI spec.  (Hacks like this may be one reason why they 
decided to start from scratch with a new URI class.)


I.e. java TestURLs http:/a/b/c http:d/e
prints:
http:/a/b/d/e
but according to the RFC spec it should be:
http:d/e
The URI.resolve method gets this right.

A suggested (untested) fix might be something like:

   int colon = spec.indexOf(':');
   int slash = spec.indexOf('/');
   if (colon > 0
   && (colon < slash || slash < 0)
   && (protocol == null
   || protocol.length() <= colon
   || ! protocol.equalsIgnoreCase(spec.substring(0, colon
 context = null;


But in the case of a URL that is not qualified with any protocol at
all we need the context:


Er, no.  See the 'TestURLs  http:/a/b/c http:d/e' example.


 $ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' foo
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/foo

Classpath gets that right too.  So, the only thing we seem to get
wrong is the parsing of a 'jar:file:' URL when a 'jar:' context is
supplied.


Do you have any reason to believe that "jar:" or "file:" URLs get any 
special treatment?  I see none.

--
--Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


Re: [cp-patches] Re: Absolute URL parsing bug

2005-07-05 Thread Andrew Haley
Mark Wielaard writes:
 > Hi,
 > 
 > On Mon, 2005-07-04 at 18:07 +0100, Andrew Haley wrote: 
 > > When creating a URL: if the spec is absolute and a context URL is
 > > supplied, we should inherit the host:port part of the context if its
 > > protocol the same as that of the spec, but we should not inherit its
 > > file path.
 > 
 > Have you added the tests from your previous email to mauve?
 > There are a lot of URL tests in mauve,

Ah, I didn't know that.  OK, I'll try running mauve.  Sigh.

 > any new PASSes or FAILs?
 > 
 > > *** public final class URL implements Serial
 > > *** 409,413 
 > >host = context.host;
 > >port = context.port;
 > > -  file = context.file;
 > >   userInfo = context.userInfo;
 > >if (file == null || file.length() == 0)
 > > --- 409,412 
 > 
 > Isn't that last if statement now unnecessary?
 > It seems you can set file to "/" unconditionally here.

True, yes.  I'm finding it pretty hard to get inside the thinking of
whoever wrote this.

Andrew.


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


Re: [cp-patches] Re: Absolute URL parsing bug

2005-07-05 Thread Mark Wielaard
Hi,

On Mon, 2005-07-04 at 18:07 +0100, Andrew Haley wrote: 
> When creating a URL: if the spec is absolute and a context URL is
> supplied, we should inherit the host:port part of the context if its
> protocol the same as that of the spec, but we should not inherit its
> file path.

Have you added the tests from your previous email to mauve?
There are a lot of URL tests in mauve, any new PASSes or FAILs?

> *** public final class URL implements Serial
> *** 409,413 
>   host = context.host;
>   port = context.port;
> - file = context.file;
>   userInfo = context.userInfo;
>   if (file == null || file.length() == 0)
> --- 409,412 

Isn't that last if statement now unnecessary?
It seems you can set file to "/" unconditionally here.

Cheers,

Mark


signature.asc
Description: This is a digitally signed message part
___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-04 Thread Andrew Haley
Thanks to Per's reasoning I have a better patch.  

When creating a URL: if the spec is absolute and a context URL is
supplied, we should inherit the host:port part of the context if its
protocol the same as that of the spec, but we should not inherit its
file path.

Andrew.



2005-07-04  Andrew Haley  <[EMAIL PROTECTED]>

* java/net/URL.java (URL): Don't inherit the path component when
the scheme component of the context matches the scheme component
of the spec.
 
Index: URL.java
===
RCS file: /cvs/gcc/gcc/libjava/java/net/URL.java,v
retrieving revision 1.51
diff -p -2 -c -r1.51 URL.java
*** URL.java27 Apr 2005 20:10:07 -  1.51
--- URL.java4 Jul 2005 16:45:42 -
*** public final class URL implements Serial
*** 409,413 
host = context.host;
port = context.port;
-   file = context.file;
  userInfo = context.userInfo;
if (file == null || file.length() == 0)
--- 409,412 


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-04 Thread Andrew Haley
Per Bothner writes:
 > Andrew Haley wrote:
 > 
 > >  Would it be OK to
 > > special-case "file" URIs so that "file:/" is rewritten to ""file:///" ?
 > 
 > Maybe for the URL class, but I don't think so for the URI class.
 > The difference is the getAuthority method.  My reading is that it should 
 > return null for "file:/" and "" for "file:///".  But I haven't tested 
 > what JDK does - and it wouldn't surprise me if the URI and URL classes 
 > are different in this respect.

It seems to me that the constructors for URLs and URIs are quite
different, and their description is separate.  The difference between
JDK and Classpath seems to be restricted to the case where a URL is
created with an enclosing context that is another URL.


Here's a tiny test case:

import java.net.*;

public class TestURLs
{
  public static void main (String[] argv)
throws Exception
  {
URL url = new URL(argv[0]);
URL url2 = new URL(url, argv[1], null);
System.err.println(url2);
  }
}

The case that we get wrong is this one:


 $ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl

 $ gij TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
'jar:file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/file:ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl


So, in that case the context should be ignored.
In fact, it seems like the context is ignored for all "jar:file" and "file:" 
URLs:


 $ java TestURLs 'http://ejbjars/ws.jar' 
'jar:file:foo!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:foo!/META-INF/wsdl/ssbEndpoint.wsdl

 $ java TestURLs 'http://ejbjars/' 
'jar:file:foo!/META-INF/wsdl/ssbEndpoint.wsdl'
jar:file:foo!/META-INF/wsdl/ssbEndpoint.wsdl

 $ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' 
file:foo
file:foo


But in the case of a URL that is not qualified with any protocol at
all we need the context:


 $ java TestURLs 'jar:file:/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl' foo
jar:file:/ejbjars/ws.jar!/META-INF/wsdl/foo


Classpath gets that right too.  So, the only thing we seem to get
wrong is the parsing of a 'jar:file:' URL when a 'jar:' context is
supplied.

Andrew.



___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


Re: [cp-patches] Re: Absolute URL parsing bug

2005-07-03 Thread Andrew Haley
Chris Burdess writes:
 > Tom Tromey wrote:
 > > Andrew> I have no opinions about the syntax of URIs, I only want to make
 > > Andrew> real-world applications work.  In this case we have to guess what 
 > > Java
 > > Andrew> libraries are doing.
 > > 
 > > FWIW we've had some patch churn in this area in the past.
 > > It would be excellent to have a set of Mauve tests so that future
 > > changes don't break things that we know need to work.  In particular
 > > at least the current case should be added.
 > 
 > Perhaps we should also get some feedback from the W3C URI team on how
 > they feel the APIs apply to the concepts?

If what W3C wants is different from what Java programmers do, then we
have to go with the API that Java programmers already use.  So, even
if W3C says something is wrong, we have to do it anyway.

Andrew.


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


Re: [cp-patches] Re: Absolute URL parsing bug

2005-07-02 Thread Chris Burdess
Tom Tromey wrote:
> Andrew> I have no opinions about the syntax of URIs, I only want to make
> Andrew> real-world applications work.  In this case we have to guess what Java
> Andrew> libraries are doing.
> 
> FWIW we've had some patch churn in this area in the past.
> It would be excellent to have a set of Mauve tests so that future
> changes don't break things that we know need to work.  In particular
> at least the current case should be added.

Perhaps we should also get some feedback from the W3C URI team on how
they feel the APIs apply to the concepts?
-- 
Chris Burdess


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-02 Thread Tom Tromey
> "Andrew" == Andrew Haley <[EMAIL PROTECTED]> writes:

Andrew> I have no opinions about the syntax of URIs, I only want to make
Andrew> real-world applications work.  In this case we have to guess what Java
Andrew> libraries are doing.

FWIW we've had some patch churn in this area in the past.
It would be excellent to have a set of Mauve tests so that future
changes don't break things that we know need to work.  In particular
at least the current case should be added.

Tom


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-02 Thread Per Bothner

Andrew Haley wrote:

Per Bothner writes:
 > Andrew Haley wrote:
 > > [ An absolute file URL can look like:
 > > 
 > >absoluteURI = "file" ":" abs_path

 > >abs_path = "/" path_segments
 > > 
 > > That is, there is no need for "//".
 > 
 > I don't see that in any of the specs.


I got it from RFC 2396.  Which I might have read wrongly, of course.


I read more carefully.  In practice, it looks like "//" is optional. 
I.e. "file:///foo/bar" is the same as "file:/foo/bar".  There is a
subtle difference is the the former has an empty authority, while the 
latter has a missing (undefined) authority.  Thus resolving the relative 
URIs "/tmp/help" vs "//gcc.gnu.org/help" against the base URI 
"file://gnu.org/foo/" will yield "file://gnu.org/foo/help" vs 
"file://gcc.gnu.org/foo/help" respectively.  In practice, people seldom 
use non-empty authority components with file: URIs and I'm not sure it's 
well-defined.  Perhaps when using NFS?




 > Technically, "file:/tmp/foo.html" is not a valid URI, as far as I
 > can tell.  I notice that firefox rewrites it (in the navigation
 > bar) to "file:///tmp/foo.html".
 >
 > Now in practice we may want to allow "file:/tmp/foo.html", but it should 
 > be viewed as an unofficial short-hand for "file:///tmp/foo.html".
 > 
 > > And indeed, the URL spec in the

 > > SDK docs says 'If the spec's path component begins with a slash
 > > character "/" then the path is treated as absolute...' ]
 > 
 > The *path* is absolute, but the URI isn't.
 > 
 > This matters when resolving a relative URL against a base URI, such as 
 > he URL of the containing document.
 > 
 > If we have a base URI "http://bar.com/baz/index.html"; and a reference 
 > "/tmp/foo.html" then the resolved URI is "http://bar.com/tmp/foo.html";.
 > 
 > > But we parse the spec looking for "//" to determine if a URL is

 > > absolute,
 > 
 > A URL is absolute *only* if it has a "scheme".


I don't really understand what you're suggesting.


My comments are mostly applicative to the URI class rather than the URL 
class.  (I'm not sure why they're separate.  I believe all URLs are 
URIs, accdoring to the IEFT, but not the Java class hierarchy.)  The URI 
class has a bunch of methods like resolve and isAbsolute, which require 
more care about these fine points.



 Would it be OK to
special-case "file" URIs so that "file:/" is rewritten to ""file:///" ?


Maybe for the URL class, but I don't think so for the URI class.
The difference is the getAuthority method.  My reading is that it should 
return null for "file:/" and "" for "file:///".  But I haven't tested 
what JDK does - and it wouldn't surprise me if the URI and URL classes 
are different in this respect.

--
--Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-02 Thread Andrew Haley
Per Bothner writes:
 > Andrew Haley wrote:
 > > This URL is an absolute URL because its path begins with a "/":
 > > 
 > > "jar:file:/usr/src/redhat/BUILD/jonas-4.3.3/jonas/output/JONAS_4_3_3/examples/webservices/beans/ws/temp/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl"
 > > 
 > > [ An absolute file URL can look like:
 > > 
 > >absoluteURI = "file" ":" abs_path
 > >abs_path = "/" path_segments
 > > 
 > > That is, there is no need for "//".
 > 
 > I don't see that in any of the specs.

I got it from RFC 2396.  Which I might have read wrongly, of course.

 > Technically, "file:/tmp/foo.html" is not a valid URI, as far as I
 > can tell.  I notice that firefox rewrites it (in the navigation
 > bar) to "file:///tmp/foo.html".
 >
 > Now in practice we may want to allow "file:/tmp/foo.html", but it should 
 > be viewed as an unofficial short-hand for "file:///tmp/foo.html".
 > 
 > > And indeed, the URL spec in the
 > > SDK docs says 'If the spec's path component begins with a slash
 > > character "/" then the path is treated as absolute...' ]
 > 
 > The *path* is absolute, but the URI isn't.
 > 
 > This matters when resolving a relative URL against a base URI, such as 
 > he URL of the containing document.
 > 
 > If we have a base URI "http://bar.com/baz/index.html"; and a reference 
 > "/tmp/foo.html" then the resolved URI is "http://bar.com/tmp/foo.html";.
 > 
 > > But we parse the spec looking for "//" to determine if a URL is
 > > absolute,
 > 
 > A URL is absolute *only* if it has a "scheme".

I don't really understand what you're suggesting.  Would it be OK to
special-case "file" URIs so that "file:/" is rewritten to ""file:///" ?

I have no opinions about the syntax of URIs, I only want to make
real-world applications work.  In this case we have to guess what Java
libraries are doing.

Andrew.


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches


[cp-patches] Re: Absolute URL parsing bug

2005-07-01 Thread Per Bothner

Andrew Haley wrote:

This URL is an absolute URL because its path begins with a "/":

"jar:file:/usr/src/redhat/BUILD/jonas-4.3.3/jonas/output/JONAS_4_3_3/examples/webservices/beans/ws/temp/ejbjars/ws.jar!/META-INF/wsdl/ssbEndpoint.wsdl"

[ An absolute file URL can look like:

   absoluteURI = "file" ":" abs_path
   abs_path = "/" path_segments

That is, there is no need for "//".


I don't see that in any of the specs.  Technically, "file:/tmp/foo.html" 
is not a valid URI, as far as I can tell.  I notice that firefox 
rewrites it (in the navigation bar) to "file:///tmp/foo.html".


Now in practice we may want to allow "file:/tmp/foo.html", but it should 
be viewed as an unofficial short-hand for "file:///tmp/foo.html".



And indeed, the URL spec in the
SDK docs says 'If the spec's path component begins with a slash
character "/" then the path is treated as absolute...' ]


The *path* is absolute, but the URI isn't.

This matters when resolving a relative URL against a base URI, such as 
he URL of the containing document.


If we have a base URI "http://bar.com/baz/index.html"; and a reference 
"/tmp/foo.html" then the resolved URI is "http://bar.com/tmp/foo.html";.



But we parse the spec looking for "//" to determine if a URL is
absolute,


A URL is absolute *only* if it has a "scheme".


___
Classpath-patches mailing list
Classpath-patches@gnu.org
http://lists.gnu.org/mailman/listinfo/classpath-patches