Kristian,

Looking at the source for the Axis 1.x URI class, it seems there are two facts 
here:

1. We borrowed this code from the Xerces 2 source tree.

2. It jumps through a lot of hoops to make sure the right characters are 
in the URI, and the comment header in the file references the following RFCs:
RFC 2396 - http://www.ietf.org/rfc/rfc2396.txt?number=2396
RFC 2732 - http://www.ietf.org/rfc/rfc2732.txt?number=2732
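
For illustration: RFC 2396 defines URI syntax in terms of US-ASCII characters, so a character such as "æ" would normally have to travel as percent-escaped UTF-8 octets (%C3%A6). The following is only a minimal sketch of that escaping step in plain Java, not code taken from Axis or Xerces; the class name UriEscapeSketch and the helper escapeNonAscii are hypothetical, and it assumes the rest of the string is already a well-formed URI.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Axis or Xerces.
class UriEscapeSketch {

    // Percent-escapes every non-ASCII character as its UTF-8 octets.
    // ASCII characters are copied through unchanged, so the input is
    // assumed to be otherwise valid URI syntax already.
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            int len = Character.charCount(cp);
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                byte[] utf8 = s.substring(i, i + len).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00E6 is the Norwegian letter "ae" (æ) from the URI in the test case.
        String raw = "http://www.utdanningsdirektoratet.no/templates/udir/"
                + "TM_L\u00E6replan.aspx?id=2100&laereplanid=707207";
        System.out.println(escapeNonAscii(raw));
    }
}
```

The escaped string contains only ASCII characters, so it should fall within what RFC 2396 permits; whether the sending service or the receiving client ought to perform this step is a separate question.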

Hope that helps.

Tom Jordahl

From: Kristian Barek [mailto:bar...@gmail.com]
Sent: Tuesday, February 10, 2009 1:16 PM
To: axis-dev@ws.apache.org
Subject: anyURI MalformedURIException with UTF-8 characters - bug or feature?

Is Apache Axis correct in disallowing international (UTF-8) characters in 
anyURI tags when processing responses to web services requests?

I've looked at the specification at http://www.w3.org/TR/xmlschema-2/#anyURI , 
and as far as I can see, anyURIs can contain any character, so long as the 
result of URL-encoding the URL is valid. This simple test case illustrates 
the problem:

class Test {
  public static void main(String[] args) {
    try {
      org.apache.axis.types.URI uri = new org.apache.axis.types.URI(
          "http://www.utdanningsdirektoratet.no/templates/udir/TM_Læreplan.aspx?id=2100&laereplanid=707207");
    } catch (Exception e) {
      System.out.println(e);
    }
  }
}

If anyone can provide me with any background / reasons on why Axis indeed is 
correct in invalidating this URI, I would be very grateful.
(If I can point to which standards our web services vendor is breaking, then I 
have a much better case to get them to stop putting Norwegian characters in 
their anyURIs. :)

Best regards,
Kristian Barek
