Hi Michael,
Thanks for all the help. Your patch is in CVS.
Gareth
On Thu, 24 Jul 2003, Michael Glavassevich wrote:
> Hello everyone,
>
> I've been working lately to bring the URI implementation for Xerces-J
> closer to meeting the relevant RFCs. The implementation in Xerces-C is
> very similar. I've ported my fixes from the Java implementation. As well,
> I've fixed a few other issues with the C++ implementation. A combined
> patch is attached to this e-mail.
>
> The patch fixes Bugzilla #19787, #20006, #20009, #20010 and #20287, and
> several other issues. A summary of the changes is listed below:
>
> 1. Added '[' and ']' to reserved characters as per RFC 2732.
> 2. '[' and ']', added in RFC 2732, are not allowed in path segments but
> may appear in the opaque part.
> 3. No URI can begin with a ':'.
> 4. A URI has no scheme if its first ':' occurs after a '?' or '#'; in
> that case the ':' is part of the query string or fragment.
> 5. Whitespace (even escaped as %20) is not permitted in the authority
> portion of a URI.
> 6. IPv4 addresses must match 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "."
> 1*3DIGIT. This has been the grammar since RFC 2732.
> 7. IPv4 addresses are 32-bit, therefore no segment may be larger than 255.
> This isn't expressed by the grammar.
> 8. Hostnames cannot end with a '-'.
> 9. Labels in a hostname must be 63 bytes or less [RFC 1034].
> 10. Hostnames may be no longer than 255 bytes [RFC 1034]. (That
> restriction was already there. I just moved it inwards.)
> 11. Added support for the IPv6 references introduced in RFC 2732. URIs
> such as http://[::ffff:1.2.3.4] are valid. The BNF in RFC 2373 isn't
> correct, so IPv6 addresses are read according to section 2.2 of RFC 2373.
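
For anyone reading along, the host checks in items 6 to 10 above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the function names isWellFormedIPv4 and passesHostnameLimits are made up for this example, and the hostname helper covers only the trailing '-' and length limits, not the allowed character set.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): checks the grammar
// 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT and, in addition,
// that no segment exceeds 255 (a rule the grammar alone does not express).
bool isWellFormedIPv4(const std::string& s) {
    int segments = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        int value = 0;
        int digits = 0;
        while (i < s.size() && std::isdigit(static_cast<unsigned char>(s[i]))) {
            value = value * 10 + (s[i] - '0');
            ++i;
            if (++digits > 3) return false;   // more than 3 digits in a segment
        }
        if (digits == 0 || value > 255) return false;
        ++segments;
        if (i == s.size()) break;             // consumed the last segment
        if (s[i] != '.') return false;        // only '.' may separate segments
        ++i;
        if (i == s.size()) return false;      // trailing '.' is not allowed
    }
    return segments == 4;
}

// Hypothetical helper (not from the patch): applies only the limits from
// items 8-10: no trailing '-', labels of at most 63 bytes, and a total
// length of at most 255 bytes. Character classes are not checked here.
bool passesHostnameLimits(const std::string& host) {
    if (host.empty() || host.size() > 255) return false;
    if (host.back() == '-') return false;
    std::size_t labelLen = 0;
    for (char c : host) {
        if (c == '.') {
            labelLen = 0;                     // a new label starts after '.'
        } else if (++labelLen > 63) {
            return false;
        }
    }
    return true;
}
```

With this sketch, "1.2.3.4" is accepted as IPv4 while "256.1.1.1" and "1.2.3" are rejected, and "example-" fails the hostname limits because of the trailing '-'.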
>
> I also made a change that should improve performance. In many cases
> checking if a character belongs to a particular character class involves
> iterating over arrays. I switched the order of checks done during the
> scanning of the path, so that it checks if a character is alphanumeric
> before iterating over the various arrays.
>
> On the Java side, I replaced these arrays (Strings in the case of the Java
> implementation) with a lookup table. This greatly improved performance.
> That would certainly be worth migrating over in the future.
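
As a rough idea of what such a lookup table might look like on the C++ side, here is a minimal sketch. It is an assumption about the general shape, not the Java code or anything in the current patch: the names kReserved, kMark, kAlnum, buildTable and hasClass are invented for this example. The character sets are the reserved and mark characters of RFC 2396, plus '[' and ']' from RFC 2732.

```cpp
#include <array>
#include <cstdint>

namespace {

// One bit per character class (names are hypothetical).
constexpr std::uint8_t kReserved = 1 << 0;
constexpr std::uint8_t kMark     = 1 << 1;
constexpr std::uint8_t kAlnum    = 1 << 2;

// Builds a 128-entry table indexed by ASCII code, so that a class test
// becomes one array read and one bit test instead of an array scan.
std::array<std::uint8_t, 128> buildTable() {
    std::array<std::uint8_t, 128> t{};
    for (const char* p = ";/?:@&=+$,[]"; *p != '\0'; ++p)
        t[static_cast<unsigned char>(*p)] |= kReserved;
    for (const char* p = "-_.!~*'()"; *p != '\0'; ++p)
        t[static_cast<unsigned char>(*p)] |= kMark;
    for (char c = '0'; c <= '9'; ++c) t[static_cast<unsigned char>(c)] |= kAlnum;
    for (char c = 'a'; c <= 'z'; ++c) t[static_cast<unsigned char>(c)] |= kAlnum;
    for (char c = 'A'; c <= 'Z'; ++c) t[static_cast<unsigned char>(c)] |= kAlnum;
    return t;
}

const std::array<std::uint8_t, 128> kTable = buildTable();

}  // namespace

// True if c belongs to any of the classes in mask. Characters outside
// the ASCII range belong to no class.
inline bool hasClass(char c, std::uint8_t mask) {
    const unsigned char u = static_cast<unsigned char>(c);
    return u < 128 && (kTable[u] & mask) != 0;
}
```

A call such as hasClass(ch, kAlnum | kMark) would then test for an unreserved character in a single lookup, which also makes the order of the checks irrelevant.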
>
> --------------------
> Michael Glavassevich
> [EMAIL PROTECTED]
--
Gareth Reakes, Head of Product Development +44-1865-203192
DecisionSoft Limited http://www.decisionsoft.com
XML Development and Services