Don't all of the 5 pre-defined characters (& < > ' ") need to be encoded to
avoid problems in parsers? I thought that was required for well-formed
XML. For example, apostrophe (') is used for delimiting attribute values.
I do know that in our case the Xerces SAX parser threw exceptions (or just
returned errors?) if any of those 5 appeared in a value string.
Rick
From: "John Wilson" <[EMAIL PROTECTED]>
Date: 02/28/2002 02:51 PM
Please respond to: rpc-dev
To: <[EMAIL PROTECTED]>
cc:
Subject: Re: DO NOT REPLY [Bug 6763] New: - XMLWriter doesn't escape enough characters
[snip]
> org.apache.xmlrpc.XmlRpc$XMLWriter.chardata escapes the characters &, <,
> and > in strings passed as arguments to execute(). If the string contains
> other characters that are not allowed in XML, then the XmlRpcServer fails
> with a SAXParseException on the other side of the wire. In the example I
> encountered, the string contained the character 0x05, which should
> probably be escaped as &#5;. (I have worked around this by adding my own
> pass over the argument strings before calling execute, but this is
> obviously not ideal.)
This isn't a bug. You just can't legally have a Unicode character with the
value 5 in a well-formed XML document. Escaping it as &#5; makes no
difference.
The relevant part of the spec is Section 4.1, Character and Entity
References:
"Well-Formedness Constraint: Legal Character
Characters referred to using character references must match the production
for Char."
MinML currently and erroneously allows this. I'm in the process of
tightening its checking and it will soon reject it.
John Wilson
The Wilson Partnership
http://www.wilson.co.uk
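
For reference, the Char production that the well-formedness constraint points to can be checked in a few lines of Java. This is only a sketch of the kind of pre-pass the bug reporter describes, not code from Apache XML-RPC, Xerces, or MinML: the class and method names (XmlChars, isXmlChar, firstIllegalIndex) are made up for illustration, and the ranges are those of the XML 1.0 Char production.

```java
final class XmlChars {
    private XmlChars() {}

    /**
     * Returns true if the code point matches the XML 1.0 Char production:
     * #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
     */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Returns the index of the first char that starts a code point outside
     * the Char production, or -1 if the whole string is legal. An unpaired
     * surrogate is reported as illegal because it falls in #xD800-#xDFFF.
     */
    static int firstIllegalIndex(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(isXmlChar(0x05));               // prints "false"
        System.out.println(firstIllegalIndex("ok\u0005")); // prints "2"
    }
}
```

A caller could run firstIllegalIndex over each string argument before passing it to execute() and decide there whether to reject the value or strip the offending character. Writing it as a character reference would not help, since the constraint applies to references as well.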