On 11-Feb-08, at 12:49 PM, Dan Fabulich wrote:


I didn't know this, so I imagine others might not. The string "&#0;" is invalid XML. The character is simply not allowed in XML in any representation. The XML 1.0 standard blocks most of the characters under x20, allowing only x9, xA, and xD. XML 1.1 allows x1-x20, but still blocks x0.

http://www.w3.org/TR/1998/REC-xml-19980210#charsets
http://www.w3.org/TR/xml11/#charsets
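
As a rough illustration of the two character ranges described above, here is a minimal Java sketch of the Char productions from the two specs. The class and method names (XmlChars, isXml10Char, isXml11Char) are made up for this example and are not part of Surefire or the JDK.

```java
/** Hypothetical helper mirroring the Char productions of XML 1.0 and XML 1.1. */
final class XmlChars {
    private XmlChars() {}

    /** XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] */
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]; x0 is still excluded. */
    static boolean isXml11Char(int cp) {
        return (cp >= 0x1 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }
}
```

With these definitions the bell character (x7) is rejected by isXml10Char but accepted by isXml11Char, while x0 is rejected by both. Note that XML 1.1 only permits most of the low control characters when they are written as character references.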

This creates an interesting problem for serializing Java strings containing the null character, e.g. "\u0000", or for other non-whitespace control characters like the bell character "\u0007". We've got an integration test for this case in Surefire, and it does entirely the wrong thing (SUREFIRE-455).

In the patch submitted to that bug, Todor throws away nulls in his XML escaper, silently omitting them from the output; all other control characters (even the 1.0-illegal ones) pass through. That doesn't seem right, especially when we're talking about test results! (Expected "" but was "" ... Just imagine how painful it would be to track something like that down.)

But neither does it seem right to insert "&#0;" when it's illegal XML. Notably, Java will cheerfully print &#0; in XML if you tell it to do so, and many parsers will figure out what to do with it just fine; the same applies to "&#7;".

Thoughts? Should we emit "&#0;", standards-be-damned? Silently omit the character? Print a "?" instead? Something else?
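
To make the options above concrete, here is a small Java sketch of an escaper that can apply each of them to characters outside the XML 1.0 Char range. Everything in it (ControlCharEscaper, the Strategy enum, the method names) is hypothetical and only illustrates the trade-off; it is not the code from the SUREFIRE-455 patch.

```java
/** Hypothetical escaper showing the three options from the question above. */
final class ControlCharEscaper {
    enum Strategy { NUMERIC_REFERENCE, OMIT, QUESTION_MARK }

    private ControlCharEscaper() {}

    // XML 1.0 Char production.
    static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    static String escape(String s, Strategy strategy) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            if (isXml10Char(cp)) {
                // Legal character: escape markup characters only.
                switch (cp) {
                    case '<': out.append("&lt;"); break;
                    case '>': out.append("&gt;"); break;
                    case '&': out.append("&amp;"); break;
                    case '"': out.append("&quot;"); break;
                    default: out.appendCodePoint(cp);
                }
            } else {
                // Illegal character: apply the chosen strategy.
                switch (strategy) {
                    case NUMERIC_REFERENCE: // not well-formed XML 1.0
                        out.append("&#").append(cp).append(';');
                        break;
                    case OMIT: // loses information silently
                        break;
                    case QUESTION_MARK: // well-formed, but ambiguous with a real '?'
                        out.append('?');
                        break;
                }
            }
        }
        return out.toString();
    }
}
```

For the input "a\u0000b", OMIT yields "ab", QUESTION_MARK yields "a?b", and NUMERIC_REFERENCE yields "a&#0;b". Only the first two produce well-formed XML 1.0, and neither tells the reader which character was actually there.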


Where did you run into this? I think not showing what it actually is makes it hard to see what's going wrong. So I'm for showing what it actually is. Can you just wrap in CDATA?
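
On the CDATA question: the Char restriction in the XML spec also covers the contents of CDATA sections, so wrapping should not make a null character legal. One way to check this is the sketch below, which assumes the SAX parser bundled with the JDK; other parsers may be more lenient. The class name WellFormedCheck is made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper: reports whether the JDK's default SAX parser accepts a document. */
final class WellFormedCheck {
    private WellFormedCheck() {}

    static boolean isWellFormed(String xml) {
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            // DefaultHandler rethrows fatal errors, so ill-formed input ends up here.
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling isWellFormed on "<r><![CDATA[a\u0000b]]></r>" or on "<r>&#0;</r>" is expected to return false with this parser, while "<r>ok</r>" returns true.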

-Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


Thanks,

Jason

----------------------------------------------------------
Jason van Zyl
Founder,  Apache Maven
jason at sonatype dot com
----------------------------------------------------------

happiness is like a butterfly: the more you chase it, the more it will
elude you, but if you turn your attention to other things, it will come
and sit softly on your shoulder ...

-- Thoreau


