We should not drop the offending characters, but escape them. Either the Unicode entity (&#nn;) or the CDATA way is OK (and the CDATA way is simpler).

This isn't entirely true, Andrzej. Escaping a character and putting it in a CDATA section are just two different ways of expressing the same character code in an XML document. That character code is still ILLEGAL in terms of the XML spec (there is a fragment there specifying the legal character ranges), so a conforming XML parser should throw an exception if it encounters anything outside of the legal range. The only way of transferring full binary data is to encode it into legal Unicode characters (using uuencode or similar).
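To illustrate the point, here is a minimal Python sketch (not taken from the patch under discussion). It checks a code point against the Char production of XML 1.0, shows that the parser in Python's standard library rejects U+0001 both as a numeric reference and inside a CDATA section, and then round-trips a binary payload by encoding it first. The helper names `is_legal_xml_char` and `parses` are made up for this example, and Base64 is used here in place of uuencode purely as a convenient stand-in.

```python
import base64
import xml.etree.ElementTree as ET


def is_legal_xml_char(cp: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(xml_text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The same illegal character (U+0001), expressed in two different ways.
escaped = "<doc>&#1;</doc>"
cdata = "<doc><![CDATA[\x01]]></doc>"

# Arbitrary binary data, encoded into legal characters before embedding.
payload = bytes([0x00, 0x01, 0x02, 0xFF])
encoded = base64.b64encode(payload).decode("ascii")
wrapped = "<doc>" + encoded + "</doc>"

# The receiving side parses the document and decodes the text content.
decoded = base64.b64decode(ET.fromstring(wrapped).text)

print(is_legal_xml_char(0x01), parses(escaped), parses(cdata))
print(decoded == payload)
```

Neither the escaped nor the CDATA form gets past the parser, while the encoded payload survives the round trip unchanged.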

I agree with the person who submitted this patch that it is a potential issue and should be addressed somehow.

D.
