Should we emit "�", standards-be-damned?

Dangerous: if some parser becomes fully XML-compliant, its users will blame Surefire again. So just emitting "�" does not really seem like a long-term solution.

(Expected "" but was "" ... Just imagine how painful it would be to track something like that down.)
[...]
Silently omit the character?

I fully agree, that is painful.

Print a "?" instead?

Well, if altering the original string is acceptable, why not replace invalid characters with dedicated codes so that users can deduce the original content? For example, replace each invalid character with its literal Unicode escape, e.g. print
 Expected "" but was "\u0000"
instead of
 Expected "" but was "?"
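
A minimal Java sketch of that replacement, just to illustrate the idea. The class and method names are hypothetical and this is not Surefire's actual code. It assumes the report is XML 1.0 and treats every code point outside the XML 1.0 "Char" production as invalid:

```java
final class XmlTextEscaper {
    private XmlTextEscaper() {}

    // XML 1.0 Char production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Replaces each invalid code point with a backslash, the letter u and
    // four hex digits; valid code points are copied unchanged.
    static String escapeInvalid(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            } else {
                // Every invalid code point lies in the BMP (control characters,
                // unpaired surrogates, U+FFFE, U+FFFF), so four digits suffice.
                out.append(String.format("\\u%04X", cp));
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A string holding a single NUL character, built without a literal.
        String actual = String.valueOf((char) 0);
        System.out.println("Expected \"\" but was \"" + escapeInvalid(actual) + "\"");
    }
}
```

Note that this is lossy in one corner case: a string that already contains a literal backslash-u sequence cannot be told apart from an escaped character, unless backslashes are escaped as well.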

Something else?

If retaining the original string contents is the ultimate goal, a reliable approach would be to preprocess strings with some encoding scheme (Base64, C-style escaping, etc.) before emitting them to the XML. Of course, this would require a decoding step on the other side and also some additions to the XML format (i.e. a different attribute to hold the encoded string, or an additional attribute to flag some other attribute/element as being encoded).
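
A rough Java sketch of the Base64 variant, under stated assumptions: the attribute names "message" and "message-encoding" are invented for illustration and are not part of the Surefire report format, and the string is encoded as UTF-8, so unpaired surrogates would still not round-trip:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class EncodedMessage {
    private EncodedMessage() {}

    // Writer side: the Base64 alphabet contains only characters that are
    // always legal inside an XML attribute value.
    static String encode(String raw) {
        return Base64.getEncoder()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Reader side: restores the original string from the attribute value.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
    }

    // Hypothetical attribute pair: the payload plus a flag telling the
    // consumer that the payload has to be decoded first.
    static String toAttributes(String raw) {
        return "message=\"" + encode(raw) + "\" message-encoding=\"base64\"";
    }
}
```

The price is that the report is no longer human-readable for such messages, which is why a flag attribute that is only set when encoding was actually needed might be the friendlier choice.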

Regards,



Benjamin Bentmann
