>> For invalid characters such as 0x1e there are 3 possible solutions:
>>
>> 1) Discard the character from the output.
>> 2) Replace the character with a numeric representation e.g. "0x1E".
>> 3) Replace the character with an XML element e.g. <char code="30"/>
>
> Nicko
>
>> I favour option 3 above because information is not lost. In options 1
>> and 2 information is lost. In 2 the encoding is not reversible. With
>> 3 the application reading the data requires additional smarts to
>> pick up on the encoded values in the element, but all the original
>> information is preserved. If the app just asks for the text nodes,
>> ignoring the child elements, then they will get back the same result
>> as from 1.
>
> If the application just deserializes the string, they'll end up with
> a much more complex tree structure with a couple of text nodes, an
> attribute node, ....
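For reference, the three options quoted above can be sketched roughly as follows. This is Python rather than log4net's C#, purely for illustration; the function names are made up for this example and none of it is actual log4net code. It assumes XML 1.0 rules for which characters are invalid and ignores unpaired surrogates.

```python
import re

# Characters XML 1.0 does not allow in a document: C0 controls other than
# tab, LF and CR, plus U+FFFE and U+FFFF. (Unpaired surrogates are also
# invalid; this sketch leaves them out.)
INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def escape_markup(text):
    """Escape the characters that are markup-significant in element content."""
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def option1_discard(message):
    """Option 1: drop each invalid character from the output."""
    return escape_markup(INVALID_XML_CHARS.sub("", message))


def option2_numeric_text(message):
    """Option 2: replace each invalid character with text such as "0x1E"."""
    return escape_markup(
        INVALID_XML_CHARS.sub(lambda m: "0x%02X" % ord(m.group()), message)
    )


def option3_element(message):
    """Option 3: replace each invalid character with <char code="N"/>."""
    parts = []
    last = 0
    for m in INVALID_XML_CHARS.finditer(message):
        # Ordinary text between invalid characters is escaped as usual.
        parts.append(escape_markup(message[last:m.start()]))
        parts.append('<char code="%d"/>' % ord(m.group()))
        last = m.end()
    parts.append(escape_markup(message[last:]))
    return "".join(parts)
```

With option 2, a message that contains the literal text "0x1E" and a message that contains the character 0x1e both come out as "0x1E", which is why that encoding cannot be reversed.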
If the app does a GetText on the message element they will get all the
text nodes joined up without the sub elements, which is reasonable, i.e.
just drop the control codes. If they use InnerXml then they get XML
elements, but then they should expect that and live with it!

> I don't see that the transport of binary data is a key purpose for
> log4net.

Much as I dislike option proliferation, log4net should not be throwing
away data, even if it is not very string-like, just because XML doesn't
like it.

> I wonder if it would be reasonable to have 3 as an optional behavior
> but 1 or 2 as a default? What does log4j do in this situation?

log4j's XMLLayout just writes to the output stream through a Writer, so
it does no escaping or numeric character reference encoding. It does
write the message out into a CDATA section, but that should not resolve
the issue. I doubt that it works there either.

Nicko

> --
> Mike Blake-Knox
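To make the reading side concrete, here is a hedged sketch using Python's xml.etree.ElementTree as a stand-in for the .NET XML classes. The <message> wrapper element and the helper names are assumptions for illustration only. It shows a reader that only joins the text nodes, a reader that knows about the <char> elements, and a check of whether a parser accepts a raw 0x1e inside a CDATA section.

```python
import xml.etree.ElementTree as ET


def text_only(element):
    """Join all text nodes and ignore child elements (same result as option 1)."""
    return "".join(element.itertext())


def decode(element):
    """Rebuild the original string, turning <char code="N"/> back into a character."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "char":
            parts.append(chr(int(child.get("code"))))
        # Text that follows a child element is stored as that child's tail.
        parts.append(child.tail or "")
    return "".join(parts)


def parses(xml_text):
    """Return True if the XML parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


message = ET.fromstring('<message>before<char code="30"/>after</message>')
plain = text_only(message)      # control code dropped
restored = decode(message)      # control code recovered
cdata_ok = parses("<message><![CDATA[a\x1eb]]></message>")
```

Here the parser rejects the CDATA document, since CDATA only suppresses markup interpretation and does not widen the set of characters allowed in the document.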
