DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19804>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=19804

XMLSerializer serializes CR, LF and multiple spaces without escaping

------- Additional Comments From [EMAIL PROTECTED]  2003-11-19 16:33 -------
As I said before, this is only an issue if the type of the attribute is
something other than CDATA. Have a look at the algorithm for attribute-value
normalization [1]. Non-CDATA attributes are processed in two steps.
For example, let's assume you have an attribute called 'att' that contains
references to the space character (0x20) and is of type NMTOKENS. Here is
the transformation it undergoes:

Unnormalized:
<elem att="&#x20;&#x20;one&#x20;&#x20;two&#x20;&#x20;three&#x20;&#x20;"/>

Normalization 1st Pass:
<elem att="  one  two  three  "/>

Normalization 2nd Pass (for non-CDATA type):
<elem att="one two three"/>
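
The two passes above can be sketched in a few lines of Python. This is only
an illustration of the normalization rules in [1], not what any parser
actually runs: the function names are made up for this example, and general
entity references (such as &amp;) are deliberately not handled.

```python
import re

# Matches decimal (&#32;) and hexadecimal (&#x20;) character references.
_CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def _literal(segment):
    """Normalize line ends, then map literal tab/LF/CR to a space."""
    segment = segment.replace("\r\n", "\n")
    return re.sub(r"[\t\n\r]", " ", segment)


def first_pass(raw):
    """Pass 1: expand character references and normalize literal whitespace.

    Characters produced by a reference are appended as-is, so an escaped
    LF or CR survives this pass while a literal one becomes a space.
    """
    out = []
    pos = 0
    for match in _CHAR_REF.finditer(raw):
        out.append(_literal(raw[pos:match.start()]))
        ref = match.group(1)
        code = int(ref[1:], 16) if ref.startswith("x") else int(ref)
        out.append(chr(code))
        pos = match.end()
    out.append(_literal(raw[pos:]))
    return "".join(out)


def second_pass(value):
    """Pass 2 (non-CDATA only): trim spaces and collapse runs of spaces."""
    return re.sub(r" +", " ", value.strip(" "))


def normalize_attribute(raw, attr_type="CDATA"):
    """Apply pass 1 always, and pass 2 only when the type is not CDATA."""
    value = first_pass(raw)
    if attr_type != "CDATA":
        value = second_pass(value)
    return value


raw = "&#x20;&#x20;one&#x20;&#x20;two&#x20;&#x20;three&#x20;&#x20;"
print(repr(normalize_attribute(raw, "CDATA")))     # '  one  two  three  '
print(repr(normalize_attribute(raw, "NMTOKENS")))  # 'one two three'
```

Note that the second pass only touches the space character (0x20), which is
why the declared attribute type decides whether the extra spaces survive.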

You cannot preserve these spaces. They're not even part of the infoset,
so the example you've given is not possible. It is possible to modify
the DOM so that it contains an attribute of a non-CDATA type whose value
contains extra spaces, but such values are invalid and can only arise
from such modifications. Even if we tried escaping these spaces, they'd
be gobbled up when the document is read back in. If the attribute's type
is CDATA, there's no reason to escape these spaces, since they aren't
normalized.
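
The read-back behaviour can be checked with any parser that honours the
attribute types declared in the internal DTD subset. Here is a small sketch
using Python's xml.dom.minidom (not the Xerces serializer this bug is about).
It assumes the bundled expat parser applies the declared types; the document
and the attribute name 'txt' are invented for the example.

```python
from xml.dom import minidom

# 'att' is declared NMTOKENS and 'txt' is declared CDATA; both carry the
# same escaped spaces in the serialized form.
DOC = """<?xml version="1.0"?>
<!DOCTYPE elem [
  <!ELEMENT elem EMPTY>
  <!ATTLIST elem att NMTOKENS #IMPLIED
                 txt CDATA    #IMPLIED>
]>
<elem att="&#x20;&#x20;one&#x20;&#x20;two&#x20;&#x20;"
      txt="&#x20;&#x20;one&#x20;&#x20;two&#x20;&#x20;"/>"""


def read_attributes(xml_text):
    """Parse the document and return the (att, txt) values the DOM sees."""
    root = minidom.parseString(xml_text).documentElement
    return root.getAttribute("att"), root.getAttribute("txt")


att, txt = read_attributes(DOC)
print(repr(att))  # NMTOKENS value, after both normalization passes
print(repr(txt))  # CDATA value, after the first pass only
```

Under that assumption, the NMTOKENS value should come back with the extra
spaces collapsed even though they were escaped, while the CDATA value should
keep them.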

[1] http://www.w3.org/TR/REC-xml#AVNormalize

