> (provided that the whitespace normalization algorithm will not
> include <ZWSP> in the whitespaces sequence and treat it
> isolately, something that a conforming HTML or XML processor
> should not do, as it should unify only sequences of <SPACE>,
> <TAB>, <CR>, <LF>, and only according to the context of the
> containing element whitespace properties controlling the
> normalization of XML whitespace sequences (leading, trailing,
> line break preservation, tabulator)...

ZWSP being normalised would be quite a bizarre bug; I can see it happening only if 
someone relied on an isWhiteSpace function provided by a non-XML-aware library, and that 
function considered ZWSP to be whitespace. I've never seen this, although I have seen 
similar assumptions made about how characters act in XML, and some deeply incorrect 
ones about how octets act in XML (that is, they made incorrect assumptions about 
encodings, or even had no thoughts about encodings at all, an error which some 
environments and languages can lead the naïve to).
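
To make the point concrete, here is a minimal sketch in Python of whitespace handling that sticks to the four characters XML itself treats as whitespace (<SPACE>, <TAB>, <CR>, <LF>). The function names are mine, purely for illustration, and the collapsing rule (runs become one space, ends are trimmed) is just one possible normalisation, not what any particular processor does.

```python
import re

# The only characters XML's S production treats as whitespace:
# SPACE (U+0020), TAB (U+0009), CR (U+000D), LF (U+000A).
XML_WHITESPACE = "\x20\x09\x0D\x0A"
_XML_WS_RUN = re.compile("[\x20\x09\x0D\x0A]+")


def is_xml_whitespace(ch: str) -> bool:
    """Hypothetical helper: True only for the four XML whitespace characters."""
    return len(ch) == 1 and ch in XML_WHITESPACE


def collapse_xml_whitespace(text: str) -> str:
    """Hypothetical helper: collapse runs of XML whitespace to one space
    and trim both ends. Everything else, ZWSP included, is left alone."""
    return _XML_WS_RUN.sub(" ", text).strip(XML_WHITESPACE)


ZWSP = "\u200b"
NBSP = "\u00a0"

# ZWSP sits between two whitespace runs; each run collapses separately
# and the ZWSP itself survives.
sample = "a \t\n" + ZWSP + "  b"
collapsed = collapse_xml_whitespace(sample)

# A generic, non-XML-aware test can disagree with XML: Python's
# str.isspace() accepts NO-BREAK SPACE, which XML does not.
generic_says_nbsp_is_space = NBSP.isspace()
xml_says_nbsp_is_space = is_xml_whitespace(NBSP)
```

The bug I describe above would amount to swapping is_xml_whitespace for something like the generic test on the last lines.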

<NEL> and <LSEP> are added to your list of characters affected by whitespace 
normalisation in XML 1.1. Possibly some people implemented the suggestion in 
<http://www.w3.org/TR/newline> before 1.1.
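
As I read the XML 1.1 end-of-line rules, <CR><LF>, <CR><NEL>, a lone <NEL>, <LSEP>, and a lone <CR> each become a single <LF> on input, whereas 1.0 only handles <CR><LF> and lone <CR>. A rough sketch of the difference in Python (function names are my own, and this ignores everything else a parser does):

```python
import re

# Longer sequences come first so that CR LF and CR NEL are each
# consumed as one line end rather than two.
_XML11_EOL = re.compile("\r\n|\r\x85|\x85|\u2028|\r")
_XML10_EOL = re.compile("\r\n|\r")


def normalize_line_ends_xml11(text: str) -> str:
    """Hypothetical helper: XML 1.1 style line-end normalisation."""
    return _XML11_EOL.sub("\n", text)


def normalize_line_ends_xml10(text: str) -> str:
    """Hypothetical helper: XML 1.0 style line-end normalisation."""
    return _XML10_EOL.sub("\n", text)


mixed = "a\r\nb\x85c\u2028d\r\x85e\rf"
as_xml11 = normalize_line_ends_xml11(mixed)
as_xml10 = normalize_line_ends_xml10(mixed)
```

Under the 1.0 rules the <NEL> and <LSEP> characters pass through untouched, which is where an early implementation of the newline note would behave differently.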

