On 26/11/2003 02:29, Philippe Verdy wrote:

[EMAIL PROTECTED] wrote:


Briefly, it's my opinion that applications which claim to support
and comply with Unicode should not 'step on' Unicode text. Any
loopholes in the 'letter of the law' which allow applications to
mung or reject Unicode text should be plugged.



If this "plugging" request must be done, it should also be done for HTML and XML. For now, combining characters can be encoded directly just after a quote character (single or double) used to mark the beginning of an attribute value, or just after a tag-closing ">". HTML and XML parsers will parse these quotes or greater-than signs by ignoring the combining sequence, creating defective sequences, and this is a problem.

...


Why is this a problem? Quotes and ">" with combining marks are presumably not legal HTML or XML; and so the interpretation of a quote or ">" followed by combining marks as a quote or ">" and a defective combining sequence is unambiguous, surely? There could of course be problems if there were any precomposed combinations of quotes or ">" with combining characters, but I don't think there are any, are there?
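
The question about precomposed combinations can be checked against the Unicode Character Database. The following is a minimal sketch, assuming Python's standard unicodedata module; the helper name precomposed_with_base is hypothetical and chosen only for illustration. It scans every code point for a canonical decomposition that starts with a given base character followed only by combining marks.

```python
import sys
import unicodedata


def precomposed_with_base(base):
    """Return characters whose canonical decomposition is `base` + combining marks.

    Hypothetical helper for illustration; results depend on the Unicode
    version bundled with the running Python interpreter.
    """
    found = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        decomp = unicodedata.decomposition(ch)
        # Skip characters with no decomposition, and compatibility
        # decompositions, which are written with a leading "<tag>".
        if not decomp or decomp.startswith("<"):
            continue
        parts = [chr(int(h, 16)) for h in decomp.split()]
        if (
            len(parts) > 1
            and parts[0] == base
            and all(unicodedata.combining(p) for p in parts[1:])
        ):
            found.append(ch)
    return found


for base in ('"', "'", ">"):
    hits = precomposed_with_base(base)
    print(repr(base), [f"U+{ord(c):04X} {unicodedata.name(c)}" for c in hits])
```

With the Unicode data in current Python releases, this reports nothing for the two quote characters, but it does report U+226F NOT GREATER-THAN for ">", whose canonical decomposition is U+003E U+0338. A process that applies NFC normalization before parsing could therefore turn ">" followed by U+0338 into a single character.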

Your proposed solution to the problem is messy in requiring the use of numeric entities, and unnecessary.

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




