On 10/08/2004 18:33, Jon Hanna wrote:
...
As for modern markup, consider if instead of &#x0304; you had &#x0338;
By the rules of XML that is treated as if the character U+0338 was there rather
than the escape sequence.
By the rules of Unicode the sequence U+003E, U+0338 is treated the same as the
character U+226F.
By the rules of XML replacing >&#x0338; with U+226F would mean the document was
no longer well-formed.
So even without an explicit spec saying otherwise the above would be
problematic.
This means that the rules of XML conflict with the rules of Unicode. If
the string is a Unicode string, U+226F is canonically equivalent to
<U+003E, U+0338> and therefore any higher level protocol should treat
the two sequences as identical, rather than reject one of them as
causing the document to be ill-formed.
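
The conflict can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the element name "e" and the helper
is_well_formed are hypothetical, since the markup around the escape was not
given in the quoted message, and the standard-library parser stands in for
"the rules of XML".

```python
import unicodedata
import xml.etree.ElementTree as ET

GT = "\u003E"       # GREATER-THAN SIGN, also the end of an XML tag
SOLIDUS = "\u0338"  # COMBINING LONG SOLIDUS OVERLAY
NOT_GT = "\u226F"   # NOT GREATER-THAN

# Unicode: <U+003E, U+0338> composes to U+226F under NFC.
composed = unicodedata.normalize("NFC", GT + SOLIDUS)


def is_well_formed(doc: str) -> bool:
    """Hypothetical helper: True if the stdlib parser accepts doc."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Hypothetical one-element document; "e" is an assumed element name.
escaped = "<e>&#x0338;</e>"               # character reference form
literal = "<e>" + SOLIDUS + "</e>"        # the character itself
normalized = unicodedata.normalize("NFC", literal)  # tag-closing ">" is consumed

print(composed == NOT_GT)
print(is_well_formed(escaped), is_well_formed(literal))
print(is_well_formed(normalized))
```

Normalizing the literal form merges the ">" that closes the start tag with
the following U+0338, so the parser no longer sees a complete tag, even
though the two strings are canonically equivalent.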
--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/