John Cowan <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED]
> scripsit:
> 
> > (The XML1.1 spec removes a few of those characters, I would have
> > removed more, but that's another issue).
> 
> You have no idea what fearful drubbings I had to administer to get
> even the few removed that I did.

Well, I have a general tendency towards being liberal in these matters (as I've said 
before, allowing nonsense is *sometimes* a good way to ensure you allow edge cases), so 
I can see where the objectors would be coming from.

> > [D]oes ISO 10646 allow those characters even though Unicode has them
> > undefined?
> 
> No, it doesn't.  There was a strong feeling in the W3C Core WG that
> it be possible to handle the Astral Planes uniformly; every character
> off the BMP, therefore, is a valid Char as well as a valid NameStartChar.

Hmm. To my mind that isn't uniform at all. Someone familiar with Unicode would 
already have disallowed, say, U+4FFFE as a noncharacter before they got as far as the 
production (making it effectively excluded), whereas someone relying on the XML spec 
for information about character properties would allow it.
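
To make the mismatch concrete, here is a rough sketch in Python. The function names
are my own invention, the Char ranges are the XML 1.1 production as I read it, and the
noncharacter test follows the Unicode definition (U+FDD0..U+FDEF plus the last two
code points of every plane), so treat it as an illustration rather than a reference
implementation.

```python
def is_xml11_char(cp: int) -> bool:
    """True if cp matches the XML 1.1 Char production (as assumed here):
    [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_unicode_noncharacter(cp: int) -> bool:
    """True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        return False
    # U+FDD0..U+FDEF, or a code point whose low 16 bits are FFFE or FFFF.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def allowed_by_xml_but_noncharacter(cp: int) -> bool:
    """The case under discussion: XML lets it through, Unicode does not."""
    return is_xml11_char(cp) and is_unicode_noncharacter(cp)


if __name__ == "__main__":
    for cp in (0x4FFFE, 0xFFFE, 0x10000):
        print(f"U+{cp:04X}", is_xml11_char(cp), is_unicode_noncharacter(cp))
```

On the BMP the production excludes U+FFFE and U+FFFF explicitly, whereas the same
pair at the end of every supplementary plane passes, which is the non-uniformity I mean.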

Maybe CharMod will save us all...




