Kenneth Whistler <kenw at sybase dot com> wrote:

> But I challenge you to find anything in the standard that
> *prohibits* such sequences from occurring.

I've learned that this question of "illegal" or "invalid" character
sequences is one of the main distinguishing factors between those who
truly understand Unicode and those who are still on the Road to
Enlightenment.

Very, very few sequences of Unicode characters are truly "invalid" or
"illegal."  Unpaired surrogates are a rare exception, since a lone
surrogate is ill-formed in UTF-8, UTF-16, and UTF-32 alike.

In almost all cases, a given sequence might give unexpected results
(e.g. putting a combining diacritic before the base character) or might
be ineffectual (e.g. putting a variation selector before an arbitrary
character), but it is still perfectly legal to encode and exchange such
a sequence.
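
As a rough illustration of the distinction, here is a minimal sketch
that assumes Python 3 and its built-in strict UTF-8 codec.  That choice
of tool is an assumption made for the example, not something the
standard prescribes, and the helper name can_encode is hypothetical.
It only shows that the "odd" sequences encode without complaint while a
lone surrogate is rejected.

```python
def can_encode(text: str) -> bool:
    """Return True if Python's strict UTF-8 encoder accepts the string."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# Combining diacritic placed before the letter: odd, but encodable.
diacritic_first = "\u0301a"   # U+0301 COMBINING ACUTE ACCENT, then "a"

# Variation selector placed before an arbitrary letter: ineffectual, but encodable.
selector_first = "\ufe0fa"    # U+FE0F VARIATION SELECTOR-16, then "a"

# Unpaired surrogate: the strict encoder refuses it.
lone_surrogate = "\ud800"     # U+D800, a high surrogate with no partner

for label, sample in [
    ("diacritic first", diacritic_first),
    ("selector first", selector_first),
    ("lone surrogate", lone_surrogate),
]:
    print(label, "->", "encodable" if can_encode(sample) else "rejected")
```

Whether a renderer does anything sensible with the first two strings is
a separate question from whether they may be encoded and exchanged,
which is the point above.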

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/

