Dan Sugalski writes:
: Fair enough. I think there are some cases where there's a base/combining
: pair of codepoints that don't map to a single combined-character code
: point. Not matching on a glyph boundary could make things really odd, but
: I'd hate to have the checking code on by default, since that'd slow down
: the common case where the string in NFC won't have those.

Assume that in practice most of the normalization will be done by the
input disciplines. Then we might have a pragma that says to try to
enforce level 1, level 2, or level 3 if your data doesn't match your
expectations. Then hopefully the expected semantics of the operators
will usually (I almost said "normally" :-) match the form of the data
coming in, and forced conversions will be rare.

That's how I see it currently. But the smarter I get the less I know.

Larry
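
To make the two points above concrete, here is a minimal sketch in Python (not Perl, and not anything from the thread), using only the standard unicodedata module. The names read_normalized and has_combining_marks are hypothetical, invented for this illustration. The first stands in for an "input discipline" that normalizes once on the way in; the second is the kind of check Dan would rather not pay for by default. The example assumes q + COMBINING TILDE as a pair with no precomposed code point.

```python
import unicodedata


def read_normalized(raw: str, form: str = "NFC") -> str:
    """Hypothetical 'input discipline': normalize text once, on the way in."""
    return unicodedata.normalize(form, raw)


def has_combining_marks(s: str) -> bool:
    """True if any character in s has a nonzero canonical combining class."""
    return any(unicodedata.combining(ch) != 0 for ch in s)


# e + COMBINING ACUTE ACCENT: NFC composes this pair into one code point.
composed = read_normalized("e\u0301")

# q + COMBINING TILDE: no precomposed code point exists, so the pair
# survives NFC as two code points.
leftover = read_normalized("q\u0303")

print(len(composed), has_combining_marks(composed))
print(len(leftover), has_combining_marks(leftover))
```

Normalizing at input keeps the common case cheap: most NFC strings carry no combining marks, so a scan like has_combining_marks only matters for the rarer data that does.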