Re: Unicode handling

Larry Wall Tue, 27 Mar 2001 07:04:11 -0800

Dan Sugalski writes:
: Fair enough. I think there are some cases where there's a base/combining 
: pair of codepoints that don't map to a single combined-character code 
: point. Not matching on a glyph boundary could make things really odd, but 
: I'd hate to have the checking code on by default, since that'd slow down 
: the common case where the string in NFC won't have those.

Assume that in practice most of the normalization will be done by the
input disciplines.  Then we might have a pragma that says to try to
enforce level 1, level 2, level 3 if your data doesn't match your
expectations.  Then hopefully the expected semantics of the operators
will usually (I almost said "normally" :-) match the form of the data
coming in, and forced conversions will be rare.

That's how I see it currently.  But the smarter I get the less I know.

Larry
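
To illustrate the case Dan describes above, here is a minimal Python sketch
(not Perl, and not anything proposed in this thread). It assumes that
normalization to NFC happens once at input, standing in for an "input
discipline". The helper names normalize_input and has_uncomposed_marks are
hypothetical; only the standard unicodedata module is real. The sketch shows
one base/combining pair that NFC folds into a single code point and one that
it cannot, because no precomposed character exists for it.

```python
import unicodedata


def normalize_input(text: str) -> str:
    """Hypothetical 'input discipline': normalize to NFC once, at the boundary."""
    return unicodedata.normalize("NFC", text)


def has_uncomposed_marks(text: str) -> bool:
    """True if any code point has a nonzero canonical combining class.

    This is the kind of check one might leave off by default and enable
    only when the data does not match expectations.
    """
    return any(unicodedata.combining(ch) != 0 for ch in text)


# e + COMBINING ACUTE ACCENT composes to U+00E9 under NFC.
composable = normalize_input("e\u0301")

# q + COMBINING TILDE has no precomposed form, so NFC leaves two code points.
stubborn = normalize_input("q\u0303")

print(len(composable), has_uncomposed_marks(composable))
print(len(stubborn), has_uncomposed_marks(stubborn))
```

Strings like the second one are where matching off a glyph boundary could
go wrong even when the data is already in NFC.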
