Patrick R. Michaud via RT wrote:
> (3)  Case conversions on a string *can* cause its length to change -- in
> particular, the character "ß" (U+00DF) becomes "SS" when converted to
> uppercase.  (I'm not sure that we have any tests for this at present,
> and it probably doesn't work when ICU isn't present.)

How can you doubt, when a German hacker takes care of the test suite?
After all, we're about the only ones who use that weird character ;-)

It's in t/spec/S32-str/uc.t, line 45; for regexes this is tested in
S05-modifier/ignorecase.t.

Actually there might be other codepoints that turn into multiple
codepoints on conversion to upper case; in particular if there's a
precomposed character of a lower case letter and some diacritics, but no
upper case equivalent precomposed character exists in the Unicode
repertoire. (I don't know if such a thing actually exists, but it's
entirely possible.)
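
One way to check this would be to scan the whole codepoint range and
collect every character whose upper-case form is longer than one
codepoint. The following is only a rough sketch in Python, not Perl 6,
and it assumes that Python's str.upper() applies the full Unicode case
mappings from whatever Unicode data that Python build ships with. The
function name expanding_uppercase is made up for illustration.

```python
def expanding_uppercase(limit=0x110000):
    """Return (char, upper) pairs where upper-casing a single
    codepoint yields more than one codepoint.

    Assumption: str.upper() uses full (not simple) case mappings,
    so the result depends on the Unicode data in this Python build.
    """
    pairs = []
    for cp in range(limit):
        ch = chr(cp)
        up = ch.upper()
        if len(up) > 1:
            pairs.append((ch, up))
    return pairs


if __name__ == "__main__":
    for ch, up in expanding_uppercase():
        # Show the source codepoint and the codepoints it expands to.
        expansion = " ".join(f"U+{ord(c):04X}" for c in up)
        print(f"U+{ord(ch):04X} -> {expansion}")
```

If the assumption holds, "ß" (U+00DF) should show up in that list, and so
should "ǰ" (U+01F0), which has no precomposed upper-case form and becomes
"J" followed by a combining caron (U+030C).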

Cheers,
Moritz
