Simon Josefsson wrote:
> >> I'm looking for an implementation of the toCasefold(X) operation defined
> >> in Unicode 6.0 section 3.13 page 114 [1] like this:
> >>
> >> R4 toCasefold(X): Map each character C in X to Case_Folding(C).
> >>
> >> • Case_Folding(C) uses the mappings with the status field value “C” or
> >>   “F” in the data file CaseFolding.txt in the Unicode Character
> >>   Database.
> ...
> But does u32_casefold match Unicode toCasefold?  Is it possible to
> disable the SpecialCasing stuff?

SpecialCasing.txt applies only to the toUpper, toLower, and toTitle mappings.

For toCasefold, all mappings are given in CaseFolding.txt, namely:
  - the locale-independent mappings (type 'C' and 'F'),
  - the locale-dependent mappings (type 'T'), which play a role similar
    to SpecialCasing.txt.

u32_casefold uses all of these mappings. When you pass an empty string
as ISO639_LANGUAGE, it uses only the locale-independent mappings (type 'C'
and 'F'), and hence it matches what toCasefold does.

Bruno
-- 
In memoriam John Penry <http://en.wikipedia.org/wiki/John_Penry>
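
To make the distinction between the mapping types concrete, here is a minimal sketch of how the status field selects a mapping. It is not libunistring's implementation and does not call u32_casefold. The names fold_entry, fold_table, fold_lookup and sketch_casefold, and the use_turkic flag, are hypothetical. The table holds only six rows copied from CaseFolding.txt, and treating "language is Turkish or Azeri" as a single boolean is an assumption made for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* One row of CaseFolding.txt: source code point, status, folded sequence. */
struct fold_entry {
    uint32_t code;
    char status;            /* 'C', 'F', 'S' or 'T' */
    uint32_t target[3];
    size_t target_len;
};

/* A few rows copied from CaseFolding.txt; the real file has many more. */
static const struct fold_entry fold_table[] = {
    { 0x0041, 'C', { 0x0061 }, 1 },          /* A -> a */
    { 0x0049, 'C', { 0x0069 }, 1 },          /* I -> i */
    { 0x0049, 'T', { 0x0131 }, 1 },          /* I -> dotless i (Turkic) */
    { 0x00DF, 'F', { 0x0073, 0x0073 }, 2 },  /* sharp s -> ss */
    { 0x0130, 'F', { 0x0069, 0x0307 }, 2 },  /* I with dot -> i + combining dot */
    { 0x0130, 'T', { 0x0069 }, 1 },          /* I with dot -> i (Turkic) */
};

/* Pick the mapping for c.  'C' and 'F' rows always qualify; a 'T' row is
   considered only when use_turkic is nonzero, and then takes precedence. */
static const struct fold_entry *fold_lookup(uint32_t c, int use_turkic)
{
    const struct fold_entry *found = NULL;
    size_t i;

    for (i = 0; i < sizeof fold_table / sizeof fold_table[0]; i++) {
        const struct fold_entry *e = &fold_table[i];
        if (e->code != c)
            continue;
        if (e->status == 'T') {
            if (use_turkic)
                return e;
        } else if ((e->status == 'C' || e->status == 'F') && found == NULL) {
            found = e;
        }
    }
    return found;
}

/* Fold s[0..n-1] into out.  Returns the number of code points written,
   or (size_t)-1 if out_cap is too small.  With use_turkic == 0 this is
   toCasefold restricted to the rows in fold_table. */
size_t sketch_casefold(const uint32_t *s, size_t n, int use_turkic,
                       uint32_t *out, size_t out_cap)
{
    size_t len = 0;
    size_t i, j;

    for (i = 0; i < n; i++) {
        const struct fold_entry *e = fold_lookup(s[i], use_turkic);
        size_t add = (e != NULL) ? e->target_len : 1;

        if (len + add > out_cap)
            return (size_t)-1;
        if (e == NULL) {
            out[len++] = s[i];      /* no mapping: character folds to itself */
        } else {
            for (j = 0; j < e->target_len; j++)
                out[len++] = e->target[j];
        }
    }
    return len;
}
```

Calling sketch_casefold with use_turkic set to 0 corresponds to the empty-language case described above: only the 'C' and 'F' rows are consulted, so U+0049 folds to U+0069 rather than to U+0131.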