Mark *— Il meglio è l’inimico del bene —*
On Sat, Jun 11, 2011 at 08:04, Karl Williamson <pub...@khwilliamson.com> wrote:

> On 06/08/2011 03:33 PM, Mark Davis ☕ wrote:
>
>> As to the first, it would seem reasonable. The simple folding is not
>> covered by the following stability policies:
>>
>> http://www.unicode.org/policies/stability_policy.html#Case_Folding
>> http://www.unicode.org/policies/stability_policy.html#Case_Pair
>>
>> However, the committee may be leery of changing these even though they
>> are not covered by those policies. You can file a request form for the
>> committee to consider it, at
>> http://unicode.org/reporting.html
>>
>> The other two are special cases; they casefold together because of the
>> way that the full case mapping is computed. Their equivalence is
>> normally captured by a canonical-equivalent folding. Because the simple
>> folding is only codepoint by codepoint, and only resulting in single
>> code points, they can't be added.
>>
> I didn't understand the sentence above. But would it be fair to say that
> a plausible case could be made for FB06 folding to FB05 simply, but that
> there really shouldn't be a simple fold for the other two cases?
>

Yes, that's what I mean. You can propose all three if you want, via the
reporting form, but I think only #1 is a real possibility (IMO).

>> Mark
>>
>> /— Il meglio è l’inimico del bene —/
>>
>> On Sun, Jun 5, 2011 at 08:17, Karl Williamson <pub...@khwilliamson.com> wrote:
>>
>>     There are three pairs of characters in Unicode 6.0 in which each
>>     member of the pair has a full fold to the same sequence, yet there
>>     is no simple fold relation between them. They are:
>>
>>     U+FB05 LATIN SMALL LIGATURE LONG S T and
>>     U+FB06 LATIN SMALL LIGATURE ST
>>     both fold to 'st';
>>
>>     U+0390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS
>>     U+1FD3 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA
>>     both fold to the sequence "U+03B9 U+0308 U+0301" or (the dot
>>     standing for concatenation)
>>     GREEK SMALL LETTER IOTA . COMBINING DIAERESIS . COMBINING ACUTE ACCENT
>>
>>     U+03B0 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS
>>     U+1FE3 GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA
>>     both fold to the sequence "U+03C5 U+0308 U+0301" or
>>     GREEK SMALL LETTER UPSILON . COMBINING DIAERESIS . COMBINING ACUTE
>>     ACCENT
>>
>>     Under full case folding rules, each member of one of these pairs is
>>     caselessly equivalent to the other member, even without adding NFD
>>     rules. Correct me if I'm wrong, but shouldn't they also be
>>     caselessly equivalent under simple folding rules? If so, I'm
>>     wondering what issues there would be in creating an S rule for these
>>     pairs in CaseFolding.txt, so that they would be considered
>>     caselessly equivalent even for applications that don't do full case
>>     folding?
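
For readers who want to check the three pairs themselves, here is a minimal Python sketch. It relies on str.casefold(), which performs full case folding. Python has no built-in simple folding, so approx_simple_fold below is a hypothetical helper that only imitates the "single code point in, single code point out" restriction mentioned in the thread. It is not a faithful implementation of the S/C rules in CaseFolding.txt (a character that has both an S and an F entry would come out wrong), but it should be enough to show why the pairs match under full folding and not under a one-to-one folding.

```python
# Sketch only. str.casefold() is Python's built-in full case folding.
# approx_simple_fold is a hypothetical helper, not a real simple-fold API.

PAIRS = [
    ("\uFB05", "\uFB06"),  # LATIN SMALL LIGATURE LONG S T / ST
    ("\u0390", "\u1FD3"),  # GREEK IOTA WITH DIALYTIKA AND TONOS / OXIA
    ("\u03B0", "\u1FE3"),  # GREEK UPSILON WITH DIALYTIKA AND TONOS / OXIA
]


def full_fold(text: str) -> str:
    """Full case folding, which may expand one code point into several."""
    return text.casefold()


def approx_simple_fold(text: str) -> str:
    """Assumption: imitate simple folding by accepting a fold only when it
    is a single code point; otherwise leave the character unchanged."""
    out = []
    for ch in text:
        folded = ch.casefold()
        out.append(folded if len(folded) == 1 else ch)
    return "".join(out)


def report():
    """Return (first, second, equal under full fold, equal under approx)."""
    rows = []
    for a, b in PAIRS:
        rows.append((
            f"U+{ord(a):04X}",
            f"U+{ord(b):04X}",
            full_fold(a) == full_fold(b),
            approx_simple_fold(a) == approx_simple_fold(b),
        ))
    return rows


if __name__ == "__main__":
    for row in report():
        print(row)
```

Under these assumptions, each pair compares equal after full_fold and stays distinct after approx_simple_fold, which is the gap the original question is about.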