On 7/27/2018 3:42 AM, Michael Everson via Unicode wrote:
Yes and it explains clearly that “effectively caseless Georgian” is incorrect. Georgian has case. Georgian uses case differently from other scripts. This is an orthographic distinction, not a structural one. In fact as it is also stated in the proposal, there are 19th-century texts which do titlecase. It’s just that that orthography is no longer in use and that behaviour no longer desirable.

"Georgian uses case differently from other scripts"

That's one of the key issues here for developers (and users) of libraries, because it means that any implicit assumption about the applicability of a given case transform is now broken.

This goes beyond whether fonts are actually installed now or at the end of some transition period, or ever: if functions like ToUpper, which previously had no effect on Georgian, suddenly do have one, in ways that the users of the script do not expect, then your application is broken, from one day to the next.

The situation prior to the change is perhaps best characterized by saying that there was support for some locale differences in the way certain characters were mapped, but not in whether or not to do a given mapping at all.

If, as has been suggested, the use of case in Georgian is more similar to that of small caps in other scripts, then, instead of ToUpper doing a case transformation for Georgian, what would be needed is something like a "ToSmallCaps" function (a better name is needed here, because the Georgian letters aren't actually "small caps").

That way, the existing "ToUpper" could retain its implicit semantic of "uppercase transformation in those scripts where such transformations are used in a common way".
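To illustrate that implicit semantic, here is a minimal sketch in Python of an uppercase function that simply skips Georgian Mkhedruli. The name to_upper_common is hypothetical and this is not an ICU API; it assumes the Mkhedruli letters with Mtavruli counterparts are U+10D0..U+10FA and U+10FD..U+10FF, and it works character by character, so it ignores any context-sensitive rules.

```python
# Mkhedruli letters that have Mtavruli (uppercase) counterparts.
# Assumption: ranges as in the Georgian block, U+10D0..U+10FA and U+10FD..U+10FF.
MKHEDRULI_RANGES = ((0x10D0, 0x10FA), (0x10FD, 0x10FF))


def is_mkhedruli(ch: str) -> bool:
    """Return True if ch is a Mkhedruli letter that could be case-mapped."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in MKHEDRULI_RANGES)


def to_upper_common(text: str) -> str:
    """Uppercase text, but leave Georgian Mkhedruli letters untouched.

    Hypothetical helper: mirrors the behaviour ToUpper had for Georgian
    before the Mtavruli mappings existed.
    """
    return "".join(ch if is_mkhedruli(ch) else ch.upper() for ch in text)
```

With this, Latin, Greek, Cyrillic and so on are uppercased as before, while Georgian text passes through unchanged regardless of which Unicode version the runtime supports.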

This would solve half of the problem, which is to prevent uppercasing where users of Georgian do not expect it. However, it does not work in plain text for the other scripts, because there, small caps are not encoded, so there's no plain-text solution.

To get back to Markus' original question on how to handle this for ICU: it seems more and more that Georgian should be exempted from standard library functions and that a new function needs to be added that just transforms Georgian and leaves all other scripts alone (or one that takes a language/locale parameter).
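As a rough sketch of what such a Georgian-only function could look like (in Python, with a hypothetical name, not an existing ICU API): it assumes the Mtavruli letters sit at a fixed offset from their Mkhedruli counterparts, U+10D0 to U+1C90, and it touches nothing else.

```python
# Assumption: Mkhedruli U+10D0..U+10FA and U+10FD..U+10FF map to
# Mtavruli U+1C90..U+1CBA and U+1CBD..U+1CBF at a constant offset.
MKHEDRULI_RANGES = ((0x10D0, 0x10FA), (0x10FD, 0x10FF))
MTAVRULI_OFFSET = 0x1C90 - 0x10D0


def georgian_to_mtavruli(text: str) -> str:
    """Map Mkhedruli letters to Mtavruli; leave every other character alone.

    Hypothetical helper: an explicit, opt-in transform for Georgian only.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if any(lo <= cp <= hi for lo, hi in MKHEDRULI_RANGES):
            out.append(chr(cp + MTAVRULI_OFFSET))
        else:
            # Other scripts, punctuation and digits pass through unchanged.
            out.append(ch)
    return "".join(out)
```

A caller who actually wants Mtavruli would call this explicitly; callers of the generic uppercase function would see no change for Georgian.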

A./

