RE: Unicode 11 Georgian uppercase vs. fonts

Peter Constable via Unicode Fri, 20 Jul 2018 00:23:47 -0700

IMO, the correct answer is 2, except that “all common fonts” is more sweeping 
than necessary: it’s sufficient to have fonts used for fallback in platforms 
and browsers, and the related fallback logic, to get updated. Of course, that 
takes some time, and it’s not even two months since Unicode 11 was released. 
The Georgian community understood that it would take time to get 
implementations in place, and that they would need to take measures to smooth 
over that transition — which can include having Web sites for Georgian 
businesses and institutions using fonts to match the requirements of the 
content.

Peter

From: Unicore <unicore-boun...@unicode.org> On Behalf Of Markus Scherer via 
Unicore
Sent: Wednesday, July 18, 2018 3:05 PM
To: unicore UnicoRe Discussion <unic...@unicode.org>
Cc: mark <m...@macchiato.com>
Subject: Unicode 11 Georgian uppercase vs. fonts

Dear fellow Unicoders,

We’ve run into some significant problems with the Georgian capital letters 
added in Unicode 11. If you have run into them yourselves, or have feedback on 
our brainstormed solutions below, we’d love to hear your thoughts.

Here's the problem. The vast majority of Georgian fonts do not yet have the new 
uppercase characters. So when any system uses case mapping to uppercase text 
(e.g. browsers interpreting CSS’s text-transform: capitalize), then the users 
of Georgian will see boxes (“tofu”) if the font they are using does not have 
the glyphs.

For example, a program constructs a web page with buttons. It uses a CSS style 
to uppercase text in buttons, as a house style. Unless the user has a very 
up-to-date font, they see tofu (boxes). If a server does backend rendering, its 
font has to be very up-to-date. We also saw this problem in a program that was 
doing titlecasing, but on the first character it used the uppercase mappings 
rather than titlecase mappings. Not the right thing to do, of course, but code 
that accidentally works (most of the time) doesn't get fixed if nobody reports 
a bug about it.
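
To make the titlecasing problem concrete, here is a minimal Python sketch. It assumes an interpreter whose Unicode database is at version 11.0 or later (CPython 3.7 and up should qualify). The helper names naive_titlecase and proper_titlecase are made up for illustration and are not taken from any of the affected programs.

```python
import unicodedata

# "Tbilisi" in Georgian Mkhedruli letters (U+10D7 U+10D1 U+10D8 U+10DA U+10D8 U+10E1 U+10D8).
WORD = "\u10d7\u10d1\u10d8\u10da\u10d8\u10e1\u10d8"


def naive_titlecase(word):
    """Buggy pattern: applies the UPPERCASE mapping to the first character."""
    return word[:1].upper() + word[1:]


def proper_titlecase(word):
    """Applies the TITLECASE mapping to the first character instead."""
    return word[:1].title() + word[1:]


def codepoints(text):
    """Formats a string as a list of U+XXXX code points for inspection."""
    return ["U+%04X" % ord(ch) for ch in text]


if __name__ == "__main__":
    print("Unicode data version:", unicodedata.unidata_version)
    # With Unicode 11+ data the first letter becomes a Mtavruli capital,
    # which older fonts cannot display.
    print("naive :", codepoints(naive_titlecase(WORD)))
    # The titlecase mapping of a Mkhedruli letter is the letter itself.
    print("proper:", codepoints(proper_titlecase(WORD)))
```

With Unicode 11 data, the naive version turns the first letter into a Mtavruli capital (U+1C97 here), while the titlecase mapping leaves the word unchanged, so only the naive version depends on updated fonts.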

All of these will result in bad bugs in the UI, in software that formerly 
worked fine.

We brainstormed some options to fix this:

  1.  Get all call sites to change their code to not uppercase Georgian (and 
fix titlecasing to use the titlecase mappings, not the uppercase mappings). 
Since we have no control over call sites and release cycles of affected 
software, this would not help Georgian users for a long time, if ever. We’d 
eventually want to retract these changes, creating even more work.
  2.  Change all common fonts with Georgian characters to add the U11.0 ones. 
This should eventually happen but would probably take a couple of years at 
least, which does not help users in the short term.
  3.  Hack the fonts' cmap tables to just map the new characters to the glyphs 
of the old ones. Works but only when a programmer can control the fonts used, 
such as with server-side rendering or downloadable fonts.
  4.  Remove the uppercase mappings for Georgian, until the fonts catch up.

     *   Would at least have to be done in all browsers, otherwise web apps 
will still break for Georgian.
     *   A broader alternative is to do it in ICU. Because that is used by the 
majority of the browser implementations, it would solve the short-term problem 
for the browsers — and many other programs. Drawback: Non-conformant, and 
uppercasing will be inconsistent depending on who has which variant of ICU 
(with vs. without hack, on top of: with Unicode 11 vs. before Unicode 11).

        *   One precedent is that in CLDR we deliberately hold back from using 
new currency characters until the font support is sufficiently widespread. 
(Wishing we'd held back the uppercase mappings in Unicode 11.0 too!)
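
As a rough illustration of what "not uppercasing Georgian" in options 1 and 4 could look like in code, here is a minimal Python sketch. The function names are hypothetical, the code-point ranges are the Mkhedruli letters that gained uppercase mappings in Unicode 11 (U+10D0..U+10FA and U+10FD..U+10FF), and this is not how ICU or any browser actually implements it.

```python
def is_mkhedruli_letter(ch):
    """True for Georgian Mkhedruli letters that have Mtavruli uppercase forms."""
    cp = ord(ch)
    return 0x10D0 <= cp <= 0x10FA or 0x10FD <= cp <= 0x10FF


def upper_except_georgian(text):
    """Uppercases text, but leaves Georgian Mkhedruli letters as they are.

    Each character is mapped on its own; str.upper() still applies full
    mappings such as U+00DF -> "SS" to the non-Georgian characters.
    """
    return "".join(
        ch if is_mkhedruli_letter(ch) else ch.upper() for ch in text
    )


if __name__ == "__main__":
    label = "ok \u10d0\u10d1"  # Latin text followed by two Mkhedruli letters
    print(upper_except_georgian(label))
```

Text that already contains Mtavruli capitals passes through unchanged, so a workaround like this only avoids introducing new capitals that the font may not cover.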

Mark & Markus
