Re: Digit/letter variants in the "same" unified script (was: stability policy on numeric type = decimal)

Philippe Verdy Thu, 29 Jul 2010 09:22:06 -0700

"Mark Davis ☕" <[email protected]>
> It is not so strange. Read
> http://www.unicode.org/reports/tr24/proposed.html#Multiple_Script_Values,
> and other parts of #24 describing Common.


It is exactly because I had read this proposed update to UAX #24 that
I made my argument (otherwise I would not have mentioned the
ExtendedScript property in my report: isn't it meant to allow more
precise mappings to ISO 15924, including script variants?).

Nothing would be special about "Common": "sc=Arabic" (alias "sc=Arab")
could be subdivided by the new "extended script" property using the
same formalism (the one also used for "Hani" and "Jpan", which are
defined as multiple scripts or script variants).

It's true that, for now, Unicode is unable to distinguish between
"Hans" and "Hant" on the basis of the encoded abstract characters alone
(so for them we have only "sc=Hani"), but an "extended script" property
could provide more precise mappings without being completely bound by
the stability policy.

But that does not mean that texts and localization resources can't make
such distinctions through external tagging, stylesheets, or
romanization schemes. Librarians (and book readers) already
distinguish between the Eastern and Western versions of the unified
Arabic script.

It could even be useful within IDNA, helping to diagnose digits that
have confusable forms in the two variants (even though work is in
progress on defining the confusables needed for IDNA). Adding the
extra ISO 15924 codes (for Arabic variants) would not break Unicode:
after all, there are already variant codes for Latin and Sinograms,
exactly because of these "font variants".
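
To illustrate the kind of diagnosis meant here, the following is only a
minimal sketch in Python. It assumes that the two digit sets in question
are ARABIC-INDIC DIGIT ZERO..NINE (U+0660..U+0669) and EXTENDED
ARABIC-INDIC DIGIT ZERO..NINE (U+06F0..U+06F9). The family names and the
function names are hypothetical labels chosen for this example; they are
not Unicode property values or ISO 15924 codes.

```python
import unicodedata

# Hypothetical family labels (not Unicode property values or ISO 15924 codes).
DIGIT_FAMILIES = {
    "arabic-indic": range(0x0660, 0x066A),           # U+0660..U+0669
    "extended-arabic-indic": range(0x06F0, 0x06FA),  # U+06F0..U+06F9
}


def digit_families(label):
    """Return the set of Arabic digit families that occur in `label`."""
    found = set()
    for ch in label:
        for family, block in DIGIT_FAMILIES.items():
            if ord(ch) in block:
                found.add(family)
    return found


def mixes_digit_families(label):
    """True if `label` uses digits from more than one of the families above."""
    return len(digit_families(label)) > 1


def same_decimal_value(a, b):
    """True if two digit characters carry the same decimal value."""
    return unicodedata.decimal(a) == unicodedata.decimal(b)
```

For instance, mixes_digit_families("\u0664\u06F4") is true, while
same_decimal_value("\u0664", "\u06F4") is also true: both characters are
decimal digits with the value 4, so nothing in the numeric properties
tells them apart, and only a finer classification such as the one
assumed above can flag the mix.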
