That really isn't a script issue; it is more a question of which characters each language's orthography uses, and we already have provision for that information in CLDR (the exemplar character data).
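
A minimal sketch of what that looks like in practice, assuming ICU4J is on the classpath (ICU exposes the CLDR exemplar data here; the three locales are only illustrative):

    import com.ibm.icu.text.UnicodeSet;
    import com.ibm.icu.util.LocaleData;
    import com.ibm.icu.util.ULocale;

    public class ExemplarCheck {
        public static void main(String[] args) {
            // CLDR exemplar characters: the characters each language's
            // orthography actually uses, independent of the Script property.
            for (String tag : new String[] {"ar", "fa", "ur"}) {
                UnicodeSet exemplars =
                    LocaleData.getExemplarSet(new ULocale(tag), 0, LocaleData.ES_STANDARD);
                System.out.println(tag + ": " + exemplars.toPattern(false));
            }
        }
    }

Running it prints one UnicodeSet pattern per locale, which makes the per-language differences within the Arabic script directly visible.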
Mark

*— The best is the enemy of the good —*

On Thu, Jul 29, 2010 at 09:07, Philippe Verdy <[email protected]> wrote:
> "Mark Davis ☕" <[email protected]>
> > It is not so strange. Read
> > http://www.unicode.org/reports/tr24/proposed.html#Multiple_Script_Values
> > and other parts of #24 describing Common.
>
> It is exactly because I had read this proposed update to UTS #24 that
> I used my argument (if not, I would not have mentioned the
> ExtendedScript property in my report: isn't it meant to provide more
> precise mappings to ISO 15924, including script variants?).
>
> Nothing would be special about "Common": "sc=Arabic", alias "sc=Arab",
> could use the same formalism (also used for "Hani" and "Jpan", which
> are defined as multiple scripts or script variants) to subdivide it
> with the new "extended script" property.
>
> It's true that for now Unicode is unable to distinguish between
> "Hans" and "Hant" on just the encoded abstract characters (so for them
> we have only "sc=Hani"), but an "extended script" property could make
> more precise mappings without being completely bound to the stability
> policy.
>
> But that does not mean that texts and localization resources can't make
> such distinctions by external tagging, in stylesheets, or in
> romanization schemes. And librarians (and book readers) already
> distinguish between Eastern and Western versions of the unified Arabic.
>
> It could even be of benefit within IDNA, helping to diagnose the digits
> that have confusable forms in the two variants (even if there is work
> in progress on defining the confusables needed for IDNA), and adding
> the extra ISO 15924 codes (for Arabic variants) won't break Unicode
> (after all, there are already variants for Latin and for Sinograms,
> exactly because of these "font variants").
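
The "extended script" property Philippe mentions is what the proposed UTS #24 update calls Script_Extensions. A minimal sketch of how the sc/scx distinction surfaces through ICU4J's UScript API, assuming an ICU4J recent enough to expose it; U+30FC is just an illustrative code point whose Script is Common but whose Script_Extensions are Hiragana and Katakana:

    import java.util.BitSet;

    import com.ibm.icu.lang.UScript;

    public class ScriptVsExtensions {
        public static void main(String[] args) {
            int c = 0x30FC;  // KATAKANA-HIRAGANA PROLONGED SOUND MARK

            // Script (sc): a single value per code point, here Common.
            System.out.println("sc  = " + UScript.getName(UScript.getScript(c)));

            // Script_Extensions (scx): possibly several values, here Hira + Kana.
            BitSet scx = new BitSet();
            UScript.getScriptExtensions(c, scx);
            for (int s = scx.nextSetBit(0); s >= 0; s = scx.nextSetBit(s + 1)) {
                System.out.println("scx = " + UScript.getName(s));
            }
        }
    }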

