2011/7/25 Peter Constable <peter...@microsoft.com>:
> From: ver...@gmail.com [mailto:ver...@gmail.com] On Behalf Of Philippe Verdy
>
>> What would be the behavior of a font that would use GSUB entries (or
>> ligatures) in a feature to implement the reordering that NO renderer
>> currently implements for Buginese? What will happen later if the
>> renderer does implement it?
>
> Your question is not coherent: OpenType features cannot be used to trigger
> re-ordering.

Hmmm... Your reply is also incoherent:

(1) Many registered OpenType features actually perform contextual
    reordering in Indic scripts, including cases where they are mandatory
    for the script (for example the repha forms of ra, moving ra to a later
    position after another base consonant so that it is shown on the next
    vowel, or other exceptions needed in Khmer, Lao, ...).
(2) These features were even registered by Microsoft.
(3) Some of them are for pre-base reordering; others contain exceptions to
    the usually "mandatory" pre-base order, changing it to a post-base form
    in some other contexts.

>> Does the OpenType specification allow specifying a temporary override
>> for the missing renderer reordering capabilities?
>
> No, and I don't see how that would make any sense: if a rendering system
> supports Buginese script, then it supports it and does the reordering
> necessary. It either supports it or it doesn't.

What I asked is whether it is possible to have another feature, tagged with
the Buginese script, that would be triggered and enabled by default (and
applied before the nukta feature and similar features such as repha forms),
unless the renderer knows that it supports the reordering of prepended
vowels for the Buginese script itself (in which case that feature would be
ignored).

This is what I would call a smooth transition: existing renderers would work
with a font presenting that feature, and future renderers that perform the
necessary reordering would ignore it and would not even require that a
Buginese font contain this feature.

>> Note: The Microsoft Font Validator (found in Microsoft Typography
>> website, section for Downloadable Tools) still does not recognize bit
>> 96 of the ulUnicodeRange field, officially defined for the Buginese
>> block range (U+1A00..U+1A1F), and reports an error if this bit is set.
>
> I'll report that to the team that maintains that tool. Thanks.

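For reference, here is a minimal Python sketch of where bit 96 sits in the
OS/2 table. The 128 range bits are stored as four 32-bit fields
(ulUnicodeRange1 to ulUnicodeRange4), so bit 96 is the lowest bit of
ulUnicodeRange4 and bit 87 (Deseret) is bit 23 of ulUnicodeRange3. The
function name and the sample values are hypothetical, and reading the fields
out of an actual font file is deliberately left out.

```python
# OS/2 ulUnicodeRange bit numbers mentioned in this thread.
DESERET_BIT = 87    # U+10400..U+1044F
BUGINESE_BIT = 96   # U+1A00..U+1A1F


def unicode_range_bit_set(ul_ranges, bit):
    """Return True if `bit` (0..127) is set.

    `ul_ranges` is a sequence of four 32-bit integers, in the order
    ulUnicodeRange1, ulUnicodeRange2, ulUnicodeRange3, ulUnicodeRange4.
    """
    if len(ul_ranges) != 4:
        raise ValueError("expected four 32-bit ulUnicodeRange fields")
    if not 0 <= bit <= 127:
        raise ValueError("bit must be in 0..127")
    field, offset = divmod(bit, 32)  # which field, and which bit inside it
    return bool((ul_ranges[field] >> offset) & 1)


# Hypothetical field values: only the Buginese bit is set.
sample = (0x00000000, 0x00000000, 0x00000000, 0x00000001)
print(unicode_range_bit_set(sample, BUGINESE_BIT))  # True
print(unicode_range_bit_set(sample, DESERET_BIT))   # False
```

A tool that only knows bits 0 to 87 would have to treat any set bit in
ulUnicodeRange4 as unknown, which is consistent with the error described
above.
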
The validator should also correctly parse the "head" table instead of
reporting this (undocumented) internal exception in the validation report
(the localized French message means "The format of the input string is
incorrect", and "à" stands for "at"):

E0041 : An exception occurred preventing completion of table validation
System.FormatException: Le format de la chaîne d'entrée est incorrect.
   à System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
   à System.Number.ParseDouble(String value, NumberStyles options, NumberFormatInfo numfmt)
   à System.Double.Parse(String s, NumberStyles style, NumberFormatInfo info)
   à OTFontFileVal.val_head.Validate(Validator v, OTFontVal fontOwner)
   à OTFontFileVal.OTFontVal.Validate()

>> And the Fonts folder in Windows 7 Explorer does not say that the font
>> effectively supports Buginese (a Buginese font says that it supports no
>> script at all, even if all code points assigned in the Buginese block are
>> mapped, and bit 96 is set in Unicode Ranges of the header).
>
> Two issues:
>
> 1) Windows 7 does not provide text-display support for Buginese script.

OK, so Uniscribe (and IE) does not perform the reordering. It is then
impossible to display correctly encoded Buginese text on Windows with
Uniscribe. Other renderers will be needed (but Pango does not know that
reordering rule either, and none of the browsers I tested on Windows work).
It seems that the script is supported only on MacOS, where there are indeed
commercial Buginese fonts designed for the Mac (for example one font from
Xerox: I have not tested it, as I would need a Mac before even buying that
font).

> 2) The scripts shown in the "Designed for" column in the Fonts control panel
> in Windows 7 do not make use of the UnicodeRanges fields in the OS/2 table.

> There are a few reasons for this:
> - that data is not all that reliable since there's no consistent practice in
> how it is set (there's no metric to decide when a bit should or shouldn't be
> set);
> - the UnicodeRanges fields are not scalable into the future (they were
> exhausted with Unicode 5.1); and
> - the UnicodeRanges fields are typically set based on some sense of "can
> display" whereas what we thought was much more useful to users was to
> indicate "was designed for". For example, MS Gothic _can_ display English
> text, but we think it's not a particularly useful choice for English users
> since that's not the audience it was designed for. The intent is to give
> useful recommendations that help users differentiate relevant options from
> distracting noise.
>
> Rather than using the OS/2 data, the Fonts cpl uses metadata outside the
> font. Unfortunately, it has it only for a certain set of fonts that were
> known when we shipped to be on most systems; so, if you add a Buginese font,
> the metadata will not include that font.

That's strange: many new international fonts have been added after the
release of Windows 7, and the CPL Explorer extension still detects that
those fonts support some scripts. How does it perform the test? By counting
the mapped glyphs? If so, it could easily detect Buginese by checking that
at least 28 glyphs are mapped from code points in the Buginese block.

>> This is the case for all ulUnicodeRange bits defined now after
>> bit number 87, i.e. the Deseret block of the UCS, meaning that
>> the validator and the Windows 7 text renderer and Fonts
>> Explorer are still only based on the (now very old) Unicode 4.1
>> of... 2003 (with the Deseret additions) or even before in 1996
>> with Unicode 3.1 only. Who's late?
>
> Font Validator may be out of date; as mentioned, I'll pass that on to the
> relevant team.
> As for the Fonts control panel, as mentioned it doesn't use
> ulUnicodeRange fields at all; but you have spotted a bug in our metadata:
> Deseret should be listed for the Segoe UI Symbol font.

OK. Is it possible to have the Saweri and Code2000 fonts recognized? These
two free fonts are widely advertised as a possible solution for the Buginese
edition of Wikipedia, but for now this edition mostly uses the Latin script
for that language.

I was asked on Wikipedia to design a test page for the script, but I was
completely unable to do that. All I could do was try to adapt the page
presenting the [[Lontara script]] with:
- a few text samples (but I am not sure that the samples are logically
  encoded: they seem to be visually encoded in some places, and one word is
  most probably inconsistent with its Latin transcription),
- and the Unicode block chart, where the vowel e is actually rendered after
  the base glyph: the chart on English Wikipedia currently uses a dotted
  circle symbol (but there is no guarantee that reordering would occur with
  that symbol in a compliant renderer), whereas the French Wikipedia page
  presents all Buginese diacritics with the Buginese base letter ka
  (U+1A00: this should really work).

This brought me to the question of testing other South-East Asian Brahmic
scripts, like Hanunoo, Buhid, Javanese, or Balinese. It seems that they have
the same rendering problem in a few cases for prepended vowels (plus other
problems remaining in Khmer and Burmese for some contextual forms).

The rendering problem will recur with all other pending Brahmic scripts
(still not encoded) that feature prepended diacritics. Why can't we have a
registered OpenType feature now for handling those mandatory contextual
reorderings (at least for the most frequent cases), pending full support of
the script in text renderers?
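
To make the most frequent case concrete, here is a minimal Python sketch of
the reordering discussed in this thread. It works on code points rather
than glyphs, so it only illustrates the logical-to-visual step a renderer
would perform; it is not an OpenType feature and not a description of any
existing shaping engine. It assumes the simplest context only: the
prepended vowel sign e (U+1A19) immediately follows a Buginese letter
(U+1A00..U+1A16). The function names are hypothetical.

```python
VOWEL_SIGN_E = "\u1A19"  # BUGINESE VOWEL SIGN E, displayed before its base


def is_buginese_letter(ch):
    """True for the Buginese letters KA..HA (U+1A00..U+1A16)."""
    return "\u1A00" <= ch <= "\u1A16"


def logical_to_visual(text):
    """Move each vowel sign e in front of the letter it logically follows.

    Assumption: only the plain <letter, vowel sign e> sequence is handled.
    Any other context is passed through unchanged.
    """
    out = []
    for ch in text:
        if ch == VOWEL_SIGN_E and out and is_buginese_letter(out[-1]):
            out.insert(len(out) - 1, ch)  # place the sign before its base
        else:
            out.append(ch)
    return "".join(out)


# Logical order <ka, e> becomes visual order <e, ka>.
logical = "\u1A00\u1A19"
visual = logical_to_visual(logical)
print([f"U+{ord(c):04X}" for c in visual])  # ['U+1A19', 'U+1A00']
```

A vowel sign e that follows something other than a Buginese letter (a
dotted circle, for example) is left in place by this sketch, which is the
same uncertainty noted above for the English Wikipedia chart.
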