Around 9 o'clock on Jul 5, Brian Stell wrote:

> The OpenType/TrueType spec has been actively in use for a while
> http://www.microsoft.com/typography/otspec/os2.htm#cpr
> and certainly hints at a set of languages. Seeing as many web pages are
> designed on/for Microsoft Windows systems it seems likely that many web
> pages will implicitly use these language sets.

As you know, I'm currently using the OS/2 code page range bits to identify 
font languages.  The problem is that these carry both too little information 
(what, no Samoan?) and too much (Korean Wansung vs. Korean Johab).

In addition, much information on the web is tagged with RFC 3066 language 
tags, in both HTML and email.

RFC 3066 is imperfect, largely because ISO 639-2 isn't a great spec: it 
lists only 400 languages and doesn't really document the ones that it does 
list.  Still, it maps easily to locales and should sufficiently identify 
the languages used in the majority of the text people commonly see.

One important thing to keep in mind -- with the distinction between strong 
and weak family bindings, applications are now in full control of the font 
used when they specify a particular family name.  The font language tags 
are designed to find a suitable font for text in an unexpected language, 
and serve only to pre-select fonts likely to represent the entire language 
in a single font.  Fontconfig will always support selecting fonts by 
Unicode coverage.  The result may look like a ransom note, but fontconfig 
will always locate a font with real coverage for each Unicode codepoint.

> This seems reasonable. If a Japanese user wanted the ASCII
> text in Helvetica and the Japanese text in Mincho how would 
> they specify this?

"It depends."

For a WYSIWYG editor, it would be trivial -- just select 'helvetica' for 
the ASCII portions and 'mincho' for the Japanese portions.

For a document with language tags and an application using the generic 
'sans-serif' alias, it would be easy as well: just configure your 
sans-serif alias to list 'helvetica' before 'mincho'.
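
A minimal sketch of such an alias, assuming the <alias>/<prefer> form 
used in fonts.conf; the family names are simply the ones from the question 
and would need to match the fonts actually installed:

        <alias>
                <family>sans-serif</family>
                <prefer>
                        <family>helvetica</family>
                        <family>mincho</family>
                </prefer>
        </alias>

The order of the <family> entries inside <prefer> is the order in which 
the families are preferred, so helvetica is tried first and mincho is 
tried for whatever helvetica doesn't cover.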

For a document without language tags, and an application using just the 
generic 'sans-serif' alias, I can't think of a way to make it work except 
by hacking your locale (en-ja anyone?).

To make this work, you'd have to build some data structure that maps 
certain Unicode ranges to specific fonts, and then order those fonts.

One difficulty is that such a list would have to be language-specific -- 
you can't have a global list that includes both Chinese fonts and Japanese 
fonts, as a font covering GB 2312 or Big5 will also contain a large number 
of Japanese glyphs.

Hmm.  One possibility is to create "strong aliases" in the font 
configuration -- allow the user to do:

        <match>
                <test name="family">
                        <string>mincho</string>
                </test>
                <edit name="family" binding="strong" mode="insert">
                        <string>helvetica</string>
                </edit>
        </match>

Now the return value from FcFontSort will place Helvetica before Mincho.  
However, simplistic applications that can only deal with a single font 
will be lost here -- they'll get Helvetica and be unable to display 
Japanese at all.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab


_______________________________________________
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts