Re: [Fonts] Combining characters

2003-09-06 Thread Jungshik Shin
On Sat, 6 Sep 2003, Anuradha Ratnaweera wrote:

 Let me put this in a simple point form using a hypothetical example:

 Now, if I want to render character 51 of X in place of the composite
 character 4001+4010, how should I proceed?  Is there a way to map
 Unicode sequences to actual (physical) fonts?  Preferably in the form:

 4001,4010 - X,51

  Your problem is not new and has been worked on for many years
by a number of people and today we have a few satisfactory solutions.
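
  To make the idea concrete, here is a minimal sketch in C of the kind of
sequence-to-glyph rule you describe. It is only an illustration: the names
(`SeqRule`, `map_sequence`) and the hard-coded table are hypothetical, the
numbers are taken literally from your example, and this is not the API of
any real shaping library, which typically reads such rules from tables
supplied by the font itself.

```c
#include <stddef.h>

/* Hypothetical rule: a run of code points maps to one glyph index. */
typedef struct {
    const unsigned int *seq;   /* code point sequence to match */
    size_t              len;   /* number of code points in seq */
    unsigned int        glyph; /* glyph index in the (hypothetical) font X */
} SeqRule;

static const unsigned int seq_4001_4010[] = { 4001, 4010 };

/* Toy table holding the example from the question: 4001,4010 -> glyph 51. */
static const SeqRule rules[] = {
    { seq_4001_4010, 2, 51 },
};

/*
 * Try to match a rule starting at text[pos] (pos must be <= n).
 * On success, store the glyph in *glyph and return the number of
 * code points consumed; return 0 if no rule matches.
 */
size_t
map_sequence (const unsigned int *text, size_t n, size_t pos,
              unsigned int *glyph)
{
    size_t r, k;

    for (r = 0; r < sizeof rules / sizeof rules[0]; r++)
    {
        if (rules[r].len > n - pos)
            continue;
        for (k = 0; k < rules[r].len; k++)
            if (text[pos + k] != rules[r].seq[k])
                break;
        if (k == rules[r].len)
        {
            *glyph = rules[r].glyph;
            return rules[r].len;
        }
    }
    return 0;
}
```

A renderer would call `map_sequence` at each position and fall back to
ordinary one-character-to-one-glyph lookup whenever it returns 0.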

  You cross-posted to a few lists I'm subscribed to. Although
I've already answered you on the gtk-i18n list, I'm gonna
give you some URLs here as well:

  http://www.microsoft.com/typography/specs/default.htm
  http://www.pango.org (and the source code of Pango
  available at http://cvs.gnome.org. Take a look at
  files in pango/modules/indic and pango/modules/thai. You can
  also take a look at the ICU source code)
  http://graphite.sil.org
  http://developer.apple.com/fonts/

  Jungshik
___
Fonts mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/fonts


[Fonts] Re: After-XTT's extension of the encoding field.

2003-08-14 Thread Jungshik Shin
On Thu, 7 Aug 2003, Mike FABIAN wrote:

 Jungshik Shin [EMAIL PROTECTED] :

  On Sat, 2 Aug 2003, Chisato Yamauchi wrote:
 
Have you seen CJK's *TYPICAL* fonts.dir of TrueType fonts?
  It is following:
 
   Not many people would be fond of tweaking fonts.dir/scale files
  these days :-)

 It can be automatically generated.  The /usr/sbin/fonts-config script
 on SuSE Linux generates such TTCap entries automatically into the
 fonts.dir if it detects that xtt is enabled in /etc/X11/XF86Config.

 That sounds nice.  It'll certainly make things easier. However, it could
make some people frustrated if it just overwrites the existing fonts.dir
(I don't know whether fonts-config on SuSE Linux does that or not)
that was 'hand-tweaked' to their satisfaction. In the past, I made it
a rule to back up fonts.dir/fonts.scale after losing heavily customized
fonts.dir/fonts.scale to an automated tool a couple of times.

 I agree that the old X fonts are broken beyond repair and we should
 move on to use fontconfig/Xft as much as possible.

 The old font system must be kept for backwards compatibility of course
 but it is probably just a waste of effort to add more extensions to the
 X11 core font system.

  Much better said than I managed. This is exactly what I meant, but apparently
my choice of words was not that good. If I had thought that support
for X11 core fonts needed to be removed _now_, I wouldn't have spent my time
on the gb18030.2000-1 issue (XFree86 bug 441), let alone fixing bugs in the CJK
font encoding files for the freetype module last year.

  Jungshik



Re: [Fonts] Re: Problem of Xft2

2003-08-14 Thread Jungshik Shin
On Sat, 9 Aug 2003, Jungshik Shin wrote:
 On Fri, 8 Aug 2003, Pablo Saratxaga wrote:

  That being said, it would be nice to have the ability to do
  user-configuration
  of glyph substitutions in gtk2; eg telling that when a given font  is
  chosen, then characters of range 0x00-0xff should be ignored, and taken
  from font  instead. The ascii range of some CJK fonts is simply
  too ugly... or even bugged in some cases.

   That doesn't need to be that complex. Simply allowing CSS-style
 fontlist is more than enough. That is, offering a UI for specifying an
 _ordered_ list of fonts (instead of just one font, generic or specific)
 should work well. That is, by putting a good Latin(-only) font, a
 Cyrillic(-only) font, and a Greek(-only) font before a CJK font followed
 by a generic font (e.g. Serif), you can get the best of all fonts.
 This UI needs to be a part of the system-wide 'control panel'.

  I have to correct myself. This does not work well when font selection
is done in tandem with 'lang' ('lang' given a very large weight) and
_without_ actually going through a run of text to render, which is often
the case.

What you described may be necessary in the following scenario.  Suppose we
specify 'Courier, MingLiu' for a block of text marked as 'zh-TW'. Because
Latin letters in CJK fonts are not so good, we specify 'Courier' before
'MingLiu' expecting Latin letters to be rendered by Courier and Chinese
characters to be rendered by MingLiu[1]. If the font selection is made
solely based on the font list (ordered) and lang. (with 'lang' given a
large weight), only 'MingLiu' would be selected because 'zh-TW' is not
covered by Courier. As a result, all characters end up being rendered by
MingLiu.  Char-by-char font selection doesn't have this problem. However,
it's likely to be slower.  Going through a run of text before choosing
a font/a set of fonts may work better but it may be even slower. Staying
in a single font as long as possible is another possibility.

  Of course, if 'lang' is not taken into account and just the ordered
list of fonts is used in glyph/font search, we'd not have the above
problem. On the other hand, unless the font list is carefully selected,
one may get ransom-note style rendering in some cases.
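
  As a rough illustration of the char-by-char selection described above
(ordered font list only, 'lang' ignored), here is a minimal sketch in C.
The `ToyFont` type and the single coverage range per font are
simplifications I made up for the example; real code would consult the
font's actual character coverage.

```c
#include <stddef.h>

/* Hypothetical font description: a name plus one covered code point range. */
typedef struct {
    const char   *name;
    unsigned int  first;  /* first covered code point */
    unsigned int  last;   /* last covered code point */
} ToyFont;

/*
 * Return the index of the first font in the ordered list that covers ch,
 * or -1 if none of them does.
 */
int
pick_font (const ToyFont *fonts, size_t nfonts, unsigned int ch)
{
    size_t i;

    for (i = 0; i < nfonts; i++)
        if (ch >= fonts[i].first && ch <= fonts[i].last)
            return (int) i;
    return -1;
}
```

With a list such as { Courier, MingLiu }, a Latin letter resolves to the
first entry and a Chinese character falls through to the second, which is
the behaviour wanted in the 'zh-TW' scenario. The cost is one lookup per
character, which is why it is likely to be slower.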

Jungshik

[1] In some case, exactly the opposite is desired under the premise that
glyphs of Latin letters in a CJK font are designed to match well with
CJK characters in the font. This works well just as it is now.



Re: [Fonts] Re: After-XTT's extension of the encoding field.

2003-08-14 Thread Jungshik Shin
On Fri, 8 Aug 2003, Mike FABIAN wrote:
 Jungshik Shin [EMAIL PROTECTED] :
  On Thu, 7 Aug 2003, Mike FABIAN wrote:

  It can be automatically generated.  The /usr/sbin/fonts-config script
  on SuSE Linux generates such TTCap entries automatically into the
 
  make some people frustrated if it just overwrites the existing fonts.dir
  (I don't know whether fonts-config on SuSE Linux does that or not)

 Yes, it does.

details on how Mike's font-config script works... snipped

  Thank you for the details. It seems that you gave
a lot of thought to the script and that it meets my needs (if I have to
tweak fonts.scale/dir files ever again :-))

 Jungshik


[Fonts] Re: Problem of Xft2

2003-08-14 Thread Jungshik Shin
On Fri, 8 Aug 2003, Pablo Saratxaga wrote:
 On Fri, Aug 08, 2003 at 06:59:43PM +0900, Chisato Yamauchi wrote:

But Gtk2 does not have a complete font-substitution mechanism.
  Therefore, Gtk2 is insufficient in a CJK environment.

 Gtk2, using pango, has a builtin fontset mechanism.
 (it is always enabled, and automatically built, depending on language
 and language coverage of available fonts).

  Certainly this is true as long as you use Pango, but not all
Gtk2 applications use Pango.  Moreover, the font selection widget in Gtk2
does not have the UI to let users specify multiple fonts (CSS-like).
Apparently, Qt has this UI according to Yamauchi-san.

  So I *NEVER* use Gtk2-mozilla.  It has no flexibility of a
  font setting.

 Mozilla doesn't use Gtk2/pango text rendering mechanisms to render
 html pages.
 So, you cannot judge the font abilities of Gtk2 toolkit with mozilla.

  Well, when rendering html/xml pages, Mozilla has its own 'fontset/font
substitution' mechanism of a sort that is very similar to what you wrote
above about Pango. In the Xft build, it is based on fontconfig. The X11core
build is very complicated, partly because it has to support the CSS-style
font list on its own, without any help from fontconfig, wading through
'the jungle' of XLFD-based font names. Otherwise, how could it
support the CSS-style font list?


 Gtk may choose automatically a font that looks funny, but at least a character
 is always displayed in a readable way, I prefer it that way.

  I guess just saying Gtk(2) is a bit misleading. Gnome-terminal
is a Gtk(2) application, but by default it doesn't use Pango and
it does not do 'automatic font substitution' as you described. Set
Gnome-terminal font to 'Courier' and see how CJK characters (or any character
not covered by Courier) are rendered. They all come in empty boxes.


 That being said, it would be nice to have the ability to do user-configuration
 of glyph substitutions in gtk2; eg telling that when a given font  is
 chosen, then characters of range 0x00-0xff should be ignored, and taken
 from font  instead. The ascii range of some CJK fonts is simply
 too ugly... or even bugged in some cases.

  That doesn't need to be that complex. Simply allowing CSS-style
fontlist is more than enough. That is, offering a UI for specifying an
_ordered_ list of fonts (instead of just one font, generic or specific)
should work well. That is, by putting a good Latin(-only) font, a
Cyrillic(-only) font, and a Greek(-only) font before a CJK font followed
by a generic font (e.g. Serif), you can get the best of all fonts.
This UI needs to be a part of the system-wide 'control panel'.

Falling short of that, applications like Gnome-terminal (the same is true
of Konsole) should at least offer a way to specify an East Asian
(double/full-width) font separately, as is done by xterm, vim, OpenOffice and
MS Office. Because Gnome-terminal and Konsole don't have this feature,
I still prefer to work in xterm, for which I can specify my favorite
font for single-width characters along with my favorite font for
double-width characters (with the '-fw' option; I'm gonna add a '-faw' option
to xterm).

Jungshik


[Fonts] Re: Terminal versus X11 fonts

2003-08-14 Thread Jungshik Shin
On Thu, 7 Aug 2003, Steve Sullivan wrote:

 For example, the Terminal edit current profile gui shows
 the Miriam font, but Miriam isn't listed by xfontsel or xlsfonts.

  There are two separate font systems, the X11 core font system and
the client-side system with Xft/fontconfig.  What you get with
xlsfonts/xfontsel is X11 core fonts.  'Terminal' in RedHat 9 uses the
client-side font system (Xft/fontconfig based).

You can make Miriam and other fonts available as X11 core fonts
with freetype/Xtt/type1 backends if they're of a type supported by them.

http://www.xfree86.org/4.3.0/fonts.html has all the gory details  about
XF86 font systems.  For (After) X-TT, see http://x-tt.sourceforge.jp/

  A lot of people believe that the client-side font system is the way to go
(although the core font system will be around for a long time to come), so
you may want to consider writing your application with the client-side font
system (especially if I18N - internationalization - is important
to your projects/programs). You may also want to take a look at
http://fontconfig.org and http://www.pango.org

  Jungshik


Re: [Fonts] After-XTT's extension of the encoding field.

2003-08-02 Thread Jungshik Shin
On Sat, 2 Aug 2003, Chisato Yamauchi wrote:

   Although the pliability of handling such special fonts is also important,
 non BMP plane in XLFD is now the most important problem.  Confusion is
 already seen such as linux-utf8 list.  An official definition should be
 indicated right now.  Why has XFree86 left this?

  That's because XFree86 is moving away from the 15-year-old XLFD-based
approach. As Owen wrote, we'd better let that poor thing rest in peace
and move along. With fontconfig/Xft, we don't need to worry about XLFD
any more except for the sake of backward compatibility.  For non-BMP
characters, there isn't much backward compatibility to worry about.

If you take a look at Mozilla's gfx/src/gtk/nsFontMetricsGTK.cpp
and gfx/src/gtk/nsFontMetricsXft.cpp
(or gfx/src/windows/nsFontMetricsWin.cpp) at
http://lxr.mozilla.org/seamonkey, you'll know what I mean.
Mozilla developers have put a tremendous amount of 'heroic' effort into
making CSS-style font selection work with XLFD-based font names. However,
the much simpler and shorter fontconfig-based code (in
nsFontMetricsXft.cpp) works better than nsFontMetricsGTK.cpp (for XLFD-based
font names).

Adding yet another field to make XLFD more complex doesn't help a bit
in this respect. Besides, in your example (GT fonts), I don't see why
you need to extend XLFD. Couldn't you just use different numbers in
the last field of XLFD?

gt21.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-0.1
gt22.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-0.2
gt23.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-0.3

Instead of the above, the following should work as well, shouldn't it?
Am I missing something?

gt21.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-1
gt22.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-2
gt23.ttf -gt-mincho-medium-r-normal--0-0-0-0-c-0-gt.2000-3
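
To show what I mean by 'the last field': the charset part of an XLFD name
is simply its last two '-'-separated fields (CHARSET_REGISTRY and
CHARSET_ENCODING), so a client can tell gt.2000-1 from gt.2000-2 without any
new field. A minimal sketch in C follows; `xlfd_charset` is a hypothetical
helper written for this mail, not an Xlib function, and it assumes neither
of the two fields contains a '-' itself.

```c
#include <string.h>

/*
 * Copy the CHARSET_REGISTRY-CHARSET_ENCODING part (the last two
 * '-'-separated fields) of an XLFD name into out.
 * Returns 0 on success, -1 if the name has fewer than two '-'
 * characters or the result does not fit into out.
 */
int
xlfd_charset (const char *xlfd, char *out, size_t outlen)
{
    const char *p = xlfd + strlen (xlfd);
    int         dashes = 0;

    /* Walk backwards until we reach the second '-' from the end. */
    while (p > xlfd)
    {
        p--;
        if (*p == '-' && ++dashes == 2)
            break;
    }
    if (dashes < 2 || strlen (p + 1) >= outlen)
        return -1;
    strcpy (out, p + 1);
    return 0;
}
```

For the second set of names above this yields "gt.2000-1", "gt.2000-2" and
"gt.2000-3" respectively.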


   Why do we persist in X-TT?  The reason is that libfreetype.a
 is not useful at all for CJK.  Especially the following two points are fatal.

  Well, X-TT's 'competitor' is not the freetype module, but fontconfig
(+FT2 + Xft)

   - Handling a proportional multi-bytes fonts is too slow.
 (The loading speed of libfreetype.a is 20 times slower than
  that of X-TT 1.4; I show a benchmark in next email.)

  For the 'with TTCap option' case, the option has been set to
  fc=0x3400-0xe7ff:fm=0x5a00.  This particular option setting
  indicates that xtt handles the glyphs that are within the CJK
  region (in unicode) with constant spacing, whose metrics are
  similar to that of 0x5a00.

  This is a nifty idea that could be utilized in Freetype2 and/or
fontconfig, but it seems to me that such a large performance difference
between the two cases is yet another indication that
X11 core fonts have to go away.
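
  For what it's worth, the idea behind fc=0x3400-0xe7ff:fm=0x5a00 can be
sketched in a few lines of C. This is only my reading of the option as
quoted above, not X-TT's actual code: `load_advance` is a made-up stand-in
for the expensive per-glyph metrics computation, and the widths it returns
are toy values.

```c
/* Counts how often the (hypothetical) expensive metrics load runs. */
int load_count = 0;

/* Hypothetical stand-in for an expensive per-glyph metrics load. */
static int
load_advance (unsigned int ch)
{
    load_count++;
    /* Toy metrics: glyphs in the CJK range are 16 units wide, others 8. */
    return (ch >= 0x3400 && ch <= 0xe7ff) ? 16 : 8;
}

/*
 * Advance width lookup in the spirit of fc=0x3400-0xe7ff:fm=0x5a00:
 * every code point inside the range shares the metrics of U+5A00,
 * which are loaded once and then cached.
 */
int
get_advance (unsigned int ch)
{
    static int cached = -1;

    if (ch >= 0x3400 && ch <= 0xe7ff)
    {
        if (cached < 0)
            cached = load_advance (0x5a00);
        return cached;
    }
    return load_advance (ch);
}
```

However many CJK characters are measured, the expensive path runs only once
for the whole range, which is presumably where the speed-up comes from.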


   - The modification of a font(such as auto italic and double striking, etc.)
 cannot be used at all.

   That is, libfreetype.a should also have all options of TTCap.


  Yeah, TTCap is useful, but it appears that we're trying to solve the
wrong problem, turning away from the real issue. The real problem is
that we don't have quality CJK fonts in multiple styles.
Anyway, fontconfig offers 'artificial slanting', although it doesn't make
much sense to have 'italic' or 'slant' typefaces for CJK.

As for 'artificial bold',  there's a patch somewhere, but hasn't been
accepted because Freetype2 reportedly will come up with a better
solution for 'artificial bold'.

   Have you seen CJK's *TYPICAL* fonts.dir of TrueType fonts?
 It is following:

 Not many people would be fond of tweaking fonts.dir/scale files
these days :-) Why would they, when just dropping TrueType fonts
in one of the directories listed in the fontconfig font search path
works like a charm?


  Jungshik

P.S. If merging X-TT and the freetype module is not gonna happen soon,
it would be nice if X-TT made use of the fontenc library used by the
freetype module. With the fontenc library, the freetype module doesn't have
to hardcode font-encoding-to-Unicode mapping tables. Because font encodings
are not hard-coded, it's easy to add a new encoding, although these days we
don't care much. Moreover, it'll cut down the size of X-TT significantly.


[Fonts] Re: two different gb18030.2000-1 : Sun/Mozilla/Java vs RH

2003-07-09 Thread Jungshik Shin
On Wed, 9 Jul 2003, Yu Shao wrote:

 Jungshik Shin wrote:

 On Tue, 8 Jul 2003, Yu Shao wrote:

 I don't get you here, the first version of the patch was made for Red
 Hat 7.3, at that time we have to use Mozilla with X core font. Since
 then the patch has been there almost unchanged.

 GB18030.2000* aliases were added purely because we want Mozilla working

(you made gb18030.2000-0 an alias to gbk-0, but you also made a new identity
mapping for gb18030.2000-1. They're different/separate issues that cannot
be aggregated with '*'.)

 As I wrote,  Mozilla's GB18030Font1 is NOT your gb18030.2000-1.enc BUT
Sun's gb18030.2000-1 (and what's proposed by James Su and Roland Mainz).
There's NO dispute about gb18030.2000-0. The question is about
gb18030.2000-1 (not '0'). With this difference, how could you make Mozilla
(non-Xft build) work with your gb18030.2000-1? Probably, it gave you
an impression that it worked either because you also had iso10646-1
fonts  or because you haven't checked BMP characters _outside_
the repertoire of gb18030.2000-0 with Mozilla.

 About the identical mapping in RedHat's GB18030.2000-1, it is because
 the inside compound encoding part is treating them as ISO10646 codes.
 
   This is a bit confusing.  How am I supposed to interpret this together
 with the first sentence in your reply? Do you need RH8's
 version of gb18030.2000-1.enc or not?

  This question of mine is about Compound text encoding. You began your
reply with  the following.

 Because RedHat XFree86 18030 patch's compound text encoding part was
 based on James Su's patch which was derived from UTF-8 code, it doesn't
 really need GB18030.2000-0.enc and GB18030.2000-1.enc to function.

   and then ended it with 'About the identical mapping
 the inside compound ... is treating them as ..'. To me, the two statements
appear to contradict each other.

   How would you propose the conflict between RH's gb18030.2000-1.enc and
 Solaris/Mozilla/Java's gb18030.2000-1 be solved?  Could you add your
 comment to http://bugs.xfree86.org//cgi-bin/bugzilla/show_bug.cgi?id=441 ?

 What GB18030 compound encoding code has XFree86 decided to use? right
 now, there is even no GB18030 X locale definition in CVS,
 there is no
 conflict, just totally depends on how to approach the compound text
 encoding part.

Let me make it clear. The conflict is not inside XFree86 but
between RH8's gb18030.2000-1 on the one hand and Sun's and Mozilla's
gb18030.2000-1 (and what James Su and Roland Mainz proposed) on
the other hand.  It's regrettable that your patch hasn't been discussed
in open forums like fonts/i18n list of XFree86 IIRC (my memory sometimes
doesn't serve me well so that I may have missed it).

  Do you care which of the two gb18030.2000-1's is included
in XFree86? If you don't, would you be willing to replace
RH's gb18030.2000-1.enc with one based on Sun's/Mozilla's/Java's
(as suggested by James Su in
http://www.mail-archive.com/fonts%40xfree86.org/msg01343.html
or by Roland in http://bugs.xfree86.org/cgi-bin/bugzilla/show_bug.cgi?id=441)?

  Independent of compound text encoding, it's bad that gb18030.2000-1
has two different meanings. That's what I want to resolve here.
If you agree to go with Sun's version, we won't have to worry about
having to figure out which is which (although x11 core fonts
become less and less important...)


  Jungshik



[Fonts] re: a new font encoding file for XF86 : gb18030.2000-2? (fwd)

2003-07-05 Thread Jungshik Shin
Hi,

I sent the following to James Su to seek his opinion, but it bounced. Now
I'm sending it to the i18n and fonts lists, hoping that he or other Chinese
experts will pick this up.


Jungshik


Hi,

Could you make a comment on
http://bugs.xfree86.org//cgi-bin/bugzilla/show_bug.cgi?id=441?

It's about adding a new font encoding file to XF86 for BMP characters
NOT covered by gbk-0/gb18030.2000-0.enc and gb18030.2000-1.enc, which you
proposed and which were accepted. I don't think it's necessary, but your
expert opinion would be great to have. I tried to add you to the CC list in
bugzilla, but you're not registered there, so I'm writing this instead.

Thank you,

Jungshik





[Fonts] Re: [Fonts]Xft patch for halfwidth glyphs in monospace CJK fonts

2003-02-20 Thread Jungshik Shin
On Tue, 10 Dec 2002, Anthony Fok wrote:

Thank you for bringing up this important issue.


 I was assigned with the task of dealing with s p a c e d - o u t  CJK
 fixedPitch font issue in konsole.

 In addition to Konsole, gnome-terminal, Mozilla-xft (for rendering
text/plain or a portion of html documents with font style set to monospace),
vim-gtk2 and a lot of other programs that require 'fixed-width' fonts
have the same problem with CJK 'fixed-width' (actually 'bi-width') fonts.
I was about to look into Xft/Pango to see if I could solve this problem
there, because fixing it on the application side seems inefficient, but
I 'googled' it and found this message, which was sitting in my mailbox
unread. The follow-up to yours is missing from my mailbox (because I had
a network outage for a few days), but I found it in the archive.

It seems like the patch mentioned by Ken is more
ambitious than Anthony's (http://www.kde.gr.jp/~akito/patch/fcpackage/2_1/)
and it is probably harder to put that into upcoming 4.3.0 release.
Therefore, I'm wondering what Keith thinks of adding Anthony's
or similar patch to Xft. CJK fixed-width font issue is serious
for CJK users and it'd be very nice to take care of it before
the release of 4.3.0.

 TrueType fonts with the fixedPitch flag set to true to mean that:

* All CJK glyphs have the same fullwidth
* The ASCII glyphs and other special glyphs have the same halfwidth

 I have submitted a small patch to the FreeType mailing list to deal with the
 halfwidth monospace font issue,

  Has it been committed?

 and it turns out that Xft has the same
 issue.  It took me a while to figure out that it was not konsole or Qt.  :-)


 Any idea on how to deal with the 15 / 2 = 7;  7 + 7 = 14 issue?  :-)

How about rounding up to the nearest even number before dividing it
by 2?
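
In code, the suggestion amounts to something like the following sketch in
C (the function names are made up for illustration, and widths are assumed
to be non-negative pixel counts):

```c
/*
 * Round the full-width advance up to the nearest even number so that
 * it can be split into two equal half-width cells.
 */
int
even_full_width (int width)
{
    return (width + 1) & ~1;
}

/* Half-width advance derived from the rounded full-width advance. */
int
half_width (int width)
{
    return even_full_width (width) / 2;
}
```

With a full-width advance of 15, this gives 16 for full-width glyphs and
8 for half-width glyphs, so 8 + 8 = 16 and the columns line up again.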


  Jungshik




Re: [Fonts]Re: Xprint

2002-12-10 Thread Jungshik Shin

On 10 Dec 2002, Juliusz Chroboczek wrote:

JS   Even with this weakness, Xprint is by far the best printing
JS solution available at the moment for Mozilla under Unix/X11
JS because postscript printing module of Mozilla does not work very
JS well yet

JC Xprint might work for CJK fonts,

  It does work for CJK now. Especially version 0.8 of Xprint with
truetype font support works pretty well. Even the PS output
produced by 0.7 with X11 bitmap fonts doesn't look that bad.

JC although I'm a little bit surprised at your enthusiasm for the thing.

  I'm not as enthusiastic about it as you may think. A better
word to characterize what I think about it is
ambivalence.  See my postings to the mozilla-i18n newsgroup
news://news.mozilla.org/netscape.public.mozilla.i18n. When I wrote
'by far the best', I meant that _as of now_ it gives the best match between
the printout and the screen rendering. For CJK web pages, the Mozilla PS
module can't do that because only *one* PS font can be specified for each
language. That is, on the screen, Mozilla (especially Mozilla-Xft) can
be a good implementation of CSS, but on the printout, it cannot.
Xprint is not perfect, but it's better than printing out everything (CJK
and non-Western European) in a single font (specified in a pref. file
which has to be hand-edited by end-users). Besides, complex scripts cannot
be printed out at all by Mozilla under Unix without Xprint. With Xprint,
it's possible to print out web pages in complex scripts provided that you
can render them on the screen with Mozilla-X11core. That's a big difference.

JC There is no way, though, how Xprint
JC could work for complex scripts without standardising on glyph
JC mappings.

  As I understand it, Xprint is a specialized form of X11 server
combined with some X clients. Therefore, I think it has all sorts of
weaknesses found in the server-side font model we have been moving away from.
It's neither fast nor efficient (compared with client-side font technology)
and it doesn't support 'modern' CSS-based font selection/resolution at
the same level as provided by fontconfig. Nonetheless, it works _now_.

  As for complex script rendering, it's possible to print them out
as I wrote above and my test with Old Korean showed. (see
 http://bugzilla.mozilla.org/show_bug.cgi?id=176315). Standardizing
on glyph mapping is not a requirement if we just deal with a single
application program(e.g. Mozilla). Mozilla-X11 has a way to map the last
two fields of XLFD to a  mapping between a string of Unicode characters
and a sequence of glyphs. That's what Mozilla-X11 uses to render Indic
scripts, Thai and Hangul Conjoining Jamos. (Mozilla doesn't yet support
opentype fonts at least under X11. Some Pango code was borrowed but
that's not from pango-xft but from pango-x). Because Xprint module of
Mozilla shares many things with Mozilla-X11corefont/Mozilla-Gtk, without
doing anything, Xprint just works when it comes to printing out web pages
in Indic scripts, Thai and Old Korean.

  Of course, I'm well aware that we have to use opentype fonts with
gsub/gpos tables for complex script rendering.  However, we also need a
short-term solution that works now.  For instance, there is not a single
opentype font freely available for old Korean. The situation is much
worse than that for Indic scripts for which free opentype fonts began
to emerge. In the meantime, we have to resort to font-specific-encoding
hacks.

JC There is also no way[1] how Xprint could implement
JC dynamically generated fonts, as required for example by CSS2.

 I'm a bit confused as to what you meant by 'dynamically generated
fonts'. Did you mean 'web fonts'?  Can you tell me what you meant?

JC The right approach is obviously to do incremental uploading of fonts
JC to the printer at the PS level, as the Mozilla folks are trying to do.

  I totally agree with you provided that the font resolution mechanism
is tied with fontconfig.

JC I'm a little bit suspicious about their choice to use Type 42 CIDFonts

  Given that truetype fonts are much easier to come by than genuine
CID-keyed fonts for CJK (which is also true of truetype fonts vs PS
type 1 fonts for European scripts although to a lesser degree), I guess
the choice is all but inevitable(perhaps OpenOffice also adopted this
approach). Do you have a better idea?  Judging from your reservation about
the rasterization on the host side, what you're thinking of cannot be
converting all the glyphs into bitmap and putting them in the PS output.
Anyway, I believe this 'mini-project' for Mozilla printing has to be 'glued'
to fontconfig for CSS2 font resolution so that the screen rendering
and PS output use the same set of fonts.

What I can think of as an alternative to embedding type 42 PS font(type
2 CIDFont) is just to refer to CID-keyed fonts/type 1 fonts in the
PS output and let a real PS printer or ghostscript do the rest of the
job. This is similar to what the present PS module for Mozilla does.
However, in order to get a faithful 

[Fonts]Re: Xprint

2002-12-09 Thread Jungshik Shin


On Mon, 9 Dec 2002, Michael B. Allen wrote:

 Roland  Mainz  has  released  a  new  version  of  Xprint and appears to be
 actively  working  on  another.  The mozilla website has some nifty looking
 internationalized  screenshots  displaying Turkish, Chinese, etc. I've been
 using  an Xprint CUPS setup for sometime now with great success.


   http://xprint.mozdev.org/screenshots.html

 Yeah, Xprint works great (it can even be used to print out old Korean
pages with U+1100 Hangul Jamos). It solved a long-standing problem in
X11 (well, some commercial Unixes have solutions for this): the enormous
gap between what you see on the screen and what you get on paper
(especially for non-European scripts).  Because Xprint is an X server
specialized for printing and shares many things with the main X server
for screen rendering, what you see on the screen is faithfully replicated
in what you print out with Xprint as long as the two X servers (the one
for the screen and Xprint) have access to a common set of fonts.

  However, the fact that Xprint is a specialized form of X *server* is
also a weakness. You may know that the whole Linux community (and FreeBSD
and other Unixes that rely on XFree86) is moving away from server-side
fonts and toward client-side font technology (fontconfig and Xft; see
http://fontconfig.org). With fontconfig and Xft, Unix/X11 finally got
on par with Windows and MacOS in terms of font support. Arguably,
this is the greatest development in X11 in the last 10 years.
Mozilla-Xft is finally able to support CSS at the same level as
Mozilla-Win and Mozilla-MacOS (no more need to tinker with XLFD
and things like that).

  The problem with server-side fonts becomes very obvious when you search
for some Japanese (Chinese, Korean) words in Google (they don't have to
be CJK; that's just to make sure that you get a truly multilingual page
in UTF-8 that requires multiple fonts to render) and see Mozilla-X11core
struggle to render the page (sometimes it can take almost 10 seconds
on my PIII 750MHz with 384MB). (Or, open up the font
selection dialog box in Mozilla-X11core and compare that with the font
selection dialog box in Mozilla-Xft/Mozilla-Windows/Mozilla-MacOS.)
You can repeat the experiment with Mozilla-Xft. Mozilla-Xft renders the
page instantaneously.  Also try to print the page with Xprint. Mozilla
doesn't respond for as long as 30 seconds (depending on the complexity
and the length of the page) until Xprint is done searching for fonts to
'render' the page.

  Even with this weakness, Xprint is by far the best printing solution
available at the moment for Mozilla under Unix/X11 because the postscript
printing module of Mozilla does not work very well yet (it works but
is far behind what you can get with Mozilla-Windows and Mozilla-MacOS,
where the OS-level printing infrastructure is far superior to that
under Unix/X11. Well, on some commercial Unix, it may be better.)
It would be even greater if it were possible to combine Xprint somehow
with fontconfig (although that's not likely). Better still would be to write
something like XftPrint (or XftPS), which would do for printing what Xft does
for screen rendering. There's an on-going project in Mozilla to directly use
Freetype2 and embed type42 truetype fonts in PS output.  This might be
where fontconfig can come in to better support CSS in Mozilla printouts,
as is done on the screen by fontconfig+Xft in Mozilla-Xft.

 I hope the Linux  distros  jump  on  the bandwagon and start shipping
 it along with an
 Xprint enabled Mozilla (Red Hat's mozilla RPMs do not have Xprint enabled).

  I'm not sure  why RH disabled Xprint in their Mozilla RPM.
Xft, Xprint and PS printing module can coexist in Mozilla without
much problem as far as I can tell. Perhaps, that blocking I mentioned
above may not be acceptable?

   Jungshik Shin

P.S. I'm CCing the fonts list of XF86.





Re: [Fonts]Xft/fontconfig and Non-BMP characters

2002-12-03 Thread Jungshik Shin
On Sun, 1 Dec 2002, Jungshik Shin wrote:

 While trying to make Mozilla-Xft support non-BMP characters with fonts
 like CODE2001.TTF (with pid=3/eid=10 Cmap), I found that freetype
 and Xft need a little change. Details are sent to linux-utf8 list
 (http://mail.nl.linux.org/linux-utf8/2002-12/msg0.html) and Bugzilla

  Extending XftTextExtents16() to support UTF-16 is similar to

  Attached is my patch (a bit revised) to extend XftTextExtents16 to
support UTF-16 and to fix a typo in fcstr.c of fontconfig (which makes the
conversion from UTF-16 to UCS-4 not work correctly for characters in
even-numbered planes, with the 17th bit in UTF-32 unset).
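
  To illustrate why the typo only bites in even-numbered planes, here is
a small standalone sketch in C. It is not the fontconfig source; the two
function names are made up, and they differ only in whether 0x10000 is
OR-ed in or added.

```c
/* Buggy combination: ORs in 0x10000 instead of adding it. */
unsigned long
surrogate_or (unsigned int hi, unsigned int lo)
{
    return ((((unsigned long) hi & 0x3ff) << 10) |
            ((unsigned long) lo & 0x3ff)) | 0x10000UL;
}

/* Correct combination, following the UTF-16 definition. */
unsigned long
surrogate_add (unsigned int hi, unsigned int lo)
{
    return ((((unsigned long) hi & 0x3ff) << 10) |
            ((unsigned long) lo & 0x3ff)) + 0x10000UL;
}
```

For the pair D800 DC00 (U+10000, plane 1) both functions return 0x10000.
For D840 DC00 (U+20000, plane 2) the shifted high part is already 0x10000,
so OR-ing changes nothing and `surrogate_or` returns 0x10000, whereas
`surrogate_add` returns the expected 0x20000.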

  Keith, would you take a look?

  I also sent it to [EMAIL PROTECTED] and got a patch seq. #5522.

  Jungshik




Re: [Fonts]Xft/fontconfig and Non-BMP characters

2002-12-03 Thread Jungshik Shin



On Tue, 3 Dec 2002, Jungshik Shin wrote:

   Attached is my patch (a bit revised) to extend XftTextExtents16 to
 support UTF-16 and to fix a typo in fcstr.c of fontconfig (which makes the
 conversion from UTF-16 to UCS-4 not work correctly for characters in

  Sorry, I forgot to attach it. This time, it's really attached.

  Jungshik

Index: xc/lib/fontconfig/src/fcstr.c
===================================================================
RCS file: /cvs/xc/lib/fontconfig/src/fcstr.c,v
retrieving revision 1.10
diff -u -r1.10 fcstr.c
--- xc/lib/fontconfig/src/fcstr.c   2002/08/31 22:17:32 1.10
+++ xc/lib/fontconfig/src/fcstr.c   2002/12/04 03:10:13
@@ -282,8 +282,8 @@
         */
        if ((b & 0xfc00) != 0xdc00)
            return 0;
-       result = ((((FcChar32) a & 0x3ff) << 10) |
-                 ((FcChar32) b & 0x3ff)) | 0x10000;
+       result = (((((FcChar32) a & 0x3ff) << 10) |
+                 ((FcChar32) b & 0x3ff))) + 0x10000;
     }
     else
        result = a;
Index: xc/lib/Xft/xftextent.c
===================================================================
RCS file: /cvs/xc/lib/Xft/xftextent.c,v
retrieving revision 1.9
diff -u -r1.9 xftextent.c
--- xc/lib/Xft/xftextent.c  2002/10/11 17:53:02 1.9
+++ xc/lib/Xft/xftextent.c  2002/12/04 03:10:14
@@ -147,6 +147,11 @@
        free (glyphs);
 }
 
+#define IS_HIGH_SURROGATE(u) (((FcChar16) (u) & 0xfc00L) == 0xd800L)
+#define IS_LOW_SURROGATE(u)  (((FcChar16) (u) & 0xfc00L) == 0xdc00L)
+#define SURROGATE_TO_UCS4(h,l) (((((FT_UInt) (h) & 0x03ffL) << 10) | \
+                                ((FT_UInt) (l) & 0x03ffL)) + 0x10000L)
+
 void
 XftTextExtents16 (Display           *dpy,
                   XftFont           *pub,
@@ -156,6 +161,7 @@
 {
     FT_UInt         *glyphs, glyphs_local[NUM_LOCAL];
     int             i;
+    int             nglyphs = 0;
 
     if (len <= NUM_LOCAL)
        glyphs = glyphs_local;
@@ -169,8 +175,19 @@
        }
     }
     for (i = 0; i < len; i++)
-       glyphs[i] = XftCharIndex (dpy, pub, string[i]);
-    XftGlyphExtents (dpy, pub, glyphs, len, extents);
+    {
+       if (IS_HIGH_SURROGATE(string[i]) && i + 1 < len &&
+           IS_LOW_SURROGATE(string[i + 1]))
+       {
+           glyphs[nglyphs++] = XftCharIndex (dpy, pub,
+               SURROGATE_TO_UCS4(string[i], string[i + 1]));
+           ++i;
+       }
+       else
+           glyphs[nglyphs++] = XftCharIndex (dpy, pub, string[i]);
+    }
+
+    XftGlyphExtents (dpy, pub, glyphs, nglyphs, extents);
     if (glyphs != glyphs_local)
        free (glyphs);
 }



[Fonts]Xft and Non-BMP characters

2002-12-01 Thread Jungshik Shin


Hi,

While trying to make Mozilla-Xft support non-BMP characters with fonts
like CODE2001.TTF (with pid=3/eid=10 Cmap), I found that freetype
and Xft need a little change. Details are sent to linux-utf8 list
(http://mail.nl.linux.org/linux-utf8/2002-12/msg0.html) and Bugzilla
(http://bugzilla.mozilla.org/show_bug.cgi?id=182877). Below is
a part of my message to linux-utf8 list related to Xft.

-
  I also had to extend XftTextExtents16() included in fcpackage-2.1
to deal with UTF-16 (instead of UCS-2). Xft2 has XftDrawStringUtf16() in
addition to XftDrawString16() (the latter is for UCS-2).  I thought about
adding XftTextExtentsUtf16(), but it appears to be more convenient for
programs like Mozilla, which use UTF-16 for their internal string
representation, if XftTextExtents16() itself is extended to support UTF-16.
Again, there's a little speed penalty.
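
  The extension boils down to walking the 16-bit string and folding each
surrogate pair into one code point before the glyph lookup. Below is a
rough standalone sketch of that walk with no Xft calls at all; the
function name utf16_to_ucs4 and its interface are assumptions made for
illustration, not an existing Xft or fontconfig API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IS_HIGH_SURROGATE(u) (((uint16_t) (u) & 0xfc00) == 0xd800)
#define IS_LOW_SURROGATE(u)  (((uint16_t) (u) & 0xfc00) == 0xdc00)

/* Hypothetical helper: converts len UTF-16 code units into UCS-4 code
 * points written to out (which must have room for len entries) and
 * returns the number of code points produced. */
size_t utf16_to_ucs4(const uint16_t *in, size_t len, uint32_t *out)
{
    size_t i, n = 0;

    assert(in != NULL && out != NULL);
    for (i = 0; i < len; i++)
    {
        if (IS_HIGH_SURROGATE(in[i]) && i + 1 < len &&
            IS_LOW_SURROGATE(in[i + 1]))
        {
            out[n++] = ((((uint32_t) in[i] & 0x3ff) << 10) |
                        ((uint32_t) in[i + 1] & 0x3ff)) + 0x10000;
            ++i;    /* the low surrogate has been consumed as well */
        }
        else
            out[n++] = in[i];   /* BMP character or unpaired surrogate */
    }
    return n;
}
```

  The count returned can be smaller than len, which is why a caller has
to pass that count (and not len) on to whatever consumes the result.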

  Below are links to FT2 patch (against 2.1.3) and Xft patch
  (against fcpackage 2.1)

 http://bugzilla.mozilla.org/attachment.cgi?id=107852 : FT2 patch
 http://bugzilla.mozilla.org/attachment.cgi?id=107858 : Xft patch

There are a couple of screenshots  along with Mozilla patch and
a couple of sample pages with non-BMP characters at

 http://bugzilla.mozilla.org/show_bug.cgi?id=182877
---

 Extending XftTextExtents16() to support UTF-16 is similar to
the extension of Win32 'W' APIs to support UTF-16. Some people may not
like it. However, it seems not so bad an idea and I even think that
XftDrawString16 may as well be extended in a similar manner. I'm not a
big fan of UTF-16, but neither am I very much against it.

 IMHO, it'd be very nice if  either extending XftTextExtents16() or
adding a new function XftTextExtentsUtf16 (a la XftDrawStringUtf16() )
is done before the release of XFree86 4.3.

  Keith, what would you say?

  Jungshik






[Fonts]Re: making editable charset/lang in fonts.conf

2002-10-23 Thread Jungshik Shin



On Tue, 22 Oct 2002, Keith Packard wrote:

Thank you for your explanation.

 Around 12 o'clock on Oct 22, Jungshik Shin wrote:

1. get a pattern from an application(fontconfig client)
2. apply configuration-specified editing rules to the pattern.
For each font:
  3. read in font properties from fc-cache or (directly from font if
   fc-cache is not present)
  4. measure the distance between the pattern and each font

 Fontconfig reads the font properties at startup time, and thereafter only
 when they change (it checks file mod times when fonts are listed).

  I see. So, step 3 should be at the top.

 What we could do is add a set of rules executed when the patterns are
 loaded  although I'm not sure that's precisely what you want,


  More specifically, you meant 'the patterns holding font properties
are loaded from font-cache files', didn't you? If so, that's what I want.

<match target="font">
  <test qual="any" name="family"><string>FAMILY</string></test>
  <edit name="charset" mode="MODE">
  <charset>...</charset></edit>
</match>

where 'MODE' can be 'add' ('append'/'prepend' would do just as well),
'subtract' or 'assign' (or something to that effect). Because 'charset'
is already taken for the Base85 representation of the coverage, a new
property name (e.g. coderange) that is translated to charset internally
might be used, or 'charset' can be overloaded to mean a more
human-readable representation of the coverage in fonts.conf
(sth. like [0x-0x]). For instance, I want the following to be
applied to the font properties of 'Gulim Old Hangul Jamo' (a hack-encoded
font whose character/glyph assignment has NOTHING to do with the actual
Unicode character assignment) as read from fc-cache, BEFORE matching
against an application-provided pattern (by measuring the distance) begins.

<match target="font">
  <test qual="any" name="family">
    <string>Gulim Old Hangul Jamo</string>
  </test>
  <edit name="charset" mode="subtract">
    <charset>0x4e00-0x5400</charset> // remove hack-encoding code points
  </edit>
  <edit name="charset" mode="add">
    <charset>0x1100-0x11ff</charset> // Hangul Jamos
  </edit>
  <edit name="charset" mode="add">
    <charset>0x302e-0x302f</charset> // Hangul Tone marks
  </edit>
  <edit name="charset" mode="add">
    <charset>0xac00-0xd7a4</charset> // Hangul syllables
  </edit>
  <edit name="lang" mode="assign">
    <string>ko</string>
  </edit>
</match>

Another example is for Baekmuk Batang which doesn't have glyphs
for U+1100-U+11FF, but can be used to render them nonetheless.

<match target="font">
  <test name="family">
    <string>Baekmuk Batang</string>
  </test>
  <edit name="charset" mode="add">
    <charset>0x1100-0x11ff</charset> // Hangul Jamos
  </edit>
  <edit name="charset" mode="add">
    <charset>0x302e-0x302f</charset> // Hangul Tone marks
  </edit>
</match>
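
  In effect, the 'add' and 'subtract' modes proposed here are set
operations on a font's coverage. The sketch below shows that idea with a
plain bitmap over the BMP. It is only an illustration under my own
assumptions: the Coverage type and the coverage_* functions are invented
for this example and are not how fontconfig stores or edits its charset.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BMP_SIZE 0x10000u   /* this sketch only tracks the BMP */

/* Hypothetical stand-in for a font's coverage: one bit per code point. */
typedef struct {
    uint8_t bits[BMP_SIZE / 8];
} Coverage;

void coverage_clear(Coverage *c)
{
    memset(c->bits, 0, sizeof c->bits);
}

/* mode "add": mark every code point in [first, last] as covered */
void coverage_add_range(Coverage *c, uint32_t first, uint32_t last)
{
    uint32_t cp;

    assert(first <= last && last < BMP_SIZE);
    for (cp = first; cp <= last; cp++)
        c->bits[cp >> 3] |= (uint8_t) (1u << (cp & 7));
}

/* mode "subtract": mark every code point in [first, last] as not covered */
void coverage_subtract_range(Coverage *c, uint32_t first, uint32_t last)
{
    uint32_t cp;

    assert(first <= last && last < BMP_SIZE);
    for (cp = first; cp <= last; cp++)
        c->bits[cp >> 3] &= (uint8_t) ~(1u << (cp & 7));
}

/* Returns 1 if cp is covered, 0 otherwise (including anything above the BMP) */
int coverage_has(const Coverage *c, uint32_t cp)
{
    return cp < BMP_SIZE && ((c->bits[cp >> 3] >> (cp & 7)) & 1);
}
```

  Applying the subtract rule first and the add rules afterwards, as in
the 'Gulim Old Hangul Jamo' example, leaves a coverage that no longer
claims the hack-encoded Han range but does claim the Jamo block.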


 it would  significantly impact application startup performance.

  Would just adding the feature to fontconfig have this significant
negative impact even in absence of editing rules for this feature in
fonts.conf?  Or, would that negative impact manifest itself ONLY when
fonts.conf actually has editing rules for this feature? If the latter
is the case, the decision/choice would fall on end-users, wouldn't it?
If they think they can exchange a performance hit at application start-up
for a feature they desperately need, they would go for it.  Otherwise,
they wouldn't put any editing rule to be applied at font-properties
loading stage.


 It seems like you want to select fonts based on Unicode coverage of the
 desired Hangul representation.

  Actually, that's related but not exactly what I want.  I may have
confused you because last week I mentioned multiple ways of representing
Hangul along with what I'm talking about here.   Or, am I
misunderstanding you?

  I'd like to override the 'charset' property (Unicode coverage) of some
fonts as detected by fontconfig and stored in font-cache, because the
detected value of the 'charset' property doesn't represent their 'true'
ability due to the hack-encodings used in them. In the example above,
'Gulim Old Hangul Jamo' is detected to cover [U+4E00-U+52xx] and
[U+0000-U+007F], but I'd like that detected Unicode coverage to be modified
by rules specified in fonts.conf to reflect its 'true' ability
([U+1100-U+11FF], [U+AC00-U+D7A4], etc.)

 Arguably, this could be useful for more general cases than I made
it sound.  Some fonts have precomposed Latin/Cyrillic/Greek letters
with diacritics, but they may not have combining diacritics themselves.
When a client of fontconfig like Pango tries to render a sequence of
a base character and a diacritic mark with such a font, Pango *seems*
to end up with two separate fonts, one for the base character (the font
specified in an application) and the other for the diacritic mark, even
though the first font (specified in the application) has a glyph for the
precomposed character made up of the base character and the diacritic
mark. Although this kind of problem can be solved at a level different
from fontconfig, fontconfig could well help here.

 You could easily add

Re: [Fonts]fontconfig peculiarity(??)

2002-10-18 Thread Jungshik Shin



On Fri, 18 Oct 2002, Keith Packard wrote:

 Around 7 o'clock on Oct 18, Jungshik Shin wrote:

  For some unknown reason, 'New Gulim' is picked up by 'fontconfig' or 'Xft'
  for certain characters when CODE2000 is explicitly requested by
  applications like Mozilla and gedit (via Pango). More specifically, those
  characters are U+115F (Hangul leading consonant filler) and
  U+1160 (Hangul vowel filler).

 Fontconfig has a kludge to weed out fonts with broken encoding tables;
 such fonts often have encoding table entries pointing at blank glyphs
 which aren't supposed to be blank.  It checks each glyph in the encoding
 and ignores those which are inappropriately blank.

 which are expected to be blank, that list was derived from a similar table
 in Mozilla.  Blank glyphs not in the table are assumed to represent broken

  Keith, we talked about this a month ago (Sep. 7th) on this very list
:-) You came up with a much more extensive list of characters than
Mozilla's blank glyph list. I also filed a bug for Mozilla-Windows
(http://bugzilla.mozilla.org/show_bug.cgi?id=167136).   You must have
forgotten about it. :-) I added those two characters to the blank glyph
list in /etc/fonts/fonts.conf then. In addition, both Ngulim and Code2000
have blank glyphs for both characters. The only difference is that in
Ngulim they're both *spacing* (width > 0) while in Code2000 only U+115F
is spacing and U+1160 is non-spacing (width = 0). So, even if my blank
glyph list didn't have them, there's no reason I can think of why Ngulim
would be preferred over Code2000 for those characters. If they're equal on
this count, the explicit request should take precedence, shouldn't it?

 One possible explanation is that Code2000 isn't marked as supporting
'ko' in font-cache for some reason while Ngulim is. However, both fonts
have more or less similar coverage of Korean characters (the full set
of precomposed syllables, Hangul Conjoining Jamos and other symbols
in KS X 1001). So, this is a bit of a mystery, too.

 weren't included in the table.   This means that no font will ever be
 listed as supporting these glyphs, so Mozilla will pick the first font in
 the match list to draw them with, expecting that this will produce a
 missing glyph indication.

  BTW, could it be possible to 'deceive' or 'force' fontconfig
to believe that a certain font covers a certain range of Unicode
even if it doesn't appear to? I guess it's not possible at the moment,
but wouldn't it be nice to add it? What I'm thinking of is something
like this:

<match target="font">
  <test qual="any" name="family"><string>Gulim Old Hangul Jamo</string></test>
  <edit name="coverage" mode="assign" binding="strong">
  <coderanges>...</coderanges></edit>
</match>

where coderanges is a comma-separated list of Unicode code points
(integers) or code ranges (sth. like [0x-0x]).

I found in font cache file that charset property does exactly the
thing I want to do with 'coderanges'. If so, would it be possible
to use 'charset' to achieve what I described above?  Well, I've
gotta figure out how  'charset' represents Unicode ranges.

Some fonts have a hack-encoding (although advertised as Unicode)
and their real coverage cannot be guessed at all by fontconfig from
the Unicode cmap. An application or library aware of this hack-encoding
can still make use of them, though. However, fontconfig does not appear to
return such a font even when it is explicitly requested; it comes up with
a fallback after an 'intelligent guess' instead, because what it thinks
a hack-encoded font can cover does not match at all the range of
Unicode the application wants to draw with the font.

It'd be also nice to be able to do  something similar with 'lang' tag.
I thought the following line would *make* fontconfig *believe* (*ignoring*
what it finds out with OS lang tag and orthography map) that 'Gulim Old
Hangul Jamo' is suitable for Korean,  but it doesn't seem to work. Did
I do anything wrong?

---
<match target="font">
<test qual="any" name="family"><string>Gulim Old Hangul Jamo</string></test>
<edit name="lang" mode="assign" binding="strong"><string>ko</string></edit>
</match>
---

  Both of these certainly look like a hack, but some applications
(perhaps mathml, Indic script handling, Korean alphabet handling...)
need them until OTFs are widely available. Related problems are discussed
at

  http://bugzilla.mozilla.org/show_bug.cgi?id=126919#c315 and the comments
referenced therein.

  http://bugzilla.mozilla.org/show_bug.cgi?id=95708


   Jungshik




Re: [Fonts]fontconfig peculiarity(??)

2002-10-18 Thread Jungshik Shin
On Fri, 18 Oct 2002, Keith Packard wrote:

 Around 12 o'clock on Oct 18, Jungshik Shin wrote:

  One possible explanation is that Code2000 isn't marked as supporting 'ko'
  in font-cache for some reason while Ngulim is.

  This explanation only makes sense when those two chars are NOT
included in the blank glyph list, doesn't it?  As I wrote, they have
been in the blank glyph list in my fonts.conf since early September.

  Hmm, things are getting more interesting. After I removed Ngulim.ttf
from my font path and then put it back (I ran fc-cache before testing),
suddenly Mozilla picks up U+1160 glyph from Code2000. The same is true of
'gedit' when Code2000 is specified as a font to use. Is it at the
whim of electrons whirling around inside my computer :-) ?


 If your font specification includes language, this would cause Ngulim to
 be preferred over Code2000 if both are added to the pattern in the config
 file.  If the application explicitly names 'Code2000' as a family name,
 then the language shouldn't matter.

  The page in question (http://jshin.net/i18n/korean/hunmin.html
and http://jshin.net/i18n/korean/hunmin_comp.html) specifies font-family
to be CODE2000 explicitly with CSS. I assume this will make Mozilla with
Xft enabled ask fontconfig for that font explicitly.

  As for Pango(gedit), I'm less certain because I don't know whether
Pango specifies language when sending  fonts request down(or up) the
road.

  Therefore, my original mystery still remains a mystery :-)

 Code2000 isn't marked as supporting Korean as it is missing a large number
 of Han glyphs, totalling some 3136 characters from the KSC 5601-1992
 encoding.  Many Korean documents will not be completely covered by this

  Sorry, I didn't check Han glyphs; I only checked that it has the full set
of precomposed Hangul syllables (11,172 of them). As I suggested before, a
kind of multi-level orthography check may be necessary to cope with situations
like this. Or, would it be possible for users to manually override
what fontconfig *detects* (both code range coverage and lang) in
fonts.conf, as suggested in my previous email?

   Jungshik




[Fonts]blank glyph list in fonts.config

2002-09-07 Thread Jungshik Shin


Since the release of a new CODE2000 font(by James Kass at
http://home.att.net/~jameskass) with glyphs for Hangul Jamos, I've
been trying to test how it works with various browsers. Mozilla
with direct access to  truetype fonts works fine, but Mozilla
with Xft patch has a problem with U+115F(Hangul leading consonant
filler) and U+1160(Hangul vowel filler). In CODE2000, the former
is a spacing(non-zero width) _blank_ glyph while the latter is a
non-spacing(zero-width/combining) _blank_ glyph. When Mozilla with
Xft patch is used to render http://jshin.net/i18n/korean/fillers.html
(or http://jshin.net/i18n/korean/hunmin.html), U+115F and U+1160 are
rendered with hollow boxes instead of spacing and non-spacing (combining)
blanks, seemingly because they're not listed among the characters allowed
to have blank glyphs.   I say 'seemingly' because Mozilla with the Xft patch
has this problem while 'gedit' doesn't.
Anyway, adding U+115F and U+1160 to the list in fonts.conf
solved the problem.

Two screenshots are put up at

http://linux.mizi.com/~ganadist/filler1.png  (with U+115F/U+1160 added
 to blank glyph list)
http://linux.mizi.com/~ganadist/filler2.png  (without )

Mozilla for MS-Windows has a similar problem and I came up with
a similar fix that works. See
http://bugzilla.mozilla.org/show_bug.cgi?id=167136.

I'm not sure adding U+115F/U+1160 to the blank glyph list is the best
way, but it works. Keith, could you consider this?

Thank you,

Jungshik




Re: [Fonts]blank glyph list in fonts.config

2002-09-07 Thread Jungshik Shin




On Sat, 7 Sep 2002, Keith Packard wrote:

 Around 9 o'clock on Sep 7, Jungshik Shin wrote:

  I'm not sure adding U+115F/U+1160 to the blank glyph list is the best
  way, but it works. Keith, could you consider this?

 The blank glyph list is supposed to be filled with all of the Unicode
 values which have an empty visual representation.  It's a hack to work
...

 I adapted the data I found in Mozilla for this purpose, hence the similar
 issues you found in the two programs.

  Thank you for going through the Unicode  tables to
come up with a more extensive list.  I've just posted your list to bugzilla
bug 167136 mentioned previously.

 I'm reading through the Unicode tables looking for other blank values,
 so far I've found:

 Unicode range      added?  comments

 U+180B - U+180E    no      (but I don't have a Mongolian font to check against)
 U+200C - U+200F    yes     (the Unicode description isn't clear)
 U+2028 - U+2029    no      (these seem like they're supposed to be drawn)
 U+202A - U+202F    yes     (these also appear blank from the description)
 U+3164             yes     (HANGUL FILLER, similar to U+1160)
 U+FEFF             yes     (byte order detector (ZERO WIDTH NO-BREAK SPACE))
 U+FFA0             yes     HALFWIDTH HANGUL FILLER (similar to U+3164)
 U+FFF9 - U+FFFB    yes     INTERLINEAR ANNOTATION marks for furigana

 Rules for inclusion -- if a font could reasonably draw these as blank,
 they should be included in the list.  The idea is to ignore empty glyphs
 which should always have some visual representation.
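
  The rule quoted above can be sketched as a lookup in a sorted table of
ranges. The table below holds only the ranges marked "yes" in the quoted
list plus U+115F/U+1160 from my earlier message; it is not the actual
list shipped in fonts.conf, and the names blank_allowed and glyph_counts
are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t first, last; } Range;

/* Illustrative subset only, sorted by code point. */
static const Range blank_ok[] = {
    { 0x115F, 0x1160 },   /* Hangul fillers */
    { 0x200C, 0x200F },
    { 0x202A, 0x202F },
    { 0x3164, 0x3164 },   /* HANGUL FILLER */
    { 0xFEFF, 0xFEFF },
    { 0xFFA0, 0xFFA0 },   /* HALFWIDTH HANGUL FILLER */
    { 0xFFF9, 0xFFFB },
};

/* Binary search: may this code point legitimately have an empty glyph? */
int blank_allowed(uint32_t cp)
{
    size_t lo = 0, hi = sizeof blank_ok / sizeof blank_ok[0];

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (cp < blank_ok[mid].first)
            hi = mid;
        else if (cp > blank_ok[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* A glyph counts toward coverage unless it is blank where it should not be. */
int glyph_counts(uint32_t cp, int glyph_is_blank)
{
    return !glyph_is_blank || blank_allowed(cp);
}
```

  With such a table, a blank glyph at U+1160 is accepted as real
coverage, while a blank glyph at U+AC00 is treated as a broken encoding
entry and ignored.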

  I think that U+200C/U+200D (ZWNJ, ZWJ) are meant to be used mainly
(though not exclusively; Latin ligature formation may also be controlled
by them) with Indic scripts, and fonts for Indic scripts are supposed to
have some OT tables built in to map a sequence of characters including
ZWNJ/ZWJ to appropriate glyph(s). As for U+200E/U+200F and U+202A -
U+202F, I guess a lower-level layer like fontconfig is never supposed to
see them because they have to be taken care of at a higher level (layout,
e.g. Pango?).  Nonetheless, it seems harmless (except for a
little performance hit, if any) to include them in the blank glyph list.
The same appears true of U+FFF9 - U+FFFB.

  BTW, although deprecated, U+206A - U+206D seem to have to be included
as well.  U+206E and U+206F may or may not have to be added. I'm not
sure what they're for.

  Jungshik




Re: [Fonts]Korean orthography for fontconfig

2002-08-14 Thread Jungshik Shin




On Wed, 14 Aug 2002, Owen Taylor wrote:

 The current Korean orthography looks like a combination
 of KSC-5607.1987 with the complete Hangul Syllables
 area of Unicode.

 I'm sorry to be 'pedantic'.  Strictly speaking, this way of talking
about Korean orthography (in terms of precomposed syllables) is not quite
right.  You have to say what consonants and vowels are allowed/required in
modern Korean orthography just like you talk about what alphabetic letters
are required of any given language represented with Latin/Greek/Cyrillic
alphabets.

 However, there are fonts out there that only have
 the Hangul syllables in  KSC-5607.1987 ... one example
 would be the freely available 'Baekmuk Batang' font;

  Not any more. A new set of Baekmuk fonts with
full coverage of the 11,172 precomposed modern syllables has been
available for quite a while (over two years?)  although they may not
have been included in popular Linux distributions made outside Korea.
You can get them at
ftp://ftp.mizi.com/pub/baekmuk/baekmuk-ttf-2.1.tar.gz.
In addition to having the full set of 11,172 syllables (precomposed,
modern, complete), several glitches have been fixed.

 such fonts are *not* currently recognized as supporting
 Korean.

 Nonetheless, you do have a point and I totally agree with you
on it.

 If this was just a matter of preferring fonts with
 all the Hangul syllables in Unicode when all other things
 are equal, then this wouldn't be a big problem, but

  This is a reasonable thing to do.


 it's more serious than this:

  - You can't specify such a font in a generic alias,
and have it preferentially selected for Korean language
tags.

  - You can't specify such a font in a generic alias,
and have it selected at all if you have fonts
with the complete orthography.

  - fontconfig statements like disable hinting for
Korean fonts don't work properly with such a font.

  These are certainly problematic.

 I think the right thing to do is probably just to use
 only the KSC-5607.1987 syllables in the Korean orthography;
 my understanding is that they are sufficient for the
 vast majority of modern Korean text.

  I would omit 'vast'. :-).

   Thanks to the dominance of MS-Windows in Korea as the leading
desktop platform, Koreans are no longer restricted to 2350
syllables. (In the past, they resorted to the JOHAB encoding to achieve
the same.)  MS Windows supports CP949 (an extension of EUC-KR based on KS X
1001:1998) and ordinary Korean users have no way of knowing whether a
syllable they type in belongs to KS X 1001:1998 or not. The result is that
more and more documents in Korean include syllables outside KS X 1001:1998.
This is especially so in web BBS', emails and on-line chatrooms, where
'colloquial' Korean with intentional/unconscious use of
non-orthography-compliant syllables is widely 'spoken'. (It had better be
called the 'slang' of the net subculture, oftentimes cryptic to people like
me.) See http://bugzilla.mozilla.org/show_bug.cgi?id=131388.

  Even under Linux, there's no more restriction because ko_KR.UTF-8
locale can be used with Korean input method Ami
(with my patch to allow input of  all 11,172 syllables:
http://jshin.net/faq/ami-1.0.11.utf8.patch.gz. It'd be nice if
distributions like RH and Mandrake pick up this patch so that Linux
users can be on par with MS Windows users.) Probably, the same is
true of MacOS X.

   Jungshik




Re: [Fonts]Korean orthography for fontconfig

2002-08-14 Thread Jungshik Shin




On Wed, 14 Aug 2002, Owen Taylor wrote:
 Jungshik Shin [EMAIL PROTECTED] writes:

  On Wed, 14 Aug 2002, Owen Taylor wrote:
 
   The current Korean orthography looks like a combination
   of KSC-5607.1987 with the complete Hangul Syllables
   area of Unicode.
 
   I'm sorry to be 'pedantic'.  Strictly speaking, this way of talking
  about Korean orthography (in terms of precomposed syllables) is not quite
  right.  You have to say what consonants and vowels are allowed/required in
  modern Korean orthography just like you talk about what alphabetic letters
  are required of any given language represented with Latin/Greek/Cyrillic
  alphabets.

 I'm not sure I understand your objection here.


 But it is just a matter of terminology...

  I'm sorry I got you confused. For a moment, I forgot
that 'orthography' in fcpackage context has a specialized meaning
different from its usual meaning. I was way too 'pedantic', writing
the paragraph above from the point of view of an 'amateur linguist'.
In the Korean orthography standards (of both ROK and DPRK), only the
consonants and vowels allowed are enumerated, as opposed to listing all
their possible combinations, because listing consonants and vowels is more
than enough. However, in fcpackage context, the situation is different.

 I'd say they definitely are composed syllables. And since it is possible
 to render Korean syllables by combining pieces at rendering time
 (Pango can do this for core X fonts, e.g.),

  I have more to ask/suggest  about Pango's rendering of U+1100 Jamos
and other issues in Korean rendering(e.g. Uniscribe-like OT support for
Korean). I'll try to do that soon offline.


   However, there are fonts out there that only have
   the Hangul syllables in  KSC-5607.1987 ... one example
   would be the freely available 'Baekmuk Batang' font;

Not any more. A new set of Baekmuk fonts with
  the full coverage of 11,172 precomposed modern syllables have been
...
  ftp://ftp.mizi.com/pub/baekmuk/baekmuk-ttf-2.1.tar.gz.
  In addition to having the full set of 11,172 syllables (precomposed,

 I just downloaded that, and it looks like the 'Dotum' font
 still only covers the KSC-5607.1987, just like in the
 baekmuk-ttf-2.0.tar.gz that Red Hat ships currently.

   You're right. Dotum still has only 2350 syllables.
Now this brings us back to the problem you raised. Basically, I agree
with you that fonts with only KS C 5601-1987 coverage have to be regarded
as supporting Korean by fontconfig. In particular, this loosening of the
criteria is also required by bdf/pcf fonts or bdf-turned-sbit-only
TTFs (which will replace bdf/pcf fonts sometime in the future according
to what's been discussed today).

  How about introducing a 'level' concept to fontconfig?
Characters in level 1 are absolutely required (in the case of Korean,
2350 Hangul syllables and some more in the symbol block of KS X 1001:1998).
Level 2 has some optional characters (for Korean, it'd be the additional
8000+ syllables and 4800+ Hanjas in KS X 1001:1998), level 3 has even rarer
characters (for Korean, it'd be Hanjas in KS X 1002) and so on.

   I think the right thing to do is probably just to use
   only the KSC-5607.1987 syllables in the Korean orthography;
   my understanding is that they are sufficient for the
   vast majority of modern Korean text.
 
I would omit 'vast'. :-).
 
 Thanks to the dominance of MS-Windows in Korea as the leading
  desktop platform, Koreans are not any more restricted to 2350
...
  http://bugzilla.mozilla.org/show_bug.cgi?id=131388).

 I defer to your expertise in this area.

  I'd just like to make it clear that this was only meant to describe
the current situation in Korean materials on the net and that
I still agree with your suggestion about the 'ko.orth' file in
fcpackage.


  locale can be used with Korean input method Ami
  (with my patch to allow input of  all 11,172 syllables:
  http://jshin.net/faq/ami-1.0.11.utf8.patch.gz. It'd be nice if
...

 Is there any reason that it hasn't gotten into the standard AMI?

  I'd also like to know :-). I sent the patch to both the
maintainer/author of Ami and the Ami mailing list, where he is active,
in late April/early May, but somehow I haven't heard back from him.
Perhaps I'll try once more to contact him.

  BTW, to make it work under ko_KR.UTF-8, the XLC_LOCALE file for
ko_KR.UTF-8 should list ksc5601.1987-0 before jisx0208.1983-0.


   Jungshik





Re: [Fonts]Font family name problem

2002-07-22 Thread Jungshik Shin




On Mon, 22 Jul 2002, Keith Packard wrote:

 Around 8 o'clock on Jul 22, Brian Stell wrote:

  Will there be a way to get the localized name using the ascii only name?

  How about the other way around? Given a localized name+lang, would
it be possible to get the ascii name? Put differently, would there be
a way to access the mapping from a localized name+lang to a font(or
ascii name/canonical name)?

 Yes.  The representation of the names internally includes all of the
 localized names along with the postscript name (which is always ASCII), any
 match or list result will include all of these names. I would sort the
 names so that any English or Latin names would come first in the list.

  Reading this, I think it should be possible, but is there an API
for that?

  A number of web pages have embedded or separate CSS with
only 'localized font (family) names'. Web browsers or any other applications
accessing those CSS' need to map localized names+lang (assuming that
lang info. is available in one way or another) to fonts. Although
it appears that they can roll out their own for this purpose,
wouldn't it be nice to have this in fontconfig?

  Thanks,

  Jungshik




Re: [Fonts]Re: [I18n]Using current locale in font selection

2002-07-09 Thread Jungshik Shin




On Tue, 9 Jul 2002, Keith Packard wrote:

 Ok, so now what do I do with applications which haven't called
 setlocale (LC_ALL, "")?  Do I:

   a)  call setlocale (LC_ALL, "") myself?

  I'm afraid this can have an unexpected side effect, which could
surprise/upset some application program developers.

   b)  use $LANG or $LC_CTYPE?

  If this road is taken, it has to be determined which env.
variables have to be referred to in what order. AFAIK, SUS and POSIX say
that it's implementation-dependent. Since XF86 is used with many OS',
it'd be best to follow the 'local' convention. Then, I don't know how
to figure it out without calling setlocale(LC_CTYPE, ""). In case of Glibc,

If $LC_ALL is set, use it
else if $LC_CTYPE is set, use it
else if $LANG is set,  use it.
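
  Written out in C, that order would look something like the sketch
below. The helper name pick_ctype_locale is made up, and treating an
empty string the same as an unset variable is an assumption of this
sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Treat NULL and "" alike as unset (an assumption of this sketch). */
static int is_set(const char *v)
{
    return v != NULL && v[0] != '\0';
}

/* Hypothetical helper: picks the value that governs LC_CTYPE, following
 * the order listed above. Returns NULL when none of the three is set. */
const char *pick_ctype_locale(const char *lc_all, const char *lc_ctype,
                              const char *lang)
{
    if (is_set(lc_all))
        return lc_all;
    if (is_set(lc_ctype))
        return lc_ctype;
    if (is_set(lang))
        return lang;
    return NULL;
}
```

  It would be called as pick_ctype_locale(getenv("LC_ALL"),
getenv("LC_CTYPE"), getenv("LANG")), with getenv() from <stdlib.h>.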

   c)  Ignore the locale information and leave the
   font language preference unset?

  This might well be the best course, along with documenting
that setlocale() should be called to make font matching/selection locale
dependent, or, better still, that lang info should be provided explicitly
when invoking font selection APIs.

  Jungshik




[Fonts]Re: [I18n]Using current locale in font selection

2002-07-08 Thread Jungshik Shin




On Mon, 8 Jul 2002, Keith Packard wrote:


 Around 14 o'clock on Jul 8, Owen Taylor wrote:

  +locale = (FcChar8 *)setlocale (LC_CTYPE, NULL);

 Don't you mean LC_MESSAGES?

  I believe it should be LC_CTYPE. Some people like me
have the following because English menu and (error) messages are easier
to understand than not-so-good translation.


  LC_CTYPE=ko_KR.eucKR
  LC_MESSAGES=C
  LC_PAPER=en_US   # because the US doesn't use ISO std. paper size
  .

  or

  LC_CTYPE=ko_KR.UTF-8
  LC_MESSAGES=en_US.UTF-8
  LC_PAPER=en_US.UTF-8
  .


 If so, I think we should be able to use this
 return value almost raw; stripping out the language and territory codes and
 passing them in as FC_LANG, right?

  Did you mean that only codeset part is relevant here and we can
go without relying on lang and territory codes? The codeset  doesn't
carry any lang-specific information if UTF-8 locale is used.
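
  To make the point concrete, here is a minimal sketch of pulling the
language and territory out of a locale name while dropping the codeset.
The function name locale_to_lang and the lower-case "ko-kr" output form
are my own assumptions for illustration, not what fontconfig does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: copies the language[_territory] part of a POSIX
 * locale name such as "ko_KR.UTF-8" into out as a lower-case tag "ko-kr",
 * dropping the codeset (".UTF-8") and any "@modifier".
 * Returns the tag length, or 0 if the tag does not fit in outsize bytes
 * (out is then left unterminated) or the name is empty. */
size_t locale_to_lang(const char *locale, char *out, size_t outsize)
{
    size_t n = 0;

    assert(locale != NULL && out != NULL);
    while (locale[n] != '\0' && locale[n] != '.' && locale[n] != '@')
    {
        if (n + 1 >= outsize)
            return 0;   /* no room for this character plus the terminator */
        out[n] = (locale[n] == '_')
                     ? '-'
                     : (char) tolower((unsigned char) locale[n]);
        n++;
    }
    if (outsize > 0)
        out[n] = '\0';
    return n;
}
```

  Both "ko_KR.eucKR" and "ko_KR.UTF-8" yield the same tag, which is the
point: the language comes from the part before the dot, not from the
codeset. Names such as "C" or "POSIX" carry no language at all and
would need separate handling.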

   Jungshik




Re: [Fonts]Automatic 'lang' determination

2002-06-29 Thread Jungshik Shin




On Sat, 29 Jun 2002, Jungshik Shin wrote:
 On Fri, 28 Jun 2002, Keith Packard wrote:

  I'm confused by this; my exposure to Chinese fonts says that simplified
  Chinese and traditional Chinese have significant overlap in Unicode
  codepoints, but that the glyphs are quite a bit different in appearance.

   I doubt this is the case. As far as I can tell

  I found this needs some clarification.  If glyphs of 'A', 'B'
and 'C' from a Times Roman Latin-1 font are compared with the corresponding
glyphs from a New Century Schoolbook Latin-2 font, they certainly look
different. However, that does not mean that you cannot use the Times Roman
Latin-1 font to render a run of text in one of the languages Latin-2 is
meant for, as long as the Times Roman Latin-1 font has _all_ the glyphs
necessary in that particular run of text.

  I believe the same thing can happen between two fonts for
zh-TW and zh-CN. If glyphs from font A for zh-TW are compared with glyphs
from font B (with different design principles) for zh-CN, they will surely
look different. However, they're different not because font A is for zh-TW
and font B is for zh-CN but because they're designed to appear different.

  Chinese and traditional Chinese have significant overlap in Unicode
  codepoints, but that the glyphs are quite a bit different in appearance.

  To make this kind of comparison meaningful, you have to compare
two fonts, one for zh-TW and the other for zh-CN, made by a _single_
foundry with the _identical_ design principles and look and feel
(something like Adobe Times Roman Latin-1 font and Adobe Times Roman
Latin-2 font).

  In practice, it's hard to find two fonts that satisfy the criteria I
outlined here.  However, the ISO 10646 code charts for Han characters
should do almost as good a job.  That's why I suggested comparing glyphs
for PRC and Taiwan in the ISO 10646 Han character chart.

   Jungshik Shin




[Fonts]Han unification(SC and TC)(was..Re: Automatic 'lang' determination)

2002-06-29 Thread Jungshik Shin

On Sat, 29 Jun 2002, Keith Packard wrote:

Ooops. My message crossed yours in mail :-)

 Around 9 o'clock on Jun 29, Jungshik Shin wrote:

  IMHO, most problems with Han Unification arise not from using a _single_
  font targeted at one of zh_TW/zh_CN/ja/ko to render a run of text in
  another but from mixing _multiple_ fonts (with _drastically different_
  design principle and other differences like baseline) to render a single
...

 Yes, I agree -- this is true in Western languages as well where the


  We agree with each other on this point, but still get to different
conclusions about zh-CN and zh-TW. I'm afraid that's because you have
been misinformed about what Han unification has done about simplified
forms and traditional forms of Chinese characters.


  Suppose there's a document tagged as zh_TW that explains how PRC government
  simplified Chinese characters to boost the literacy rate after WW II. If a
  Big5 font (that doesn't cover all characters in the doc) is selected
  instead of a GBK/GB18030 font (with the full coverage), simplified Han
  characters(not used in Taiwan but only used in PRC) in the doc have to be
  rendered with another font (most likely GB2312/GBK/GB18030 font).

 A correct version of this document would tag individual sections of the
 document with appropriate tags.  This way, the zh_TW sections could be
 presented in a traditional Chinese font while the mainland portions are
 displayed with simplified Chinese glyphs.

  Well, even without language tagging, that would happen, which
I regard as _ugly_ for the reason I gave in my previous message.
Language tag or not, the result would be just as ugly as using TimesRoman
Latin-1 font for most characters with a couple of characters rendered with
Palatino Latin-2 font.  My hypothetical document would not have separate
sections for zh-TW and zh-CN, but rather occasional simplified forms of
Chinese characters (absent in Big5 fonts but present in GB2312/GBK/GB18030
fonts) would pop up among traditional forms of Chinese characters
(present in _both_ Big5 font and GBK/GB18030 fonts).

  IMHO, tagging the whole document as 'zh-TW' is perfectly valid,
and rendering it with a GBK/GB18030 font (with full coverage of the
characters in the document) is better than mixing two fonts, one with
Big5 coverage and the other with GBK/GB18030 coverage. The latter would
happen if you exclude GBK/GB18030 fonts for zh-TW text rendering.

  Tagging individual simplified forms of Chinese characters
with 'lang=zh-CN' in the sea of traditional forms of Chinese characters
would only lead to a less-desirable result than otherwise possible.
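
  The preference argued for above (one font that covers the whole run
rather than two fonts mixed together) can be sketched as below. The
struct and function names are hypothetical, and the code point sets in
any usage are toy data, not real Big5 or GBK tables.

```c
#include <assert.h>
#include <stddef.h>

/* A toy font description: just the code points it has glyphs for. */
struct toy_font {
    const char *name;
    const unsigned long *codepoints;
    size_t count;
};

/* Return 1 if the font has a glyph for cp (linear scan, sketch only). */
int toy_font_has(const struct toy_font *font, unsigned long cp)
{
    size_t i;

    for (i = 0; i < font->count; i++) {
        if (font->codepoints[i] == cp)
            return 1;
    }
    return 0;
}

/* Return the index of the first font that covers every code point in
 * the run, or -1 if no single font does and mixing is unavoidable. */
int pick_single_font(const struct toy_font *fonts, size_t nfonts,
                     const unsigned long *text, size_t ntext)
{
    size_t f, t;

    assert(fonts != NULL && text != NULL);
    for (f = 0; f < nfonts; f++) {
        for (t = 0; t < ntext; t++) {
            if (!toy_font_has(&fonts[f], text[t]))
                break;
        }
        if (t == ntext)
            return (int)f;
    }
    return -1;
}
```

  For a run containing both U+570B (traditional) and U+56FD (simplified),
a Big5-like font lacking U+56FD is skipped and a GBK-like font having
both is chosen for the entire run.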


   I'm not sure what you meant by 'glyph forms are more likely
  simplified'. You might have misunderstood some aspects of Han Unification
  in Unicode/10646.  In Unicode, simplified forms of Chinese characters are
  NOT unified with corresponding traditional forms of Chinese characters.

 You're right -- I didn't believe this to be the case.  I had heard that the
 unified portion within the BMP does co-mingle simplified and traditional
 forms, but that the non-BMP Han extension provides separate codepoints for
 each.

  I'm afraid what you have heard about the BMP is misleading, if
I understood you correctly. Whether in the BMP or not, simplified forms of
Chinese characters are NOT UNIFIED with traditional forms of Chinese
characters. (Let me copy my message to John H. Jenkins @Apple who knows a
lot more about Han Unification than I do.)  AFAIK, most complaints about
Han unification do NOT come from zh-CN vs zh-TW BUT from zh-CN/zh-TW
vs ja. For Han characters common to both zh-CN and zh-TW, there's no
significant difference in appearance between zh-CN and zh-TW. Although
many Japanese would not agree with me, I don't think there's any
significant difference across CJKV.  (Again, the ISO 10646 Han chart is a
good reference along with ROC MOE's Han character variant dictionary at
http://140.111.1.40) To me, Han Unification should have gone further (not
less far) in a sense, and it's worrisome to me that the non-BMP part
includes too many glyph variants (a whole bunch of them coming from Korean
Buddhist texts: see http://www.sutra.re.kr) that should have been unified
in my eyes.

 If even BMP codepoints are separate, then it should be possible to create
 a large set of codepoints which could mark fonts as suitable for the
 display of simplified Chinese which are distinct from the set of
 codepoints suitable for the display of traditional Chinese.   That would
 be nicer than my current kludge of marking any font suitable for
 traditional Chinese as unsuitable for simplified Chinese.

How about this?

   if covers most of GB 18030
      good for both zh-CN and zh-TW
      (and possibly good for ko)
   elif covers most of GBK
      good for both zh-CN and zh-TW
      (and possibly good for ko)
      not good for ja
   elif covers most of Big5
      good for zh-TW
      (and possibly good for ko)
      not good for ja
   elif covers most

Re: [Fonts]Automatic 'lang' determination

2002-06-29 Thread Jungshik Shin

On Sat, 29 Jun 2002, Yao Zhang wrote:

 It should be

   if (covers_almost_all_of (GB2312))
       font supports SIMPLIFIED Chinese
   if (covers_almost_all_of (Big5))
       font supports traditional Chinese

  After sending my previous message, I read this and I have to
agree with it. This is better than what I sent earlier.  Just forgetting
about GB18030/GBK coverage and concentrating on GB2312 and Big5 coverage
is simpler as well as better.
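
  A minimal sketch of such a coverage test follows. The name
covers_almost_all_of() comes from the quoted pseudocode, but its
signature, the sorted-array representation and the max_missing
tolerance are assumptions made for this illustration; the thread does
not say what "almost all" should mean numerically.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many required code points appear in the font's set.
 * Both arrays must be sorted ascending and free of duplicates. */
size_t count_covered(const unsigned long *required, size_t nreq,
                     const unsigned long *font, size_t nfont)
{
    size_t i = 0, j = 0, hits = 0;

    while (i < nreq && j < nfont) {
        if (required[i] < font[j]) {
            i++;                 /* required code point is missing */
        } else if (required[i] > font[j]) {
            j++;                 /* font has an extra code point */
        } else {
            hits++;
            i++;
            j++;
        }
    }
    return hits;
}

/* Return 1 if at most max_missing required code points are absent. */
int covers_almost_all_of(const unsigned long *required, size_t nreq,
                         const unsigned long *font, size_t nfont,
                         size_t max_missing)
{
    assert(required != NULL && font != NULL);
    return nreq - count_covered(required, nreq, font, nfont) <= max_missing;
}
```

  Because the two checks in the quoted proposal are independent 'if's
rather than an if/elif chain, a font covering both GB2312 and Big5 is
marked as supporting both simplified and traditional Chinese.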

  Jungshik Shin
