Re: Bidi status/help

Mikhael Goikhman Sat, 27 Apr 2002 16:35:15 -0500 (CDT)

On 26 Apr 2002 18:07:37 -0700, Nadim Shaikli wrote:
> 
> On Thu, 25 Apr 2002, Olivier Chapuis <olivier chapuis free fr> wrote:
> >
> > On Wed, Apr 24, 2002 at 12:41:38PM -0700, Nadim Shaikli wrote:
> > > 
> > > On Wed, 24 Apr 2002 06:56:05 +0200, "Olivier Chapuis" wrote:
> > > > 
> > > > - Displaying a string with a core *-iso10646-1 font: at the present
> > > > time this should work only with fvwm compiled with MB support, a utf8
> > > > locale and a libc that supports utf8 locales (also X should recognize
> > > > the locale as a utf locale: see 
> > > > /usr/X11R6/lib/X11/locale/locale.alias).
> > > > Unfortunately, I cannot test this as my libc does not support utf8
> > > > locale. Mikhael, can you confirm that this work?
> > > 
> > > Well, I'm not sure why all that is needed (I'm simply trying to display
> > > a few glyphs not alter my keyboard mapping, or specify date/currency
> > > settings, etc).  By the way, I'm able to use multiple Arabic-enabled
> > > utilities without having to touch any of what is noted above (locales,
> > > libc support, etc) given the availability of the fonts and a rendering
> > > engine (glyph displayer) and Bidi (fribidi) - one such application is
> > > vim-6.0/6.1.
> > 
> > Yes and you use a special font made for vim ... Maybe you start vim with
> > some options? Moreover, I think that vim use an other method than fvwm
> > to display strings. Also some applications (KDE2&3/GNOME2) use only utf8,
> > but this is not possible with fvwm (we cannot ask everybody to use only
> > utf8 in configuration files). Moreover, most of these applications use
> > (lib)iconv which can cause portability problem (basically we should also
> > ask everybody with a non (or old) GNU system to install gnu libiconv).
> 
> Just in passing, those fonts are not specific to vim and no special
> options are needed to use 'em (ie. start options).


This is not quite correct. You add these lines to your .vimrc:

  set guifont=-misc-fixed-medium-r-normal--20-200-75-75-c-100-arabeyes-1
  set encoding=utf-8

So you explicitly request utf-8. In fvwm the encoding is currently
determined automatically from the font and the locale.

> BTW, couldn't those methods (libiconv and others) be things that could
> be included during configure time ?

libiconv is already detected by configure, like any other non-portable thing.

> > Here the problems:
> > 
> > 1 - if we want to make some bidi conversion on a string we _must_
> > know the encoding used by the string. The only good general method I
> > found is to look at the font name of the used font and to extract
> > the charset/encoding from this name. There is a precise specification
> > here from the X Consortium, the last two "items" of a font name should
> > be the CHARSET_REGISTRY and the CHARSET_ENCODING. Then, if you use
> > a font which does not respect this standard no conversion will be
> > done. If you use a font which use a strange charset name as say
> > "my_arabic-36" we may "fix" fvwm so that fvwm understand this
> > name in one second (if it correspond to an encoding that fribidi
> > support). This may become configurable in a special way in the
> > future. A simple alternative would be to add a new fvwm Style
> > (and new module config options) to specify the charset of the
> > font ... (basically xterm use this method with the  -u8 option),
> > but I do not like so much this idea. I prefer that the font
> > define the encoding used by the user.

I too think that the font should define the encoding used by the user
(together with the locale if the font information is not enough), although
it may sound strange. The logic is that a user would not choose a font
that is unsuitable for the text he wants to display.
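
To illustrate the font-name rule quoted above, here is a rough sketch in
Python (not fvwm code) of pulling CHARSET_REGISTRY and CHARSET_ENCODING out
of a full XLFD name. The function name xlfd_charset is made up for this
example, and it assumes a complete 14-field name with no wildcards or
aliases; anything else is reported as unknown.

```python
def xlfd_charset(font_name):
    """Return 'registry-encoding' from a full XLFD font name, or None."""
    fields = font_name.split("-")
    # A full XLFD name starts with '-' and has 14 fields, so splitting
    # on '-' gives 15 parts, the first of which is empty.
    if len(fields) != 15 or fields[0] != "":
        return None
    registry, encoding = fields[-2], fields[-1]
    if not registry or not encoding:
        return None
    return f"{registry}-{encoding}".lower()


if __name__ == "__main__":
    name = "-misc-fixed-medium-r-normal--20-200-75-75-c-100-arabeyes-1"
    print(xlfd_charset(name))
```

A font that does not follow the naming standard yields None here, which
matches the "no conversion will be done" behaviour described in the quote.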

An alternative would be to require an encoding to be specified together
with every string, or to add an Encoding option. But in the end we would
find that a user usually specifies Encoding and Font in the same place (for
example in MenuStyle, or as in .vimrc), so there is duplication.

> A couple of things -
> 
> a. You can run Bidi on any string irrespective of its encoding and
>    it ought to work (only encodings that it cares about will be
>    touched - namely 8859-6 and 8859-8) - so, unless I'm missing a
>    detail, I don't see why fvwm needs to know the encoding of the
>    strings at all (except in the case where it needs to display
>    'em, of course :-)

Because a string such as "ÓáÑ" should be Bidi'd only if its encoding is
iso8859-6, not if it is koi8-r. Even vim has to know the string encoding
(you specified utf-8).
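
A small Python sketch of the same point, assuming the three characters
above stand for the raw bytes 0xD3 0xE1 0xD1 (their Latin-1 values). It
decodes those bytes under both encodings and looks up the Unicode
bidirectional class of each resulting character; the helper name
bidi_classes is invented for this example.

```python
import unicodedata

# Assumed raw bytes behind the three characters in the example string.
RAW = bytes([0xD3, 0xE1, 0xD1])


def bidi_classes(data, encoding):
    """Decode data and return the Unicode bidi class of each character."""
    text = data.decode(encoding)
    return [unicodedata.bidirectional(ch) for ch in text]


if __name__ == "__main__":
    # 'AL' marks right-to-left Arabic letters, 'L' marks left-to-right text.
    print("iso8859-6:", bidi_classes(RAW, "iso8859_6"))
    print("koi8-r:   ", bidi_classes(RAW, "koi8_r"))
```

The same bytes are right-to-left Arabic letters in one encoding and
left-to-right Cyrillic letters in the other, so the bytes alone cannot say
whether reordering is wanted.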

fvwm also has to translate the charset names used by locales into the ones
fribidi_parse_charset knows about. For example, Solaris uses "8859-6" and
hpux uses "iso86", but fribidi_parse_charset knows only about "iso8859-6";
at least we can't risk assuming otherwise. It also seems that fribidi does
not know about the iso8859-6.8x charset, although fvwm supports it. If you
have any information on the subject (fribidi or iso8859-6.8x), please
share it.
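
The translation step could look roughly like the following Python sketch.
The table and the function name canonical_charset are hypothetical; only
the three spellings come from this mail, and a real table would need an
entry for every platform spelling we meet.

```python
# Hypothetical alias table: platform spelling -> name passed to fribidi.
CHARSET_ALIASES = {
    "8859-6": "iso8859-6",     # spelling reported for Solaris
    "iso86": "iso8859-6",      # spelling reported for hpux
    "iso8859-6": "iso8859-6",  # already in the expected form
}


def canonical_charset(locale_charset):
    """Map a locale charset name to the expected spelling, or None."""
    key = locale_charset.strip().lower()
    return CHARSET_ALIASES.get(key)


if __name__ == "__main__":
    for name in ("8859-6", "iso86", "ISO8859-6", "koi8-r"):
        print(name, "->", canonical_charset(name))
```

Returning None for an unknown name keeps the safe default: no bidi
conversion unless the charset is positively recognized.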

> b. I think we are discussing an optimization regarding the loading of
>    the fonts, no ? Meaning, why not simply load all the fonts a user
>    specifies even if it were a 10646 with english, russian, arabic,
>    japanese, hebrew, chinese, etc when a user specifies "all" (or
>    similar somehow within fvwmrc file)
> 
> > 2 - After fvwm have determined the charset and load the font it should
> > display the string (forget bidi here). With an unibyte font (as iso8859-X)
> > there are no problems. Why do you not use iso8859-6 fonts with fvwm?
> > Do you need others characters than ASCII and Arabic one's?
> 
> Yup, Arabic requires Form-B glyphs rendering iso8859-6 useless for display.
> 
> In brief (beyond the realm of fvwm really, but you asked :-), Arabic has
> complexities which few other languages have in which characters are
> displayed differently based on their location within the word and sentence.
> All files are stored with 8859-6 encodings, but for those applications
> which don't use an external display engine (and most don't :-) one is
> required to get the extra display glyphs from somewhere else (Unicode's
> Form-B -- http://www.unicode.org/charts/PDF/UFE70.pdf).

It is good to know that iso8859-6 is fully capable of storing Arabic
without loss; I only knew that the iso8859-6 glyphs are not enough for
display.
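
The gap between storage and display can be seen in the Unicode data
itself. This is only an illustrative Python sketch, not how fvwm or
fribidi do shaping: it scans the Presentation Forms-B block (U+FE70 to
U+FEFF) for the glyphs that decompose to one stored letter. The function
name presentation_forms is invented for the example.

```python
import unicodedata


def presentation_forms(base):
    """Map form name -> Forms-B glyph for a single stored Arabic letter."""
    forms = {}
    target = f"{ord(base):04X}"
    for cp in range(0xFE70, 0xFF00):
        parts = unicodedata.decomposition(chr(cp)).split()
        # Single-letter entries look like ['<initial>', '0633']; ligatures
        # have more parts and unassigned code points have none.
        if len(parts) == 2 and parts[1] == target:
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    seen = b"\xd3".decode("iso8859_6")  # one stored byte, one letter
    for form, glyph in sorted(presentation_forms(seen).items()):
        print(form, f"U+{ord(glyph):04X}")
```

One stored letter maps to up to four display glyphs, chosen by its
position in the word, which is why a font limited to the iso8859-6
repertoire is not enough.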

> > With a multibyte font (as iso10646) locale is important, as fvwm use a
> > (powerful) method from X to display string and load font. Good fonts
> > are automatically loaded (no need to specify the charset) and string
> > are well displayed without the need to think to much :o)
> > So with iso10646, at the present time, you should use an utf8
> > locale (or good ttf font and XFree-4.good_version, here yet an
> > other method is used).
> > 
> > Now as utf8 grow in importance and a lot of system does not support
> > utf8 locale I will see if fvwm can support utf8 on system which does
> > not support utf8 locale. You may be happy with this but you should
> > wait a bit. Ooops, this works now on my machine but I cannot commit
> > the change before Monday, this should work on any X server (I have
> > added a 4th method to display string ...).
> 
> OK, I think you are saying that with 10646 fonts the locale will
> specify which subsections will be loaded and ought to be used as
> the charsets to display within fvwm, right ?  What will a person
> need to do in order to display say 4 languages (english, arabic,
> japanese, russian) all within the same the 10646 font file - a
> single locale won't suffice or will it ?  Seems like a more dynamic
> solution ought to be sought.

I am not an authority, but I think that the locale has nothing to do with
displaying different languages at once; it only defines the natural
language and other local properties that a user wants to pass to his
applications. For example, with ru_RU.utf-8 /bin/date writes in Russian and
with ar_JO.utf-8 in Arabic, yet both outputs use the same unicode font.

I don't think the locale is used anywhere for font loading optimizations,
although it could be.

> Before I got this email, I was thinking more along a way to specify
> the usage of utf8 within fvwm2rc (or fvwmrc :-); something along the
> lines,
> 
>   Style * Charset utf8                // meaning all of 'em (*utf8* :-)
>    -or-
>   Style * Charset english:arabic:sv.UTF-8:iso8859-7
> 
> Just an idea.

The first line is similar to the Encoding option I mentioned above,
but I don't understand what exactly the second line should do.
I see natural language names, locale names and charset names there. :)

Regards,
Mikhael.
--
Visit the official FVWM web page at <URL:http://www.fvwm.org/>.
To unsubscribe from the list, send "unsubscribe fvwm-workers" in the
body of a message to [EMAIL PROTECTED]
To report problems, send mail to [EMAIL PROTECTED]
