On 18 Dec 2002 18:57:32 -0800, Nadim Shaikli wrote:
>
> > > With regard to your 'naskhi' font - if it contains the required
> > > Form-B glyphs (U+FE70 - U+FEFF), then the following ought to work,
> > >
> > >   Style Arabic Font *-naskhi-medium-*-iso8859-6/iso10646-1
> >
> > If it is iso8859-6 font (not unicode), it can't be promoted to iso10646-1.
>
> iso8859-6 is a subset of iso10646-1 -- again, iso8859-6 alone is
> simply not usable; its visually incorrect without shaping and one
> is not able to shape sans Form-B glyphs (there are a plethora of
> posts regarding this topic on the 'net - I can certainly send you
> the links if you like so as not to go on a tangent on this forum).

I know the theory well now. But you miss my point. I want everything to
work, including iso8859-6. You can't deny that there is such a charset
(and encoding) and, as you said, it loses nothing for Arabic. So if a user
requests iso8859-6 fonts (without non-iso8859-6 characters, of course), he
doesn't want to see question marks for valid iso8859-6 characters.

Don't worry about this, I may later fix one-byte Arabic charsets myself.
Or not fix them, if you are against supporting all existing Arabic fonts. :)

> > Nadim, you seem to imply that the only valid way to write Arabic is
> > unicode. But this is not correct. Here is a valid Arabic that is not
> > unicode: env LANG=ar_JO.iso8859-6 date
>
> I don't have any Arabic locale on this machine - sorry. But I do
> indeed imply and state that Arabic should be used with UTF-8 and
> nothing else (not even CP-1256 :-) I'm actually curious to why
> fvwm doesn't simply default to UTF-8 at all times ?

If you use iso10646-1 fonts, it defaults to utf-8, doesn't it?

> > We supported all iso encodings. I see no valid reason to stop to support
> > iso8859-6. I think the problem is that once shaping is applied fribidi
> > (or is it iconv?) can't go back to iso8859-6 and uses question marks then,
> > so we should only apply shaping for unicode encoding of original strings.
>
> I don't think its a question of support. Fvwm is doing the right
> thing. I view this as "faulty/missing font" issue. The font file
> you were using simply doesn't have the _required_ Form-B glyphs and
> thus Arabic can't be displayed properly. Its like wanting to display
> chinese without having the correct chinese glyphs and getting question
> marks instead.

What you are saying is that all existing CP1256 and iso8859-6 one-byte
fonts should show question marks and never the Arabic glyphs that they
contain. I don't know; it is not hard to fix this situation.

> Out of curiosity, how do
>   'env LANG=ar_JO.iso8859-6 date' and
>   'env LANG=ar_JO.UTF-8 date' differ ?

The first returns regular one-byte Arabic and English characters, 40 bytes
in total. The second returns the same text, but in utf-8, 56 bytes.
Only the ascii characters (the first 128) are the same in both encodings.
Try:

  cat Arabic+English-utf8-encoded-file | iconv -f utf-8 -t iso8859-6

By the way, FVWM supports the CP-1256 encoding without problems, as far
as I can see, when I set a CP-1256 encoded title using:

  env LANG=ar_JO.iso8859-6 date | iconv -f iso8859-6 -t cp1256

I even see it correctly shaped (I think) if I use a unicode font like:

  Style Arabic Font StringEncoding=CP1256:*-arabeyes-*/iso10646-1

The reason arabeyes is not recognized as iso10646-1 is bugs in this font.

> iso8859-6 is an code-table representation (ie. an assignment of
> integer numbers to characters) where-as UTF-8 is a representation
> format (sequence of bytes). So I'm not sure what you mean above
> by "support all iso encodings".

Actually, here I meant more "support all charsets, i.e. fonts".

> In other words, my ability to do
> StringEncoding=iso8859-6 and StringEncoding=UTF-8 seems a bit like
> comparing apples to oranges.

No, they are not apples and oranges. iso8859-6 is both a charset and an
encoding, like any other one-byte charset, where one character is one byte.
There are iso8859-6 encoded texts (short) and utf-8 encoded texts (which
use more bytes). It is always possible to convert such text in one
direction, but not always in the other. It is always possible to convert
between iso8859-6 and cp1256 texts, except for maybe some funky chars,
I guess.

> I can understand the following encodings
> UTF-8, USC-2, USC-4 and UTF-16, but don't quite understand a setting
> akin to 'StringEncoding=iso8859-6' (unless fvwm is mapping names to
> encodings which is what I thought it did - "convenience magic").

If you have text stored in some encoding (iso8859-6 or cp1256), you may
find it useful to be able to convert it to something else, like utf-8, to
use with unicode fonts. FVWM allows this using StringEncoding.
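
As a rough illustration of the byte counts and the one-way conversion
described above, here is a minimal Python sketch. It is not FVWM code: the
sample string is a hypothetical stand-in for the `date` output, and the
codec names are Python's own, not StringEncoding values.

```python
# Hypothetical sample text: four Arabic letters, a space, four ASCII letters.
text = "\u0633\u0644\u0627\u0645 FVWM"

iso = text.encode("iso8859_6")   # one byte per character
utf8 = text.encode("utf-8")      # two bytes per Arabic letter, one per ASCII

print(len(iso), len(utf8))       # prints: 9 13

# iso8859-6 -> utf-8 works for any valid iso8859-6 text.
assert iso.decode("iso8859_6").encode("utf-8") == utf8

# iso8859-6 -> cp1256 also works for these ordinary Arabic letters.
cp1256 = iso.decode("iso8859_6").encode("cp1256")

# A shaped glyph from the Form-B range (U+FE70 - U+FEFF) has no slot in
# iso8859-6, so converting back must substitute something, here '?'.
shaped = "\ufeb3"                # ARABIC LETTER SEEN INITIAL FORM
back = shaped.encode("iso8859_6", errors="replace")
print(back)                      # prints: b'?'
```

The last step is the same kind of substitution that would produce question
marks if a string is shaped first and then converted back to a one-byte
charset.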

Out of curiosity, do you have Arabic text files? Are they all in one or
another unicode encoding? I read literature in several languages, but I
have yet to encounter utf-8 text. If there are one-byte encodings and
there is only one language (apart from English), unicode is a waste of
space. :)

Of course, to see one-byte encoded text, you should change the
"set encoding=utf-8" line that I know you have in your .vimrc.

Regards,
Mikhael.