from:"Juliusz Chroboczek"

Re: [I18n] Keyboard support for Sorbian

2004-03-23 Thread Juliusz Chroboczek

EW> I have written an xkb file for Upper and Lower Sorbian

Wow.

(bugzilla.xfree86.org.)

Juliusz
___
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n

Re: [I18n] UTF-8 ICCCM properties

2004-03-23 Thread Juliusz Chroboczek

MK>  WM_CLIENT_MACHINE
MK>  WM_ICON_NAME
MK>  WM_NAME

These are properties, not selections.  They are not polymorphic, and
cannot be negotiated.

MK> The only alternatives I could think of are all a bit ugly, such as
MK> adding

MK>   WM_CLIENT_MACHINE_UTF8
MK>   WM_ICON_NAME_UTF8
MK>   WM_NAME_UTF8

MK> all of type UTF8_STRING.

This is exactly what the freedesktop.org WM specification mandates.

  http://freedesktop.org/Standards/wm-spec/1.3/

See for example the property _NET_WM_NAME.

Please do not (unilaterally) define Yet Another Convention.

Juliusz

Re: [I18n] Unicode keysym questions

2004-03-08 Thread Juliusz Chroboczek

MK> The X.Org Foundation has given me access to their CVS just last week to
MK> amend the X11 protocol specification and to make this convention
MK> official. I was on a phone conference with them last Monday and they all
MK> agreed that adding the 0x01000000 convention to the standard would be
MK> most sensible.

Congratulations, Markus.

Juliusz

Re: [I18n] Unicode keysym questions

2004-03-06 Thread Juliusz Chroboczek

AK> while trying to develop a keymap which includes mathematical symbols, I am
AK> wondering about the exact status of the "UCS keysyms" 0x01000000 and
AK> above... Are these already standardized? 

They are under discussion at X.Org.

AK> Do any X servers except XF86 currently use them?

You can use them with any X server if you configure your modmap/xkb
right.  However, I believe that the stock XFree86 server is the only
server that will generate them in a default configuration.

AK> And... how exactly should they be interpreted by clients? Should
AK> there be any difference between for example "eacute" and "U00E9"?

(I assume you mean keysym 0x010000E9 by the latter.)

Only the former should be used.  The Unicode keysyms should only be
used for symbols that don't otherwise have a keysym assignment.

However, your application should be robust and accept the two keysyms
as synonymous.  This is especially true because of the messup with
Vietnamese in XFree86 4.4.

AK> Should a client interpret a U001B as an escape keystroke,

No server should ever generate keysym 0x0100001B.  All bets are off if
it does.

As to your application, a reasonable thing to do is to beep when it
sees keysym 0x0100001B.  On the other hand, it would be a good idea to
be robust on keysyms such as 0x010000E9, as such keysyms are very
likely to be generated by servers with broken configs (as you've
noted).
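
A minimal sketch of the robust behaviour described above, assuming the
convention under discussion in which a Unicode keysym is 0x01000000
plus the code point.  The function name and the macro are hypothetical,
and only Latin-1 and Unicode keysyms are handled; other named keysyms
would need a real lookup table.

```c
#define UNICODE_KEYSYM_BASE 0x01000000UL   /* assumed convention */

/* Map a keysym to a Unicode code point, or return -1 if this sketch
   cannot.  Two cases are handled:
   - Latin-1 keysyms (0x20..0x7E, 0xA0..0xFF), equal to the code point;
   - Unicode keysyms, assumed to be 0x01000000 + code point. */
long
keysym_to_ucs(unsigned long keysym)
{
    if((keysym >= 0x20 && keysym <= 0x7E) ||
       (keysym >= 0xA0 && keysym <= 0xFF))
        return (long)keysym;

    if((keysym & 0xFF000000UL) == UNICODE_KEYSYM_BASE) {
        unsigned long ucs = keysym & 0x00FFFFFFUL;
        /* Control characters (such as ESC) and out-of-range values
           should never arrive as Unicode keysyms; reject them so the
           caller can beep. */
        if(ucs < 0x20 || (ucs >= 0x7F && ucs < 0xA0) || ucs > 0x10FFFFUL)
            return -1;
        return (long)ucs;
    }

    return -1;   /* other named keysyms are out of scope here */
}
```

With this, the named keysym eacute (0x00E9) and the Unicode keysym for
U+00E9 are treated as synonymous, while a Unicode keysym for U+001B is
rejected.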

AK> I also noticed that the Compose-Files of 4.3.0 in UTF-8 locales use the
AK> U keysyms even for characters that have old keysyms (all the accented
AK> latin-{12...}  chars).

If so, this is most definitely a bug.  Could you please file it?  (If
you want to put me in a CC, I'm [EMAIL PROTECTED] for the XFree86 bugzilla.)

Juliusz

Re: [I18n] patches to luit for (hkscs + gb18030) support

2004-01-07 Thread Juliusz Chroboczek

> -if(c >= 0) {
> +if(c != -1) {

Why's that?  Are you fixing an unrelated bug?

Other than that, I have no objection after a quick read.

Juliusz

Re: [I18n] mkfontscale (CVS) broken (?) for koi8-r and koi8-u encodings

2003-11-16 Thread Juliusz Chroboczek

MV> when mkfontscale tests a font for covering a koi8 encoding, it
MV> ignores the lack of linedrawing and pseudo-math characters in the
MV> font.

MV> But the lack of 0x00b2, SUPERSCRIPT TWO, is not ignored.

[...]

MV> The patch fixes that, plus adds one more koi8 variant and corrects
MV> a typo in comment.

I agree (except for the ``typo'' in the comment).  Egbert, please
apply.  (Message-ID: <[EMAIL PROTECTED]>)

Juliusz

Re: [I18n] A question on handling UTF-8

2003-09-04 Thread Juliusz Chroboczek

MH> I'm currently wrestling with handling UTF-8 text output.

Please consider switching to the Xft library.

MH> Creating a fontset and using 'Xutf8DrawString' to display a string with
MH> Western and Japanese Katakana works for me.

Good.

MH> o What is the current status of the utf8 functions?

They are part of XFree86 since 4.1, and are reliable since 4.2.

MH> o Are the UTF-8 functions available in a separate library for non-XFree86
MH>   systems?

I do not think so.  As they delve quite deeply into Xlib internals,
building such a library might be a lot of work.

MH>   (Are they, for example, related to a library named 'Xutf8' which
MH>   could be used as a fallback?)

Reference?
 
MH> o Why does [XmbDrawString] fail to display the Katakana when
MH> Xutf8DrawString succeeds, which seems to indicate that my fontset
MH> is correct?

Don't know.

Please let us know when you find out.

MH> o Does it fail because I need to update something? I.e. files in
MH>   '/usr/X11R6/lib/X11/locale/en_US.UTF-8'? I'm currently using the
MH>   XFree86 projects pre-compiled binaries of XFree86-4.3.0.

Which does include JISX0201.1976-0:GR in en_US.UTF-8.  Sorry, can't help you.

MH> o 'XOpenIM(display, NULL, NULL, NULL)' fails for LANG=en_US.UTF-8.

Yep.  That's expected brain-damage, IMs are locale-specific.

Juliusz

Re: [I18n] re: a new font encoding file for XF86 : gb18030.2000-2? (fwd)

2003-07-09 Thread Juliusz Chroboczek

YL> 1. To my knowledge the gb18030.2000-0 and gb18030.2000-1 encodings are
YL> invented by Sun and used in their Solaris 9. The only application on Linux
YL> that supports them is Mozilla (maybe Java1.4 as well?) at the request of
YL> Sun (see mozilla bug 72525). IMHO, if you want to extend the system to add
YL> such as gb18030.2000-2, it's probably a good idea to consult with Sun just
YL> so that it will be compatible with any potential Sun's own extension.

1. Sun implements GB18030.2000.
2. Mozilla implements GB18030.2000 for compatibility with Sun.
   Because Mozilla is cross-platform, the support finds itself on Linux.
3. XFree86 should implement GB18030.2000 for compatibility with
   Mozilla.

Interesting process.

Juliusz

P.S. Not that I care either way.

Re: [I18n] chinese locale slow Application startup speed.

2003-07-05 Thread Juliusz Chroboczek

LC> I port Tiny-X to XScale, I can show chinese
LC> When I export LC_ALL=zh_CN , some application startup speed is too slow, 
LC> but some application startup speed is OK.

Do your applications still use core fonts?

Please read the introduction to 

  hw/xfree86/doc/README.fonts 

keeping in mind that large (in number of glyphs) scalable core fonts
may cause the behaviour that you're seeing.

Juliusz

Re: [I18n] Unknow Font Foundry??

2003-01-27 Thread Juliusz Chroboczek

A> But these fonts I have successfully installed on both Mandrake and
A> Redhat. This, I try to install on Peanutlinux.

A> Any further suggestions what might be missing in X or etc.

There's nothing missing in X.

The fonts use non-standard foundry codes.  ttmkfdir and mkfontscale
will work fine, but they will give the fonts XLFD names starting with
`-misc-'.

You can use these fonts by using the `-misc-...' name, or else by hand
editing fonts.scale to insert a reasonable foundry name instead of
`-misc-'.  Neither the X server nor the Xft library will have any
problem using these fonts.

If you send me the Copyright string of these fonts together with the
foundry code, I'll modify mkfontscale to deal with them.

Juliusz

Re: [I18n] Unknow Font Foundry??

2003-01-27 Thread Juliusz Chroboczek

A> I tried to install Lao OT fonts and got the following error message.
 
A> (none):/usr/share/fonts/defaults/TrueType# ttmkfdir > fonts.scale
A> unknown font foundry code JG
A> unknown font foundry code LSW

The fonts use font foundry codes that were not registered with
Microsoft.  Both ttmkfdir and mkfontscale will generate XLFD foundry
names that are merely ``misc''; the fonts will still work.

Could you please use ttfdump and tell me what the ``Copyright'' string
in those fonts says?  You'll find ttfdump somewhere on ftp.freetype.org.

Juliusz

Re: [I18n]special characters like greek omega?

2003-01-02 Thread Juliusz Chroboczek

AT> How is the way to get an Omega printed in a terminal? (in X)
AT> My LANG is set to en_US. And on my keyboard map I defined alt+Z to
AT> 'print' Greek_omega.  But I get no output.

Run uxterm rather than xterm and switch to a UTF-8 locale (for example
en_US.UTF-8).


Juliusz


Re: [I18n]Re: Decent french keyboad layout in XFree86

2002-12-12 Thread Juliusz Chroboczek

Before your contribution can even be considered for inclusion, you
should put the definition file on the web somewhere, send the URL to
this list and ask for people to test it.  If you don't have web space
to spare (why?  what's wrong with free.fr?), you can also send it to
this list, although some people don't like submissions filling their
mailbox.

(Can anybody confirm the issues that Nicolas is describing ?  I use a
hacked US keyboard myself.)

(The idea of putting oe on the ^2 key is interesting, although I'd put
ESC there myself.)

Send it to fixes at xfree86.org with an explanation of what your patch
does.  Remember that the committers do not necessarily read this list,
and that most of them are not French speakers.

Then follow CVS, and complain loudly if the commit went wrong.

Finally, implement an all new keyboard definition framework and get it
to replace Xkb.

Juliusz

Re: Solution. was:[I18n]XFree86 Xutf8LookupString BUG with Solarix X server.

2002-12-03 Thread Juliusz Chroboczek

IP> And I still doubt is the ASCII characters a special case or they
IP> should be processed in the same way.

From your description only, it does indeed sound like there's no
reason to treat ASCII separately.  If you've got the time and
inclination to do so, you could wait until 4.3.0 is out, then submit a
patch that removes the special-casing and see whether any complaints
come in.

Juliusz

Re: [I18n]Missing locale?

2002-11-30 Thread Juliusz Chroboczek

BDS> If South Africa isn't part of Europe, do I still need Euro font
BDS> support?  I would actually like having Euro font support in X
BDS> (for, say, e-mails to other countries), but I thought it wasn't
BDS> necessary for these entries to exist to support that.  (I'm an
BDS> i18n newbie).

Depends on your mailer.  However, you probably don't want to convert
to using an ISO 8859-15 locale unless you're in the European Union.

In the long term, we hope to use UTF-8 locales everywhere, so you can
have both the Euro and the Tughrik symbol whatever your locale.

>> en_ZA.ISO-8859-15   en_ZA.ISO8859-15
>> en_ZA.iso885915 en_ZA.ISO8859-15
>> en_ZA.utf8  en_ZA.UTF-8

It really cannot harm to add these.

Juliusz

Re: Solution. was:[I18n]XFree86 Xutf8LookupString BUG with Solarix X server.

2002-11-29 Thread Juliusz Chroboczek

Wow.  Could you please explain what's going on here?

Juliusz

Re: [I18n]XFree86 Xutf8LookupString BUG with Solarix X server

2002-11-28 Thread Juliusz Chroboczek

'A knot!' said Alice.  'Oh, do let me help to undo it!'

Could you please put an xscope dump on the web somewhere ?

Thanks,

Juliusz

P.S. You'll find xscope on http://keithp.com/~keithp/download/ .

Re: [I18n]Syriac keyboard layout

2002-11-24 Thread Juliusz Chroboczek

PS> Only use named keysyms if they already are in keysymdef.h

Yes.

The converse is also true: do not use a Unicode keysym if there is a
named keysym.

Juliusz

Re: [I18n]Font encoding files for GB18030 font.

2002-11-24 Thread Juliusz Chroboczek

> STARTMAPPING cmap 3 4
> # the identity mapping
> ENDMAPPING

I believe this is wrong, and should be removed.

Juliusz

Re: [I18n]Syriac keyboard layout

2002-11-15 Thread Juliusz Chroboczek

ES> -- for example, Arabic-Indic numbering, plus, minus, Arabic shadda to
ES> name a few.

ES> Would it be safe to assume that those characters should be assigned a
ES> unicode codepoint instead of directly going to keysymdef.h?

No.

If a key has an ad hoc keysym definition, it *must* use it for
compatibility with legacy applications and libraries.

If a key has no ad hoc keysym definition and corresponds to a Unicode
codepoint, it *must* use the Unicode keysym for compatibility with
current XFree86 libraries.

Having multiple keysyms for a single character is where madness lies.

Juliusz

Re: [I18n]Locales and charset encodings

2002-11-12 Thread Juliusz Chroboczek

DW> Is there a definitive way to determine what character
DW> encoding XmbLookupString() is going to return text in?

You cannot.  That's the whole point of the CSI ideology.

(Code Set Independence.  n.  The notion that a client is not supposed
to know anything about the character encoding.  Implies that Arabic
doesn't exist.)

Actually, you can.  Use nl_langinfo(3), but it's an SUSv2 interface
(i.e. it's not in POSIX), and its use is frowned upon by true CSI
believers.
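
A minimal sketch of the nl_langinfo(3) approach, assuming a system
that provides <langinfo.h>.  The two function names are hypothetical;
the only library calls used are setlocale() (by the caller) and
nl_langinfo(CODESET).

```c
#include <langinfo.h>
#include <locale.h>
#include <string.h>

/* Name of the character encoding of the current LC_CTYPE locale,
   for example "UTF-8" or "ISO-8859-1".  The caller is expected to
   have called setlocale() first; the result is owned by the library
   and may be overwritten by later calls. */
const char *
locale_codeset(void)
{
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: 1 if the locale's encoding is named UTF-8,
   0 otherwise.  Spellings vary between systems, so two common ones
   are accepted. */
int
locale_is_utf8(void)
{
    const char *cs = locale_codeset();
    return strcmp(cs, "UTF-8") == 0 || strcmp(cs, "utf8") == 0;
}
```

A client would call setlocale(LC_CTYPE, "") once at startup and then
use locale_codeset() to decide how to interpret the bytes returned by
XmbLookupString().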

DW> Would there be a way to _force_ the encoding to, say 
DW> UTF8?

That's what the Xutf8* functions are designed for.  If you need a
different encoding than UTF-8, use Xutf8* in combination with
iconv(3).
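
The iconv(3) half of that combination can be sketched as follows.
This is an illustration only: the function name is hypothetical, the
Xutf8* call that would produce the input is left out, and a
byte-oriented target encoding (single NUL terminator) is assumed.

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Convert the NUL-terminated UTF-8 string `in` to the encoding named
   `to`, writing at most outsize-1 bytes plus a terminating NUL into
   `out`.  Returns the number of bytes written (excluding the NUL),
   or -1 on any error. */
long
utf8_to_encoding(const char *to, const char *in, char *out, size_t outsize)
{
    iconv_t cd;
    char *inp = (char *)in;     /* iconv() takes a non-const pointer */
    char *outp = out;
    size_t inleft, outleft;

    if(outsize == 0)
        return -1;

    cd = iconv_open(to, "UTF-8");
    if(cd == (iconv_t)-1)
        return -1;

    inleft = strlen(in);
    outleft = outsize - 1;      /* keep room for the terminator */

    if(iconv(cd, &inp, &inleft, &outp, &outleft) == (size_t)-1 ||
       /* flush any pending shift sequence of a stateful encoding */
       iconv(cd, NULL, NULL, &outp, &outleft) == (size_t)-1) {
        iconv_close(cd);
        return -1;
    }

    iconv_close(cd);
    *outp = '\0';
    return (long)(outp - out);
}
```

The text obtained from an Xutf8* function would be passed as `in`,
with `to` naming whatever encoding the rest of the application
expects, for instance "ISO-8859-1".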

Note that the use of Xutf8* is not portable, as they are only included
in the XFree86 libraries.  A very partial emulation of some Xutf8
functions for legacy implementations of X11 can be found in the
XFree86 tree in xc/programs/xterm/xutf8.c.  Please feel free to
distribute this file with your application (no strings attached,
although I'd expect you to contribute any improvements you make to it
back to XFree86).

Happy CSI hacking,

Juliusz

Re: [I18n]Xutf8LookupString and KeySyms > 0x010000000

2002-11-11 Thread Juliusz Chroboczek

>>> But my question is always unanswered.
>>> Will Xutf8LookupString supports Unicode KeySyms without strings

Good question.  I've always assumed it did, but I've really got no
idea.

JL> So can I modify the X11 Xutf8LookupString function ?

If it doesn't have this functionality yet, go for it, by all means.

Juliusz

Re: [I18n]Who maintains kinput2? I have a fix to apply.

2002-11-06 Thread Juliusz Chroboczek

I've forwarded that to Hideki Hiura, he may know what to do with it.

Juliusz

Re: [I18n]Default fonts for xterm

2002-08-26 Thread Juliusz Chroboczek


TK> So far, XTerm uses *-iso8859-1 fonts as default.  Though the
TK> default settings have been good, I think it is not always good now
TK> or in future, because current XTerm supports UTF-8 mode and it can
TK> activate UTF-8 mode automatically.

Currently, there are two defaults files for XTerm, XTerm.ad and
UXterm.ad.  XTerm's class name depends on whether or not it was
invoked as uxterm.

Shouldn't we consider automatically switching XTerm's class depending
on the mode?

Juliusz


Re: [I18n](no subject)

2002-08-12 Thread Juliusz Chroboczek


ZL> Under X, you may want to follow XIM (the mechanism you are asking)
ZL> to implement a input method server.

No, you don't.  XIM is a morass of complexity which should only be
used for scripts that need it.  Avoid its use for simple alphabetic
scripts.

Juliusz


Re: [I18n]xterm i18n patch

2002-08-12 Thread Juliusz Chroboczek


TK> I found another i18n patch for xterm (X.org xterm, not XFree86's one).

TK> http://www-106.ibm.com/developerworks/linux/library/l-ltc27.html

Interesting.  It does not look like an XTerm variant at all -- it's a
VT100 terminal emulator (not based on XTerm's emulator) packaged as a
library, that comes with two sample frontends (Xaw and FB).

It looks infinitely cleaner than XTerm, but less featured and based on
the C locale model (``internally CSI'', which you already know my
opinion about).  I'll have a closer look later.

Beware, though, it doesn't come under any of the licenses we're used
to (``IBM Common Public License Version 0.5'' -- is the license still
in beta?).

Juliusz


Re: [Fonts] [I18n] language tags in fontconfig

2002-08-12 Thread Juliusz Chroboczek


KP> While I've never seen ñ in my limited exposure to French,

Neither have I (with over 20 years exposure to the language).  I
suggest you remove it -- it's not in the Adobe Standard encoding, so
some fonts may lack it.

KP> The only questionable thing I believe I've done is to eliminate the OE 
KP> ligatures and Y with diaeresis from the French list -- those aren't in 
KP> Latin 1, and I wanted to permit Latin-1 fonts to be marked as supporting 
KP> French.

The oe ligatures are needed, and they are in both Adobe Standard and
CP 1252 -- hence, both Type 1 and TrueType fonts should contain them;
I suggest you add them back.  Never mind Y-dieresis, on the other hand:
I can only think of one place name with a y-dieresis.

Juliusz

Re: [I18n]Displaying chinese in an xterm

2002-08-03 Thread Juliusz Chroboczek


BB>  xterm -u8 -fn \
BB>  -misc-fixed-medium-r-normal--16-120-75-75-c-0-unicode-0 \
BB>  -e luit -g2 'GB 2312'

BB>  the xterm will display chinese but the xterm is somewhat insane.
BB>  the text is spread out to a follows

BB>   should be:->test
BB>   is   :->t  e   s  t

That typically happens when you use fonts that have f*cked up metrics.
As Tomohiro suggested, please use correct fonts.

Juliusz


Re: [I18n]Test implementation of luit extension

2002-07-04 Thread Juliusz Chroboczek


TK> I wrote a new patch by following your advice.

That's really cool; thank you so much.

Unfortunately, I do not have time to review it right now; I know how
frustrating it is to get no feedback, but please be patient for a few
days.

Juliusz

Re: [I18n]Test implementation of luit extension

2002-06-29 Thread Juliusz Chroboczek


TK> You mean, is->other should be a fifth slot (G0, G1, G2, G3, and other)?

Yes.  is->other should be a pointer to an Other charset.

TK> And, I don't know how 'return to previous charset' escape sequence.
TK> Does ISO-2022 or luit have save/load or push/pop behavior on charsets?

Here's my understanding of the ISO 2022 ``other'' model.

If is->other is NULL, we're doing ISO 2022.  An escape sequence may
then invoke a non-ISO 2022 charset, in which case is->other becomes
non-NULL.

In non-ISO 2022 mode, ISO 2022 sequences are invalid; however, there
may be an escape sequence called ``return to previous'', which is
simply implemented as setting is->other to NULL.

As you see, there's no need for a stack of charsets, a single
is->other field is enough.
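
The single-field model can be sketched in a few lines.  All type and
function names below are hypothetical and do not reflect luit's actual
structures; only the is->other field comes from the discussion above.

```c
#include <stddef.h>

/* Hypothetical description of a non-ISO 2022 ("other") charset. */
typedef struct OtherCharset {
    const char *name;
} OtherCharset;

/* Hypothetical parser state; the G0..G3 slots are omitted. */
typedef struct Iso2022State {
    const OtherCharset *other;   /* NULL while doing ISO 2022 */
} Iso2022State;

/* An escape sequence invoked a non-ISO 2022 charset. */
void
invoke_other(Iso2022State *is, const OtherCharset *cs)
{
    is->other = cs;
}

/* The "return to previous" escape sequence: back to ISO 2022. */
void
return_to_previous(Iso2022State *is)
{
    is->other = NULL;
}

/* The parser skips all normal ISO 2022 processing unless this holds. */
int
in_iso2022_mode(const Iso2022State *is)
{
    return is->other == NULL;
}
```

Because the G0..G3 slots are never touched while is->other is set,
clearing the pointer restores the previous ISO 2022 state without any
stack.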

TK> Do you think ISO-2022 escape sequences and control characters should
TK> be valid in GBK and other non-ISO-2022 encodings?  (I think both way
TK> have their merits and demerits and I don't have strong opinion on
TK> which is better.)

Only insofar as is needed for Emacs compatibility, IMHO.

TK> On what purpose ISO-2022 parser should be reentrant?

Hack value.

Juliusz

Re: [I18n]Test implementation of luit extension

2002-06-25 Thread Juliusz Chroboczek


Nice one.  Thanks a lot.

A few minor objections, though.

First, the right term is not ``nonfontenc'' -- you do use fontenc in
the GBK code.  The right term should be ``other'', following ISO 2022
terminology.

I don't agree with the way you reuse the Ck slot for an ``other''
encoding; the way you do it, you won't be able to implement the
``return to previous charset'' escape sequence.  Instead, you should
add a new slot to the ISO 2022 state structure, say is->other.
Initially, is->other is NULL; when an ``other'' charset is invoked,
is->other points at said charset.  The ISO 2022 parser checks
is->other, and skips all normal processing if it is non-NULL.

Finally, I object to the stacks being static.  I've gone to quite a
bit of effort to make the ISO 2022 parser reentrant (there's nothing
that prevents multiple ISO 2022 parsers being used simultaneously),
and that means that global variables should be avoided.  You should
put the stack in the ISO 2022 state, and allocate/free it on demand.

Thanks again,

Juliusz

Re: [I18n]FIRSTINDEX missing

2002-06-25 Thread Juliusz Chroboczek


JS>  Thank you for the kind explanation. I'll refer to your answer to
JS> make my case for keeping FIRSTINDEX intact font encoding files in
JS> RedHat 7.x stronger.

Note that the next version of RedHat will likely use mkfontscale
rather than their hacked version of ttmkfdir, so the problem should go
away.

(For the record, the problem with FIRSTINDEX is not due to Joerg's
ttmkfdir, but rather to RedHat's specific version thereof.  Apparently,
the RH maintainers chose to remove all FIRSTINDEX entries rather
than fixing their bug.)

Juliusz

Re: [I18n]FIRSTINDEX missing

2002-06-24 Thread Juliusz Chroboczek


JS> For jisx0212.1990-0.enc and gb2312.1980-0.enc, FIRSTINDEX should
JS> be '0x20 0x20'. For gbk-0.enc, it appears to have to be
JS> '0x81 0x40'.

I'd rather you sent the patch -- this way, you'll become the contact
person if something's wrong.  You're doubtless better qualified than I
am to do that.

>> (FIRSTINDEX lines should *not* be included in linear encodings.)

JS>   Is it a requirement in matrix encodings or optional?

It's optional but strongly recommended.

Including a FIRSTINDEX entry has two effects, one negative, one
positive:

  - generation of the default glyph is suppressed;
  - the bounds of the encodings are passed to clients.

The reason for the first point is that we always put the default glyph
at (0,0).

The first point is why FIRSTINDEX should not be used with linear
encodings.  The second point vastly improves performance for matrix
encodings (by avoiding the sending of 0x2020 empty metrics) and makes
some (rare) applications change behaviour (e.g. xfd will directly send
you to the right page).

Juliusz


[I18n]FIRSTINDEX missing

2002-06-21 Thread Juliusz Chroboczek


Hi,

While testing the FreeType 2 backend the other day, I noticed that
quite a few of the matrix encodings in fonts/encodings/large still
fail to include a FIRSTINDEX line.  I'm wondering if somebody
competent could fix that.

(FIRSTINDEX lines should *not* be included in linear encodings.)

Juliusz


Re: [I18n]bug in gbk-0.enc.gz

2002-06-18 Thread Juliusz Chroboczek


TK> XFree86's table has additional codepoints to U+E7xx and U+E8xx,
TK> which CP936 does not have.  I don't know how to handle these
TK> codepoints.  (left unremoved?)

I suggest going ahead and removing them.  If somebody complains, we'll
know what they are for.

Juliusz


Re: [I18n][call for comments] XTerm patch to invoke luit

2002-06-11 Thread Juliusz Chroboczek


TK> I have to study about your suggestion and how to use
TK> XtAppAddConverter.

Don't bother, then.  Just encapsulate the parsing into a separate
function (the code is already spaghetti-like enough).

>> Why do you copy the argument into locale_string, rather than directly
>> doing a strcasecmp on the argument?

TK> I thought strcasecmp was not portable...

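#include <ctype.h>

/* Returns 1 if a and b are equal ignoring case, 0 otherwise.
   Note that this is the opposite convention from strcasecmp. */
int
my_strcasecmp(const char *a, const char *b)
{
    while(*a && *b) {
        if(tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* equal only if both strings ended at the same time */
    return (!*a && !*b);
}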

No need to do a copy (which I find confusing).

Juliusz

  

Re: [I18n][call for comments] XTerm patch to invoke luit

2002-06-06 Thread Juliusz Chroboczek


Nice work.  Just a few minor comments.

I would suggest modularising the parsing of the argument.  The
officially sanctioned way is to define a converter, say
CvtStringToTristate, and register it with Xt.  See lib/Xt/Converters.c
and man XtAppAddConverter(3x).  Or modularise it by hand (just putting
it in a separate function).
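
Modularising by hand might look roughly like the sketch below.  The
enum, the function name and the accepted spellings ("true", "false",
"medium") are assumptions made for illustration, not the names used in
the patch under review, and strcasecmp from <strings.h> is assumed to
be available.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical three-valued setting plus an error value. */
typedef enum {
    TRI_FALSE,
    TRI_TRUE,
    TRI_MEDIUM,
    TRI_INVALID
} Tristate;

/* Parse the argument directly, without copying it first. */
Tristate
parse_tristate(const char *arg)
{
    if(arg == NULL)
        return TRI_INVALID;
    if(strcasecmp(arg, "true") == 0)
        return TRI_TRUE;
    if(strcasecmp(arg, "false") == 0)
        return TRI_FALSE;
    if(strcasecmp(arg, "medium") == 0)
        return TRI_MEDIUM;
    return TRI_INVALID;
}
```

Keeping the parsing in one function means the caller only ever sees a
Tristate value, whichever way the string comparison ends up being
implemented.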

Why do you copy the argument into locale_string, rather than directly
doing a strcasecmp on the argument?

It looks like falling back to non-luit operation when luit fails is
implemented -- good.

Other than the above, I don't have any objections right now.

Juliusz


Re: [I18n][call for comments] XTerm patch to invoke luit

2002-06-06 Thread Juliusz Chroboczek


Cool.  Some more reading ;-)

TK>   1. conventional 8bit
TK>  (8bit encodings are supported by changing fonts)
TK>   2. UTF-8
TK>  (UTF-8 is supported)
TK>   3. UTF-8 with luit
TK>  (various encodings are supported)

XTerm, as you know, can be configured to use either Render fonts or
core fonts.  For the foreseeable future, core fonts still dominate.

The core fonts system has significant performance problems with
Unicode fonts.  Thus, when using core fonts, XTerm in 8-bit mode is
vastly faster and uses less resources than in UTF-8 mode.

I would therefore strongly recommend that ``medium'' mode should be
the default when using core fonts.  Feel free to make ``true'' the
default when using Render fonts.

Juliusz

P.S. And please remember to bully Thomas into changing the XTerm
terminfo entry to use VT 220-style AC.

Re: [I18n]Please do not use en_US.UTF-8 outside the US

2002-05-06 Thread Juliusz Chroboczek


JS>   I had to make up ko_KR.UTF-8 different from en_US.UTF-8 to make my
JS> transition to ko_KR.UTF-8 work as I intended.

Fair point.

Of course, the long-term solution is to use font technologies that do
language-dependent and contextual font and glyph substitution.
Client- or server-side.

Juliusz


Re: [I18n]Please do not use en_US.UTF-8 outside the US

2002-05-04 Thread Juliusz Chroboczek


>> Given that en_US.UTF-8 is the only instance of a locale file with UTF-8
>> in its name,

MK> In Xlib, yes. This should be extended, I think.

Please check the locale.dir file, which remaps all known UTF-8 locales
to use the data from en_US.UTF-8.

I agree with you that there is a bug: the directory en_US.UTF-8 should
be simply called UTF-8, as it is the common locale directory for all
UTF-8 locales.

Juliusz

Re: [I18n]gbk-0.enc patch

2002-03-22 Thread Juliusz Chroboczek



WKP>  STARTMAPPING unicode
WKP> -UNDEFINE 0 0x
WKP> +UNDEFINE 0 0xFEFE

This doesn't make sense to me.  What is it you're trying to achieve?

Juliusz

Re: [I18n]'xterm' terminfo definition conflicts with EUC encodings

2002-03-20 Thread Juliusz Chroboczek


TK> It defines alternate character set-related capabilities like:

TK> enacs=\E)0
TK> smacs=^N
TK> rmacs=^O

TK> which are indeed using G1 of ISO-2022.  This works well for 8bit
TK> encodings which use G0 and G2.  However, G1 is used in several
TK> encodings such as EUC-JP and EUC-KR.  Thus, using these terminfo
TK> capabilities will overwrite the initial settings of G1 in these
TK> encodings and break proper displaying.  This is a real problem in
TK> xterm+luit, though it doesn't matter for legacy xterm because it
TK> doesn't support EUC encodings.

This is a known issue.  Note that it is not peculiar to luit: the same
is true of kterm, which is why the kterm terminfo does not include ACS
capabilities.

TK> The root of this problem is the definition of terminfo capabilities
TK> of enacs, smacs, and rmacs which conflicts with EUC encodings.
TK> How can we change the definition?

Very simple.  We need to change the definition of xterm to use VT 220-style
ACS.  Namely

  :smacs=\E(0:rmacs=\E(B:

(enacs is not needed).  This will work for all locales that put ASCII
in G0 and point GL at G0 -- which includes all locales used on modern
Unices.

Additionally, we need a new definition for ``xterm with no ACS'' for
encodings that do not satisfy the above condition (e.g. ISO-2022-JP).
This is quite simple to add:

xterm-nacs|xterm without alternate characters:\
smacs@:rmacs@:enacs@:tc=xterm:

I have been arguing for these changes to be made to ncurses for months
(since the first releases of luit), but Thomas is afraid of what they
might do to backwards compatibility.  I would suggest that you take it
up with him.

Regards,

Juliusz

Re: [I18n]Add the LA symbols

2002-03-07 Thread Juliusz Chroboczek


MD> I have modified the es symbols file to reflect the latin-american
MD> keyboard layout. Is it already on the latest XFree86 release?

  http://www.xfree86.org/cvs/
  http://cvsweb.xfree86.org/cvsweb/xc/programs/xkbcomp/

MD> If not, to whom do I send it to get it committed?

  fixes at xfree86.org

        Juliusz Chroboczek

Re: [I18n]So, will Bidi+Xterm happen ?

2002-03-01 Thread Juliusz Chroboczek


NS>  - What are your comments on mlterm, patch27, biditext (have you
NS>  used 'em) ?

I haven't, as I don't speak any Semitic language.  Which is exactly
why I cannot form an opinion before I see a description of what
they are supposed to do.

NS>  - Irrespective of various minor issues, what are they missing ?

Documentation.

You see, presenting a static text that is encoded in some implicit
directionality Hebrew encoding is not difficult.  The difficulties
arise when you start dealing with cursor movement, multi-column text,
etc.  I would like to learn how all of that should be done in the
presence of bidi text, and none of the bidi terminal emulator projects
sketched above seem to deal with the issue.

I've been reiterating the same point: while enough people ask for
terminal-level bidi support to make it appear as a desirable feature,
nobody has given an explanation of how terminal-level bidi is supposed
to work.

NS> 'cat' works great on mlterm as does 'more',

Neither `cat' nor `more' requires a video terminal -- they work fine on
hardcopy devices.  The very minimum that should work are `less' and
`readline'.  But there are further issues: what happens with
full-screen applications -- does the use of (terminfo) ti/te disable
bidi support?  If so, which sequences exactly have this effect?  Does
usage of a bidi-enabled terminal require the use of a specific
terminfo/termcap entry?

These are the most basic questions -- and nobody has answered them
yet.  There are many further issues to be decided upon, and we're
still waiting for at least an outline of the proposals.

NS> I'd still contend that if mlterm (and/or patch27) is not an
NS> acceptable solution, then we (and the authors) need to know why
NS> and how to go about making it "more" acceptable

A description of what exactly mlterm and patch27 do, and how they
compare with ISO 6429, would go a long way towards making the
discussion more interesting.

Regards,

Juliusz

Re: [I18n]So, will Bidi+Xterm happen ?

2002-02-26 Thread Juliusz Chroboczek


>> > 2.A bidi solution for the X Server is in the works-based on Open/X CTL

>> Is that true?

AC> Yes, check earlier posts (circa mid-last week).

It was my understanding that CTL (as available commercially for a few
years already) is a toolkit-level client-side solution.  If I'm wrong,
please point me at the relevant information.

Juliusz


Re: [I18n]So, will Bidi+Xterm happen ?

2002-02-26 Thread Juliusz Chroboczek


NS> Having read this widely popular thread :-) I'm still fuzzy about what
NS> the consensus was (if any)

I don't think there's any consensus yet.  If I understood correctly,
the discussion may be summarised as follows:

  - a number of people think that we should ``do BiDi in the terminal
emulator''.  They were not willing (or not able?) to define what it
means to ``do BiDi in the terminal emulator'', instead pointing at the
Unicode BiDi algorithm (which is not designed for terminal emulators,
and may or may not be a sound basis for such applications).  Nobody
produced an informed comment on ISO 6429 BiDi.

  - everyone appeared to agree that there's a need for BiDi at the
curses/slang level.  This means that the terminal emulator-level BiDi,
if any, must be switchable.  For some reason, nobody seems interested
in implementing BiDi at the curses level (not sexy enough?).

  - some people produced analogies with luit, which to me seems to
imply a lack of understanding of what luit does (luit has *no* notion
of cursor position).  Unless I'm missing something, BiDi really
needs access to internal terminal emulator data.

You will doubtless agree that the current understanding of the issues,
as summarised above, is not a satisfactory basis for building
consensus.

Juliusz

Re: [I18n]Re: user names

2002-02-15 Thread Juliusz Chroboczek


MK> You mix up English and Latin here!

AA> Latin ?
AA> I thought the suggestion here was for ASCII, which is insufficient for
AA> most European languages.

But sufficient for (Imperial) Latin ;-)  

(Marcvs, how do you spell your family name without using k?)

Jvlivs


Re: [I18n]Re: Terminal BiDi semantics [was: BiDi rant]

2002-02-14 Thread Juliusz Chroboczek


TC> Sometimes a hacked-up support today is better than a planned support
TC> tomorrow.

I don't agree.  Hacked-up support today will prevent proper support
from being implemented tomorrow.

TC> See again my comments regarding reading mail with pine.

I've read your comments.  I don't think that we're interested in
implementing what you're describing in XFree86 XTerm.

Juliusz


[I18n]Re: Terminal BiDi semantics [was: BiDi rant]

2002-02-14 Thread Juliusz Chroboczek


TK> Besides aesthetic problem whether bidi-support layer should be located
TK> in terminal emulator or other layers, 

It's not merely an aesthetic problem.

Unicode BIDI semantics are not adapted to terminal emulators.  I do
not fully understand ISO 6429 BIDI behaviour, and at any rate I'd be
nervous about implementing something that nobody seems to have an
opinion about.  I do not think that MLTerm's BIDI is documented
anywhere; if it is, please point me to the document.

I have no a priori objection to some form of BIDI support at the
terminal emulator level; but I do have an objection to some under-
specified, hacked-up support.

On the other hand, the behaviour of curses or slang or Emacs in the
presence of BIDI seems quite clear; I cannot help but wonder why it is
that BIDI developers insist on hacking up terminal emulators rather
than working on ncurses.

TK> there are a few problems for Robert Brady's patch to be merged
TK> into the xterm.

Rewriting FriBidi under a different license should be no problem.  But
the behaviour of FriBidi is profoundly inadequate for terminal
emulation.

Juliusz

Re: [I18n]BiDi rant

2002-02-14 Thread Juliusz Chroboczek


MK> 

You asked for it ;-)

MK> In my honest opinion, RTL writing has no place in the computer
MK> age.

I think you're putting the issue in the wrong terms.

It is possible to write German, French or Polish using ASCII only;
heck, to a certain extent, al-`arabiyya aiDan.  The fact is, however,
that ASCII-only rendering of these languages is not culturally
acceptable, which is why we're interested in implementing support for
beautiful e-ogonek and ugly u-umlaut.

(Offtopic: who is responsible for the positioning of the ogonek in
a-ogonek in the XFree86 fonts?  Could you get it moved to a more
reasonable place?)

Like it or not, RtL is essential for acceptable support of a number of
important languages.  The question is not whether to implement RtL,
but at what level it should be implemented.

I agree with you, though, that nobody has presented a satisfactory
semantics for BIDI support at the terminal emulator level.  On the
other hand, there is a clear semantics for BIDI support at the curses
level.  The only conclusion is that the people interested in BIDI
should start with implementing it in ncurses, slang and Emacs; then
may (or may not) be the time to think about the command line.

Regards,

Juliusz


Re: [I18n][patch] luit -encoding option

2002-02-14 Thread Juliusz Chroboczek


>> Unless I'm missing something, there's a bug: -encoding should set up
>> inputState in addition to outputState (keyboard input).

TK> I think it is OK because mergeIso2022() is called after parseOptions()
TK> where I added initIso2022().  I think mergeIso2022() is responsible
TK> for setting up inputState.

Yep.  (To be precise, only for defaulting inputState, but that's what
should be done in this case).  Sorry for forgetting about this.

Juliusz


Re: [I18n][patch] xterm to invoke luit

2002-02-07 Thread Juliusz Chroboczek


>> I think you should feel free to remove all the Amoeba support.
>> Thomas, have you heard from any Amoeba user in the last 10 years or so?

TK> I see.  Then, my patch will not touch AMOEBA code at all, because
TK> I don't think I can remove the code without knowing what it is or
TK> how it is popular.

Amoeba was an experimental distributed OS developed by Andrew
Tanenbaum (the author of Minix) in the late 80s.  I may be wrong, but
I believe that Minix and Amoeba support has been broken for a long
time, and has been removed from the X server in 4.0.  It has remained
in XTerm, because that's under the control of Thomas.

My personal opinion is that it would be better to remove Amoeba
support altogether rather than silently break it; Thomas, could you
please comment?

[Comments reordered.]

TK> I think I will not implement "-encoding" in, at least, near future.

After thinking it over, I've changed my mind on the subject.  There is
one perfectly legitimate application of ``-encoding'': it is running
luit locally when accessing a remote host and when the local host
doesn't support the remote locale.  E.g.

  luit -encoding 'ShiftJIS' ssh xenix_machine 'LC_ALL=ja_JP emacs -nw'

So I think I'll implement it, unless of course you come around to
doing it before I do.

TK> (I also think "utf8" resource for Xterm is not a very good idea, also,
TK> from exactly the same reason.  How do you think about this?)

Same thing.  Many systems do not support UTF-8 locales yet.

TK> Well, you are right.  I want users to know about locale [...]
TK> However, we are accustomed to think that we have no rights to
TK> force 8bit-language people to study about locale.

I think we should encourage all users to configure their systems
correctly, and insist that software that only works properly in some
locales should document this fact and print a warning if running in an
unsupported locale (as Xlib does).
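As a rough illustration of the "warn when the locale is unsupported" policy, the sketch below uses only the C library's setlocale().  The helper name select_locale_or_warn is hypothetical, and this is not how Xlib itself performs the check; it only shows the general shape of a warning plus fallback.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not taken from Xlib or XTerm): try to select the
   named locale for character handling.  If the C library rejects it,
   print a warning and fall back to the "C" locale.
   Returns 1 if the locale was accepted, 0 if we fell back. */
static int select_locale_or_warn(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_CTYPE, name) != NULL)
        return 1;
    fprintf(stderr,
            "warning: locale \"%s\" is not supported; using \"C\"\n", name);
    setlocale(LC_CTYPE, "C");
    return 0;
}
```

A program would typically call this once at startup with "" as the name, so that the user's environment selects the locale.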

However, one should remember that many users are stuck with OSes that
do not support multibyte locales.

Juliusz

Re: [I18n]Luit fix for EUC charsets

2002-02-06 Thread Juliusz Chroboczek


TK> Another improvement is to accept UTF-8 locales.  In such case,
TK> luit will work as "doing-nothing-filter".

Sorry, perhaps my last message wasn't clear.  Luit cannot ``do
nothing,'' it still needs to scan the UTF-8 for ISO 2022 escape
sequences.  But it has to do that in a non-blocking manner -- remember
it's single-threaded -- and hence needs to buffer any incomplete
character it may be seeing.

Once you provide the ability to buffer up to 4 characters (UTF-8), you
might as well implement generic variable-length encoding support,
which is why I'm planning to do both UTF-8 and GB variants (and throw
in ShiftJIS for good measure) at the same time.
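The buffering requirement described above can be sketched as follows.  This is only an illustration under stated assumptions, not luit's code: the names utf8_pending, utf8_seq_len and utf8_split are hypothetical, sequences are limited to 4 bytes, and continuation bytes are not validated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state: the bytes of a UTF-8 sequence that was cut off at
   the end of a read, to be prepended to the next chunk of input. */
struct utf8_pending {
    unsigned char bytes[4];
    size_t len;
};

/* Total length of a sequence, judged from its lead byte; 0 if the byte
   cannot start a sequence (continuation byte or invalid lead). */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Return how many leading bytes of buf[0..n) form complete sequences.
   An incomplete trailing sequence is saved in *p instead of being
   processed, so the caller never blocks waiting for the rest. */
static size_t utf8_split(const unsigned char *buf, size_t n,
                         struct utf8_pending *p)
{
    size_t i = 0;
    assert(buf != NULL && p != NULL);
    p->len = 0;
    while (i < n) {
        size_t need = utf8_seq_len(buf[i]);
        if (need == 0)
            need = 1;            /* pass stray bytes through unchanged */
        if (i + need > n) {      /* the sequence continues in the next read */
            p->len = n - i;
            memcpy(p->bytes, buf + i, p->len);
            break;
        }
        i += need;
    }
    return i;
}
```

The same shape would presumably extend to other variable-length encodings by swapping the length function, which is the point made in the message about doing them all at once.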

Juliusz

Re: [I18n][patch] xterm to invoke luit

2002-02-06 Thread Juliusz Chroboczek


TK> +   command_to_exec --; /* This should be possible */

>> Why should this be possible?

TK> Because "command_to_exec" is set by


TK>  case 'e':
TK> if (argc <= 1) Syntax (*argv);
TK> command_to_exec = ++argv;
TK> break;

I see.  Well, I'd rather we didn't rely on such an accident of nature.

TK> in main.c 1933.  Anyway, I will change this part to use malloc() and
TK> copy argv into the buffer, to cope with "luit -- command"

Good.

TK> It is a good idea.  I am now trying to implement it by modifying
TK> execvp() and exec_file() in spawn().  However, I have no idea what
TK> is exec_file() or what is AMOEBA system.

I think you should feel free to remove all the Amoeba support.
Thomas, have you heard from any Amoeba user in the last 10 years or so?

(Offtopic: an OpenBSD developer who likes to complain was recently
complaining to me about all the support code in XTerm for obsolete
architectures which makes XTerm difficult to maintain.  He was arguing
that we should rip out all the non-POSIX code, release this version,
and see whether anyone screams.  I couldn't help but agree.)

TK> It is, addition of a command option to directly specify encoding.

The nice thing with not having such a flag is that it forces people to
configure their locales correctly -- I'm fed up with bug reports from
people who set the locale to Latin-1, use Latin-2 fonts, then complain
that fontset-aware applications don't work.

Actually the functionality is already there, albeit buried within the
hundreds of ISO 2022-related flags.  Still, if you would find such an
option useful, feel free to add it -- I suggest the name ``-encoding'',
spelled in full.

TK> Another improvement is to accept UTF-8 locales.

Yep.  I'm planning to do that when I implement generic support for
variable-width charsets (GB and stuff).

As you've probably noticed, luit is currently pretty much wedded to
single and double-byte encodings (see the buffered_ku business in
iso2022.c).  I need to rework quite a bit of this code in order to
encapsulate the buffering within the charset object (which will
therefore become stateful).

Juliusz

Re: [I18n]A small bug in luit on invocation of child process

2002-02-05 Thread Juliusz Chroboczek


TK> I found that luit doesn't work with "luit  "
TK> while it works well with "luit ".  Here is a patch to
TK> fix this.

Oops -- you're right.  David, please credit this to Tomohiro Kubota.

Tomohiro, in the future could you please send patches that you feel
confident with directly to patch@ and just CC them either to i18n or
to me personally.

Thanks a lot,

Juliusz

diff -ruN luit-20020203-1/luit.c luit-20020203-2/luit.c
--- luit-20020203-1/luit.c  Sun Feb  3 10:51:02 2002
+++ luit-20020203-2/luit.c  Wed Feb  6 01:35:15 2002
@@ -386,7 +386,7 @@
 if(converter)
 return convert(0, 1);
 else
-return condom(argc - i, argv + 1);
+return condom(argc - i, argv + i);
 }
 
 static int


Re: [I18n]Luit fix for EUC charsets

2002-02-05 Thread Juliusz Chroboczek


>> Please find attached an implementation of an obscure ISO 2022 feature.

TK> This patch doesn't work well, though your previous patch (for which
TK> you added '}' later) works well.

Actually, the previous patch was the corrected version.  This one was
the version that I sent before I saw your patch[1].  (Reading your
patch is what made me realise my mistake and send the corrected
version; I hope I have adequately credited you in the submission.)

In summary: please ignore the version ``Luit fix for EUC charsets'',
and use ``Luit fix for EUC charsets [overrides 5170]'' instead.

I'm looking forward to the results of your testing; once you confirm
the ``overrides 5170'' version works, I'll put up a new standalone
version of luit.

Sincere regards,

Juliusz

[1] There has been a minor time warp due to the newly-established
anti-spam measures.  Keith has been very helpful in dealing with
them, and such relativistic effects shouldn't happen again.

[I18n]Luit fix for EUC charsets

2002-02-05 Thread Juliusz Chroboczek


Hello,

Please find attached an implementation of an obscure ISO 2022 feature.
According to Tomohiro Kubota, this is needed for proper support of the
EUC-JP character set, and probably other EUC thingies.

It adds Yet Another Incomprehensible Flag to luit, which is good, as
this multitude of options makes it wonderfully clear why ISO 2022 is
not really a good idea.

Please attribute this patch to Tomohiro Kubota and myself.

Juliusz

P.S. Tomohiro, could you confirm that this does what is needed?  Thanks.


? xc/programs/luit/Makefile
? xc/programs/luit/luit
? xc/programs/luit/luit.1x.html
? xc/programs/luit/luit._man
Index: xc/programs/luit/iso2022.c
===
RCS file: /cvs/xc/programs/luit/iso2022.c,v
retrieving revision 1.3
diff -c -r1.3 iso2022.c
*** xc/programs/luit/iso2022.c  2001/12/19 21:29:02 1.3
--- xc/programs/luit/iso2022.c  2002/02/04 18:01:10
***
*** 182,188 
  is->parserState = P_NORMAL;
  is->shiftState = S_NORMAL;
  
! is->inputFlags = IF_EIGHTBIT | IF_SS;
  is->outputFlags = OF_SS | OF_LS | OF_SELECT;
  
  is->buffered = NULL;
--- 182,188 
  is->parserState = P_NORMAL;
  is->shiftState = S_NORMAL;
  
! is->inputFlags = IF_EIGHTBIT | IF_SS | IF_SSGR;
  is->outputFlags = OF_SS | OF_LS | OF_SELECT;
  
  is->buffered = NULL;
***
*** 475,480 
--- 475,483 
  if(is->inputFlags & IF_SS) {
  i = G2(is)->reverse(codepoint, G2(is));
  if(i >= 0) {
+ if((is->inputFlags & IF_EIGHTBIT) &&
+(is->inputFlags & IF_SSGR))
+ i |= 0x80;
  switch(GR(is)->type) {
  case T_94: case T_96: case T_128:
  if(i >= 0x20)
***
*** 493,498 
--- 496,504 
  if(is->inputFlags & IF_SS) {
  i = G3(is)->reverse(codepoint, G3(is));
  if(i > 0) {
+ if((is->inputFlags & IF_EIGHTBIT) &&
+(is->inputFlags & IF_SSGR))
+ i |= 0x80;
  switch(GR(is)->type) {
  case T_94: case T_96: case T_128:
  if(i >= 0x20)
***
*** 579,585 
  }
  code = *s;
  } else {
! charset = GR(is);
  code = *s - 0x80;
  }
  
--- 585,596 
  }
  code = *s;
  } else {
! switch(is->shiftState) {
! case S_NORMAL: charset = GR(is); break;
! case S_SS2: charset = G2(is); break;
! case S_SS3: charset = G3(is); break;
! default: abort();
! }
  code = *s - 0x80;
  }
  
Index: xc/programs/luit/iso2022.h
===
RCS file: /cvs/xc/programs/luit/iso2022.h,v
retrieving revision 1.1
diff -c -r1.1 iso2022.h
*** xc/programs/luit/iso2022.h  2001/11/02 03:06:43 1.1
--- xc/programs/luit/iso2022.h  2002/02/04 18:01:10
***
*** 49,54 
--- 49,55 
  #define IF_SS 1
  #define IF_LS 2
  #define IF_EIGHTBIT 4
+ #define IF_SSGR 8
  
  #define OF_SS 1
  #define OF_LS 2
Index: xc/programs/luit/luit.c
===
RCS file: /cvs/xc/programs/luit/luit.c,v
retrieving revision 1.4
diff -c -r1.4 luit.c
*** xc/programs/luit/luit.c 2002/01/09 16:14:19 1.4
--- xc/programs/luit/luit.c 2002/02/04 18:01:10
***
*** 91,97 
  "  [ -kgl gn ] [-kgr gk] "
  "[ -kg0 set ] [ -kg1 set ] "
  "[ -kg2 set ] [ -kg3 set ]\n"
! "  [ -k7 ] [ +kss ] [ -kls ]\n"
  "  [ -c ] [ -ilog filename ] [ -olog filename ] [ -- ]\n"
  "  [ program [ args ] ]\n");
  
--- 91,97 
  "  [ -kgl gn ] [-kgr gk] "
  "[ -kg0 set ] [ -kg1 set ] "
  "[ -kg2 set ] [ -kg3 set ]\n"
! "  [ -k7 ] [ +kss ] [ +kssgr ] [ -kls ]\n"
  "  [ -c ] [ -ilog filename ] [ -olog filename ] [ -- ]\n"
  "  [ program [ args ] ]\n");
  
***
*** 134,139 
--- 134,142 
  i++;
  } else if(!strcmp(argv[i], "+kss")) {
  inputState->inputFlags &= ~IF_SS;
+ i++;
+ } else if(!strcmp(argv[i], "+kssgr")) {
+ inputState->inputFlags &= ~IF_SSGR;
  i++;
  } else if(!strcmp(argv[i], "-kls")) {
  inputState->inputFlags |= IF_LS;
Index: xc/programs/luit/luit.man
==

Re: [I18n]Luit fix for EUC charsets (yet again!)

2002-02-05 Thread Juliusz Chroboczek


Never send a patch before noon.

Juliusz

*** xc/programs/luit/iso2022.c.wrong  Tue Feb  5 12:37:38 2002
--- xc/programs/luit/iso2022.c  Tue Feb  5 12:32:47 2002
***
*** 537,543 
  abort();
  }
  continue;
- }
  }
  if(is->inputFlags & IF_LS)  {
  i = GR(is)->reverse(codepoint, GR(is));
--- 537,542 

Re: [I18n][patch] xterm to invoke luit

2002-02-05 Thread Juliusz Chroboczek


Nice work.  I fully agree with your change of terminology.

A couple of questions.

TK> +   if (term->misc.locale) {
TK> +   if (command_to_exec) {
TK> +   command_to_exec --; /* This should be possible */

Why should this be possible?

And you should be exec'ing ``luit -- command'' rather than ``luit
command''.

I'm also a wee bit worried by what happens if somebody removes luit
from the system.  XTerm should still run if exec(luit) fails: it
should just print a warning and continue with no locale support.
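The two requests above (exec ``luit -- command'', and keep going if luit is missing) could look roughly like this.  It is a sketch under assumptions, not the XTerm patch: wrap_command and exec_with_filter are hypothetical names, and the argument vector is copied into fresh storage rather than obtained by decrementing a pointer into argv.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Build a new vector { filter, "--", command[0], ..., NULL }.
   Returns NULL if allocation fails; the caller frees the result. */
static char **wrap_command(const char *filter, char **command)
{
    size_t n = 0, i;
    char **v;
    assert(filter != NULL && command != NULL);
    while (command[n] != NULL)
        n++;
    v = malloc((n + 3) * sizeof *v);
    if (v == NULL)
        return NULL;
    v[0] = (char *)filter;
    v[1] = "--";                 /* stop the filter's option parsing */
    for (i = 0; i < n; i++)
        v[i + 2] = command[i];
    v[n + 2] = NULL;
    return v;
}

/* Try to run the command under the filter.  execvp only returns on
   failure, so falling through means the filter could not be started:
   warn, then run the command directly.  Never returns. */
static void exec_with_filter(const char *filter, char **command)
{
    char **v = wrap_command(filter, command);
    if (v != NULL) {
        execvp(filter, v);
        perror(filter);
        fprintf(stderr, "warning: continuing without locale support\n");
        free(v);
    }
    execvp(command[0], command);
    perror(command[0]);
    _exit(127);
}
```

In a terminal emulator this would be called in the child process after the pty has been set up, in place of the plain exec of the command.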

Juliusz

[I18n]Luit fix for EUC charsets (overrides 5170)

2002-02-05 Thread Juliusz Chroboczek


Hi David,

Here's a (hopefully) fixed version of 5170.  It also includes a bunch
of new eight-bit encodings.

Please attribute this patch to Tomohiro Kubota and myself.

Juliusz

? xc/programs/luit/Makefile
? xc/programs/luit/luit
? xc/programs/luit/luit.1x.html
? xc/programs/luit/luit._man
Index: xc/programs/luit/charset.c
===
RCS file: /cvs/xc/programs/luit/charset.c,v
retrieving revision 1.1
diff -c -r1.1 charset.c
*** xc/programs/luit/charset.c  2001/11/02 03:06:43 1.1
--- xc/programs/luit/charset.c  2002/02/05 11:09:18
***
*** 107,113 
--- 107,119 
  {"ISO 8859-7", T_96, 'F', "iso8859-7", 0x80, 0, 0},
  {"ISO 8859-8", T_96, 'H', "iso8859-8", 0x80, 0, 0},
  {"ISO 8859-9", T_96, 'M', "iso8859-9", 0x80, 0, 0},
+ {"ISO 8859-10", T_96, 'V', "iso8859-10", 0x80, 0, 0},
+ {"ISO 8859-11", T_96, 'T', "iso8859-11", 0x80, 0, 0},
+ {"TIS 620", T_96, 'T', "iso8859-11", 0x80, 0, 0},
+ {"ISO 8859-13", T_96, 'Y', "iso8859-13", 0x80, 0, 0},
+ {"ISO 8859-14", T_96, '_', "iso8859-14", 0x80, 0, 0},
  {"ISO 8859-15", T_96, 'b', "iso8859-15", 0x80, 0, 0},
+ {"ISO 8859-16", T_96, 'f', "iso8859-16", 0x80, 0, 0},
  {"KOI8-E", T_96, '@', "koi8-e", 0x80, 0, 0},
  
  {"GB 2312", T_9494, 'A', "gb2312.1980-0", 0x, 0, 0},
***
*** 316,321 
--- 322,328 
  
  LocaleCharsetRec localeCharsets[] = {
  { "C", 0, 2, "ASCII", NULL, "ISO 8859-1", NULL},
+ { "POSIX", 0, 2, "ASCII", NULL, "ISO 8859-1", NULL},
  { "ISO8859-1", 0, 2, "ASCII", NULL, "ISO 8859-1", NULL},
  { "ISO8859-2", 0, 2, "ASCII", NULL, "ISO 8859-2", NULL},
  { "ISO8859-3", 0, 2, "ASCII", NULL, "ISO 8859-3", NULL},
***
*** 325,331 
--- 332,344 
  { "ISO8859-7", 0, 2, "ASCII", NULL, "ISO 8859-7", NULL},
  { "ISO8859-8", 0, 2, "ASCII", NULL, "ISO 8859-8", NULL},
  { "ISO8859-9", 0, 2, "ASCII", NULL, "ISO 8859-9", NULL},
+ { "ISO8859-10", 0, 2, "ASCII", NULL, "ISO 8859-10", NULL},
+ { "ISO8859-11", 0, 2, "ASCII", NULL, "ISO 8859-11", NULL},
+ { "TIS620", 0, 2, "ASCII", NULL, "ISO 8859-11", NULL},
+ { "ISO8859-13", 0, 2, "ASCII", NULL, "ISO 8859-13", NULL},
+ { "ISO8859-14", 0, 2, "ASCII", NULL, "ISO 8859-14", NULL},
  { "ISO8859-15", 0, 2, "ASCII", NULL, "ISO 8859-15", NULL},
+ { "ISO8859-16", 0, 2, "ASCII", NULL, "ISO 8859-16", NULL},
  { "KOI8-R", 0, 2, "ASCII", NULL, "KOI8-R", NULL},
  { "CP1251", 0, 2, "ASCII", NULL, "CP 1251", NULL},
  { "eucCN", 0, 1, "ASCII", "GB 2312", NULL, NULL},
Index: xc/programs/luit/iso2022.c
===
RCS file: /cvs/xc/programs/luit/iso2022.c,v
retrieving revision 1.3
diff -c -r1.3 iso2022.c
*** xc/programs/luit/iso2022.c  2001/12/19 21:29:02 1.3
--- xc/programs/luit/iso2022.c  2002/02/05 11:09:19
***
*** 182,188 
  is->parserState = P_NORMAL;
  is->shiftState = S_NORMAL;
  
! is->inputFlags = IF_EIGHTBIT | IF_SS;
  is->outputFlags = OF_SS | OF_LS | OF_SELECT;
  
  is->buffered = NULL;
--- 182,188 
  is->parserState = P_NORMAL;
  is->shiftState = S_NORMAL;
  
! is->inputFlags = IF_EIGHTBIT | IF_SS | IF_SSGR;
  is->outputFlags = OF_SS | OF_LS | OF_SELECT;
  
  is->buffered = NULL;
***
*** 477,488 
  if(i >= 0) {
  switch(GR(is)->type) {
  case T_94: case T_96: case T_128:
! if(i >= 0x20)
  WRITE_1_P(SS2, i);
  break;
! case T_9494: case T_9696: case T_94192:
! if(i >= 0x2020)
  WRITE_2_P(SS2, i);
  break;
  default:
  abort();
--- 477,504 
  if(i >= 0) {
  switch(GR(is)->type) {
  case T_94: case T_96: case T_128:
! if(i >= 0x20) {
! if((is->inputFlags & IF_EIGHTBIT) &&
!(is->inputFlags & IF_SSGR))
! i |= 0x80;
  WRITE_1_P(SS2, i);
+ }
+ break;
+ case T_9494: case T_9696:
+ if(i >= 0x2020) {
+ if((is->inputFlags & IF_EIGHTBIT) &&
+(is->inputFlags & IF_SSGR))
+ i |= 0x8080;
+ WRITE_2_P(SS2, i);
+ }
  break;
! case T_94192:
! if(i >= 0x2020) {
! if((is->inputFlags & IF_EIGHTBIT) &&
!(is->inputFlags & IF_S

Re: [I18n][patch] luit SS2/SS3 (including all previous patches)

2002-02-05 Thread Juliusz Chroboczek


TK> I see.  I think your patch is almost always better than mine

I'm fully basing my modifications on your input (for which I am
extremely grateful -- on my own, I didn't even realise that SS can be
followed by GR).

I think your solution is quite perfect; the only modification I'm
making is to make the EUC behaviour the default in all locales.  This
is due to the fact that I simply cannot see any application of SS
other than EUC.  (Please correct me if I'm wrong.)

My current patch follows in the next message.

Juliusz

Re: [I18n]Which font files?

2002-02-05 Thread Juliusz Chroboczek


MA> If I use a font like -*-fixed-*-*-*-*-12-*-*-*-*-*-iso10646-1, how can I
MA> trace this to particular font files like /usr/share/fonts/?.pfa?

You cannot.  Which font exactly will be chosen for an under-specified
pattern such as the above depends on a myriad of factors, including
the server version.

Actually, due to bugs in previous versions of the X server, even if
you choose a fully-specified but not overspecified specification, such
as

  -misc-fixed-medium-r-normal--12-*-75-75-p-*-iso10646-1

the chosen font may under some circumstances depend on the version of
the X server.
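To make the difference between the two patterns concrete, here is a small sketch that counts the dash-separated fields of an XLFD pattern and how many of them are the wildcard `*'.  The function name xlfd_inspect is hypothetical, and the code says nothing about which font the server will actually pick; it only shows how much each pattern leaves open.

```c
#include <assert.h>
#include <string.h>

#define XLFD_FIELDS 14   /* a well-formed XLFD name has 14 fields */

/* Count the dash-separated fields in an XLFD pattern, and store in
   *wildcards how many of them are exactly "*".
   Returns 0 if the pattern does not start with '-'. */
static int xlfd_inspect(const char *name, int *wildcards)
{
    int fields = 0;
    const char *p;
    assert(name != NULL && wildcards != NULL);
    *wildcards = 0;
    if (name[0] != '-')
        return 0;
    p = name;
    while (*p == '-') {
        const char *start = p + 1;
        const char *end = strchr(start, '-');
        size_t len = end ? (size_t)(end - start) : strlen(start);
        fields++;
        if (len == 1 && start[0] == '*')
            (*wildcards)++;
        if (end == NULL)
            break;
        p = end;
    }
    return fields;
}
```

Both patterns quoted in this message have the full XLFD_FIELDS fields; the first leaves ten of them as wildcards, the second only two.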

MA> Also, how are these iso10646 fonts usually encoded? Are these
MA> fixed ones bitmaps?

Yep.  Other Unicode fonts may be TrueTypes or CIDFonts.  (We do not
support generating core Unicode fonts from Type 1 or Speedo.)

Juliusz

Re: [I18n]current status of SCW proposal ?

2002-02-04 Thread Juliusz Chroboczek


TK> I started to use xterm+luit in ja_JP.eucJP locale for my daily use

Tomohiro, you've made my day.

TK> Though I don't know Juliusz will like it, I am planning to use a
TK> small subset of Markus' SCW proposal, i.e., only

TK>CSI 1 w

I have absolutely no objection.  What I do insist on:

  - single shifts only;
  - prefix (rather than Unicode-style suffix) sequences.

This is not only a luit matter.  Luit was designed in order to keep as
much complexity as possible outside of XTerm; by implementing
something as complex as Markus' proposal in XTerm, you'd be sort of 
destroying the whole purpose of luit.

I would be glad to hear an opinion from Thomas Dickey and Frank da
Cruz.

Juliusz

Re: [I18n][patch] luit SS2/SS3 (including all previous patches)

2002-02-04 Thread Juliusz Chroboczek


Oops, sorry, I've just sent in mine; I hadn't checked the list yet.

I don't think there's a need to make the use of SS/GR dependent on the
locale; in practice, single shifts will only happen in EUC locales.
So I've set it to true by default, and added a command-line flag.

There's a minor problem with your patch: 7 bit keyboard mode should
disable use of GR.

Other than that, it looks like there's a bug in my patch: it doesn't
properly set the high bit of the high byte in a double-byte encoding.
I'll send a revised patch ASAP.
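The two points above (7-bit keyboard mode must not use GR, and a double-byte code needs the high bit set in both bytes) can be summarised in a few lines.  This is an illustrative sketch only: ss_code_to_gr is a hypothetical function, whereas the real patch works with luit's own IF_EIGHTBIT and IF_SSGR flags.

```c
#include <assert.h>

/* Map a code that follows a single shift (SS2/SS3) into the GR range.
   code        - the 7-bit code: one byte, or two bytes packed as 0xHHLL
   double_byte - nonzero for a two-byte (e.g. 94x94) character set
   eight_bit   - zero in 7-bit keyboard mode, where GR must not be used */
static unsigned ss_code_to_gr(unsigned code, int double_byte, int eight_bit)
{
    assert(code <= 0xFFFFu);
    if (!eight_bit)
        return code;                        /* leave the code in GL */
    return code | (double_byte ? 0x8080u : 0x80u);
}
```

Setting only 0x80 on a double-byte code would leave the high byte in GL, which is the bug mentioned above.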

Juliusz

Re: [I18n]bdftruncate again ;)

2002-01-31 Thread Juliusz Chroboczek


ME> What I understood from you is that the actual set of glyphs are a
ME> short-term solution, which is not true in the case of Arabic.  I
ME> don't know how it can be 'open-ended'. Arabic requires a certain
ME> set of glyphs to be able to render words properly, that is a fixed
ME> number of glyphs.

I may be misunderstanding something, but I had the notion that the set
of possible ligatures for Arabic rendering is open-ended, just as it
is for Latin rendering.  There are most definitely ligatures that I
have seen which do not appear in Unicode.

I'm not claiming here that there is a demonstrated need for you to
support anything beyond basic Naskh right now; I'm only saying that
nobody knows what the future will bring.  (And think of the demo value
of having a text editor that renders Diwani...)

ME> It is understood that the core font system is both inadequate for
ME> i18n and quickly becoming obsolete.

Definitely so for the former, hopefully so for the latter.

ME> The use of the core font system as far as I am concerned is
ME> limited to applications such as xterm, and other miscellaneous
ME> applications which expect/require fixed-width fonts. Any other
ME> kind of application, say, a Word-processor or a web browser, would
ME> make use of TrueType fonts.

Core fonts can be TrueType fonts, or any technology you may want.
XFree86 4 and later supports presenting TrueType fonts as core fonts.

The keyword is *client-side* fonts, not TrueType (or Type 1, or whatever).

ME> We have asked for no features, as far as I recall.

I haven't claimed you did.  I'm simply saying that if you use the core
fonts system, you'll quickly run into its limitations; and then you'll
ask for features which we'll be unable to provide.

Regards,

Juliusz

Re: [I18n]bdftruncate again ;)

2002-01-31 Thread Juliusz Chroboczek


ME> It's a shame on who?

I used ``it is a shame'' in the sense of ``it is a pity''; while I'm
not a native speaker, I believe that this is standard usage in
English.  The intention was not to cast blame on anyone, but to imply
that if you go to the significant effort of implementing an Arabic
typesetting system, it would be a pity (colloquially termed ``a
shame'') not to go all the way and make a system that is extensible.

ME> I think you may want to elaborate on your statement about the Unicode
ME> Arabic glyphs being a short-term solution.

In my opinion, any solution that is intimately wedded to a single
glyph encoding is a short term solution.

The set of possible ligatures and contextual forms in Arabic is
open-ended.  While it may very well be that the set of Arabic glyphs
encoded in Unicode is sufficient for your immediate needs (and perhaps
it is, who am I to know?), there is a significant chance that this
will no longer be the case in the future.

Thus, I would like to encourage you to consider a solution that uses
dynamically generated glyph encodings rather than using Unicode as a
fixed glyph encoding.  The former is not (easily) doable with core
fonts, whence my suggestion to avoid them.

The other reason I would like to encourage you not to use core fonts
is that the core fonts system is obsolete.  I do believe that we have
already pushed it to its limits.  If you do use the core fonts system,
you will request more features, something that we will have no choice
but to refuse, leading to further pleasant conversations such as this
one.

ME> It seems that this "Unicode is not a good glyph encoding" gets
ME> echoed over and over.

The set of possible glyphs for any single language (even English) is
open-ended.  Thus, unless I've missed something, there is no such
thing as a ``good'' glyph encoding for text (as opposed to dingbats).

Regards,

Juliusz Chroboczek

Re: [I18n]xterm to invoke luit

2002-01-31 Thread Juliusz Chroboczek


JC> 128x256 character encodings (e.g. Big 5).

Sorry, that was 96x192.  Or somesuch.

Juliusz
___
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n

Re: [I18n]xterm to invoke luit

2002-01-31 Thread Juliusz Chroboczek


JS> Does luit support the encodings that are not compatible with iso-2022? 

Yes, but currently only 128 character encodings (e.g. CP1252), 128x128
character encodings and 128x256 character encodings (e.g. Big 5).

GB 18030 and UTF-8 are on my to-do list; remember, however, that I
work on that stuff in my free time, and I'm not giving any deadlines.

LH> # LANG=zh_TW.Big5 xterm -u8 -e luit
LH> Warning: couldn't find charset data for locale zh_TW.Big5; using ISO 
LH> 8859-1.

Either luit or XFree86 is incorrectly installed (luit can't get at
the locale.alias file).

LH> # xterm -u8 -e luit -g2 'Big 5'

Try running luit with the `-v' flag and see what that gives.

Juliusz



Re: [I18n]xterm to invoke luit

2002-01-31 Thread Juliusz Chroboczek


Tomohiro KUBOTA <[EMAIL PROTECTED]>:

Hi Tomohiro,

I agree with the general idea.  I would like to change a few details.

I would suggest one new resource:

  XTerm*utf8Filter: /usr/X11R6/bin/luit

If utf8Filter is set, and XTerm is in UTF-8 mode, XTerm will spawn

  utf8Filter prog args

instead of 

  prog args

In addition, I suggest that the utf8 resource should be changed to a
tri-valued flag: it can be false, true, or auto, the latter meaning
that XTerm should run in UTF-8 mode if the locale is multi-byte, and
in eight-bit mode if it is not.

The defaults for these resources should be

  XTerm*utf8Filter: /bin/luit
  XTerm*utf8: auto  (or perhaps true?)
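
A minimal sketch of the proposal above, not XTerm code: it resolves the
tri-valued utf8 flag and decides what to spawn.  The function names are
hypothetical, and the sketch assumes that the filter is simply placed in
front of the program's command line.

```python
def resolve_utf8_mode(utf8_setting, locale_is_multibyte):
    """Return True if the terminal should run in UTF-8 mode.

    utf8_setting is one of "false", "true" or "auto"; "auto" follows
    the locale: UTF-8 mode for multi-byte locales, eight-bit otherwise.
    """
    if utf8_setting == "auto":
        return locale_is_multibyte
    if utf8_setting in ("true", "false"):
        return utf8_setting == "true"
    raise ValueError("utf8 must be false, true or auto")


def command_to_spawn(prog_argv, utf8_mode, utf8_filter=None):
    """Prepend the filter only when one is set and UTF-8 mode is active."""
    if utf8_mode and utf8_filter:
        return [utf8_filter] + list(prog_argv)
    return list(prog_argv)
```

With utf8 set to auto in an eight-bit locale, the program is spawned
directly and the filter is never started.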


TK>  - emulate doublewidth de-facto standard for east Asian encodings
TK>(using http://www.cl.cam.ac.uk/~mgk25/ucs/scw-proposal.html ?)

With all due respect, I refuse to implement Markus' proposal.

TK>  - luit to support more encodings
TK>(http://mail.nl.linux.org/linux-utf8/2001-11/msg00093.html)

Yep.  Boring, but must be done.

TK>  - whether xterm or luit will support BiDi or not.  Usage of fribidi
TK>may have license problem (like Robert Brady's patch).

My personal opinion is that BIDI belongs above the terminal emulator.

And the number 1 item: include your fix for SS in EUC-JP (actually a
slightly improved version).

Thanks for your comments,

Juliusz

[I18n]Cedilla: a manic text printer

2002-01-30 Thread Juliusz Chroboczek


A first beta of Cedilla, the manic text printer, is available from

  http://www.pps.jussieu.fr/~jch/software/cedilla/

Regards,

Juliusz
--
   CEDILLA
  ¸

  A best-effort text printer

  Juliusz Chroboczek


Cedilla is a simple text printer that uses Unicode internally.

Using Unicode means that the set of characters that can appear in the input
is very large, and the user may very well have no font available that
contains glyphs for the characters that he wants to print.  Cedilla attempts
to at least partially solve this problem using a number of techniques:

1. Cedilla can use an arbitrary number of downloadable fonts; for any given
   print job, only the necessary fonts will be downloaded.

2. Cedilla will use its own built-in font.

3. Cedilla will retouch existing glyphs in order to e.g. remove dots or add
   bars.

4. Cedilla will attempt to build composite glyphs (e.g. for accented
   characters) on the fly.

5. Cedilla will use fallbacks for characters that are not supported by the
   available fonts.


SOURCES OF GLYPHS

Cedilla will attempt to use glyphs from all of the following sources,
in decreasing order of preference:

1. Glyphs present in an available font.

This is the common case, and covers all simple characters but also,
depending on the used font, a number of composites.

Example: Être ou ne pas être ?

2. Built-in glyphs.

Cedilla has a built-in font which can be (partially) downloaded if
necessary.

Example: €.

3. Retouched glyphs

When no suitable glyph is available, Cedilla will sometimes attempt to
retouch an available glyph.  The main application is to produce dotless
glyphs for further composition; another use is to provide barred glyphs.

Example: (Slovenian sentence containing đ.)

Example: Tiuj ĉi arĥaismoj neniam estos elĵetitaj.

3. Glyphs composed based on data accompanying the font.

The Adobe Font Metrics (AFM) format can include positioning information for
composites, and Cedilla will be glad to use such data.  However, as few fonts
come with extra information in this form, this technique is seldom useful.

4. Glyphs composed out of components present in a single font.

If a glyph is missing from a font, but all the components needed to
construct it are present, Cedilla will build a composite glyph out of those.
While the positioning of the diacritical marks is approximate, the algorithm
used appears to be satisfactory with many standard fonts.

Example: Czy pamiętasz jak ze mną tańczyłaś Walca ?

When building such composites, Cedilla will, whenever necessary, replace
dotted letters with their dotless variants or retouch the base glyphs to
remove the dot (see ‘Retouched glyphs’ above).

Of course, Cedilla is not limited to using glyphs that are present in
Unicode in precomposed form.  For example, the second letter of the third
word in the sentence below is a Latin small e followed with ‘Combining
Vertical Line Below.’

Example: Mo lè je̩ dígí, kò ní pa mí lára.
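
The composition step can be sketched as follows.  This is only an
illustration built on Unicode canonical decomposition, not Cedilla's
actual algorithm: the function names are hypothetical, a font is
modelled as a plain set of characters, and the dotless variant is
always substituted where Cedilla might instead retouch the base glyph.

```python
import unicodedata

# Dotless variants to use under a mark that is placed above the letter.
DOTLESS = {"i": "\u0131", "j": "\u0237"}


def components(char):
    """Split a character into its base letter and combining marks (NFD)."""
    return list(unicodedata.normalize("NFD", char))


def composition_plan(char, font_glyphs):
    """Return the glyphs to overlay, or None if the font lacks one of them."""
    parts = components(char)
    base, marks = parts[0], parts[1:]
    # Canonical combining class 230 means "above": such a mark would
    # collide with the dot of i or j, so use the dotless base instead.
    if any(unicodedata.combining(mark) == 230 for mark in marks):
        base = DOTLESS.get(base, base)
    plan = [base] + marks
    return plan if all(glyph in font_glyphs for glyph in plan) else None
```

Note that a letter such as ł has no canonical decomposition, so this
approach alone cannot build it; that is where retouching comes in.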


4. Glyphs composed out of components taken from different fonts.

As this sometimes leads to esthetically unpleasant results, Cedilla tries to
avoid this technique.  It can be useful in some cases, however, and will in
particular enable printing almost legible Greek when no decent Greek font is
available:

Example: 
  Οὐχὶ ταὐτὰ παρίσταταί μοι γιγνώσκειν, ὦ ἄνδρες ᾿Αθηναῖοι,
  ὅταν τ᾿ εἰς τὰ πράγματα ἀποβλέψω καὶ ὅταν πρὸς τοὺς
  λόγους οὓς ἀκούω·
  
Note how the diacritical marks were taken from the primary font, while all
the other Greek glyphs came from the Symbol font.

(Actually, Polytonic Greek is difficult, and Cedilla needs some more work
before it can present it satisfactorily.  Given suitable fonts — not
Symbol! —, Monotonic Greek appears okay, at least to my unexercised eyes.)


5. Fallbacks.

In cases when no glyph is directly available for a character, Cedilla will
fall back on ever less satisfactory alternatives.  For example, a closing
double quotation mark will first be replaced with a sequence of two closing
single quotation marks which will, in turn, be replaced by a sequence of two
apostrophes.

The fallbacks mechanism is a rather complex beast (it currently consists of
four distinct phases), and interacts in wonderful and mysterious ways with
the glyph generation mechanisms, notably composite glyph generation.  Its
exact mechanics are still in a state of flux, and will be described when
they have stabilised.
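
The quotation-mark chain described above can be sketched as a small
recursive lookup.  This is a deliberately simplified illustration, not
the four-phase mechanism itself: the table holds only the two fallbacks
named in the text, and the names are hypothetical.

```python
# Each entry maps a character to a less satisfactory replacement string.
FALLBACKS = {
    "\u201d": "\u2019\u2019",  # closing double quote -> two closing single quotes
    "\u2019": "'",             # closing single quote -> apostrophe
}


def render(char, available):
    """Return a string made only of available glyphs, or None if stuck."""
    if char in available:
        return char
    replacement = FALLBACKS.get(char)
    if replacement is None:
        return None
    # Each character of the replacement may itself need a fallback.
    parts = [render(part, available) for part in replacement]
    if any(part is None for part in parts):
        return None
    return "".join(parts)
```

With a font that has only the apostrophe, a closing double quotation
mark comes out as two apostrophes.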


INSTALLING CEDILLA

Please see the file ‘INSTALL’ for information on installing Cedilla.


INVOKING CEDILLA

Please see the ‘cedilla(1)’ manual page for information on invoking
cedilla.


CONFIGURING CEDILLA

Cedilla is configured by inserting fontset definitions in a file named
‘cedilla-config.lisp’ which, by default, lives in ‘/etc/’.  A fontset
definition is an invocation of

Re: [I18n]bdftruncate again ;)

2002-01-25 Thread Juliusz Chroboczek


RP> But isn't it a suitable glyph encoding for Arabic?

Only to a certain extent.

With its presentation forms for Arabic, Unicode provides a fixed glyph
set for Arabic.  While this glyph set is suitable for some simple
styles of Arabic typography, there is enough variation in
typographical traditions to make the use of Unicode Arabic glyphs a
short-term solution.

(While it is true that the same could be said of Latin typography,
ligatures are generally not used for on-screen display of Latin text
(or at least most of us find their use uncomfortable).)

Furthermore, I think it is a shame to go to the effort of implementing
an Arabic display engine and not make it support related scripts
(Syriac being the obvious example) which may not have encoded
presentation forms.

Folks, get over it: Unicode is not a good glyph encoding.  The fact
that we're using it as such is a symptom of what is wrong with Unix
internationalisation.  (``If it works for English and Japanese, it
must be international.'')

Juliusz

Re: [I18n]Re: XLFD subsetting and bdftruncate

2002-01-25 Thread Juliusz Chroboczek


MK> Shall I fix the point sizes in the BDF files then?

Unless David or Keith disagree, I think it may be a good idea.  You
will of course need to provide aliases at all sizes (a scalable alias
won't do).

You may want to recall that XLFD uses 72.27 points per inch, as
opposed to PostScript's 72.
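
To see what the difference amounts to, here is a small sketch that
converts an XLFD point size (given in decipoints) into pixels, taking
the XLFD point as 1/72.27 inch and the PostScript point as 1/72 inch.
The function name is hypothetical.

```python
XLFD_POINTS_PER_INCH = 72.27
POSTSCRIPT_POINTS_PER_INCH = 72.0


def pixel_size(decipoints, dpi, points_per_inch=XLFD_POINTS_PER_INCH):
    """Convert a size in tenths of a point to (fractional) pixels."""
    return decipoints / 10.0 / points_per_inch * dpi
```

At 75 dpi, a 120-decipoint font comes out at about 12.45 pixels with
XLFD points and 12.5 pixels with PostScript points, which is enough to
change the rounded pixel size in unlucky cases.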

MK> XLFDs can make a grown man cry.

Indeed.

To make it clear: I was not claiming that the behaviour I described in
my previous mail is reasonable.  I am only claiming that everything is
working as designed.

It would of course be possible to fix the fallback mechanism to do
something more reasonable in this case, but I just don't have the
courage (never mind the time).

Juliusz


Re: [I18n]Re: U+3000 limit

2002-01-15 Thread Juliusz Chroboczek


>> I still think that there is no point in working with core fonts any
>> more.  Core fonts are obsolescent in my opinion, and the OP really
>> should think about using Xft.

ME> Okay. Now my question is, is it safe to assume that every
ME> application will NOT use the core fonts and only use Xft?

You cannot just take any old application and assume it will support
Arabic glyph shaping.  Clients (or, more precisely, widget sets) will
need to be adapted to provide for glyph shaping.  Doing this will be
vastly easier and more general if you use Xft rather than core fonts.

The point of whether old clients will be able to see Arabic glyphs in
core fonts is therefore moot, as only new clients will be able to
display Arabic in a satisfactory manner.

Keith Packard:

KP> Yes, the existing XFree86 supplied X output method mechanism
KP> probably isn't sufficient to support Arabic, especially when
KP> drawing words along a diagonal baseline.

I don't think you need to support diagonal baselines for the type of
typography that's common in the Arab world; it's only needed for
proper typesetting of Persian.

KP> OpenType fonts contain quite a bit of layout information; if
KP> that's sufficient for your needs...

According to my Jordanian office neighbour, Microsoft Windows does a
perfectly satisfactory job of supporting Arabic; and as far as I know
they use OpenType.

Juliusz


Re: [I18n]luit and JIS X 0212 in EUC-JP

2002-01-14 Thread Juliusz Chroboczek


Tomohiro KUBOTA <[EMAIL PROTECTED]>:

TK> I think I found the reason.  In luit, SS2 and SS3 are implemented
TK> to invoke G2 and G3 to GL.  However, I have read a textbook that
TK> SS2 and SS3 are re-defined by ISO 2022 to invoke G2 and G3 to
TK> _both of GL and GR_.  I imagine this modification is for EUC
TK> encoding scheme to be compliant to ISO 2022.

Thanks, that makes sense.  I'll check your patch when I have time
(Thursday at earliest) and submit it.
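
For reference, here is a minimal sketch of how an EUC-JP byte stream
splits into the four ISO 2022 sets.  It is not luit's code: the names
are hypothetical and truncated input is not handled.  The point to
notice is that the bytes following SS2 and SS3 have the high bit set,
i.e. the single-shifted character arrives in GR.

```python
SS2, SS3 = 0x8E, 0x8F


def tokenize_euc_jp(data):
    """Split EUC-JP bytes into (set, code) pairs with high bits stripped.

    G0 is ASCII, G1 is JIS X 0208, G2 is JIS X 0201 katakana (after
    SS2) and G3 is JIS X 0212 (after SS3).
    """
    out = []
    i = 0
    while i < len(data):
        byte = data[i]
        if byte < 0x80:
            out.append(("G0", byte))
            i += 1
        elif byte == SS2:
            # One GR byte follows the single shift.
            out.append(("G2", data[i + 1] & 0x7F))
            i += 2
        elif byte == SS3:
            # Two GR bytes follow the single shift.
            out.append(("G3", ((data[i + 1] & 0x7F) << 8) | (data[i + 2] & 0x7F)))
            i += 3
        else:
            # Two GR bytes without a shift select G1.
            out.append(("G1", ((byte & 0x7F) << 8) | (data[i + 1] & 0x7F)))
            i += 2
    return out
```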

TK> Here is a patch to enable _displaying_ JIS X 0212 in EUC-JP, by
TK> following the re-definition I wrote above, by modifying copyOut()
TK> in iso2022.c .

It looks okay to me, but I really don't have time to check

TK> Note that luit with my patch yet has a problem around G3 in EUC-JP.
TK> Though I think this can be done within copyIn() in iso2022.c ,
TK> a new flag whether G2/G3 is mapped into GL or GR (with SS2/SS3)
TK> may be needed because usage of GR is an exception in EUC encoding
TK> scheme.

Right.

Thanks again,

Juliusz

Re: [I18n]U+3000 limit

2002-01-11 Thread Juliusz Chroboczek


MK> You could for example simply cut the Unicode character space into 16
MK> intervals 0x0XXX, 0x1XXX, etc. and open each interval as soon as you
MK> encounter a glyph from it.

If you are a client developer, however, it would be much more
productive, both in the short and long term, to implement support for
client-side fonts, whether using Xft or not.  At this stage, spending
time on trying to work around the deficiencies of the core font
protocol is useless and probably even counterproductive.

(Am I repeating myself?)

In addition, you should recall that while opening one subfont is way
cheaper than opening the full font, opening all 16 of them is more
expensive than opening the whole font.  In particular, the FreeType
backend performs some magic that will make this counterproductive.

(If somebody convinces me subrange specifications are useful and used,
I'll disable the magic when a non-trivial subrange specification
appears.  For now, I won't bother.)

MK> Many widget sets (e.g., Tk) open fonts only when needed, and that
MK> extends naturally to font subranges.

This approach breaks down, unfortunately, as soon as you try to go
beyond Level 1 Unicode, when you really need to know the full glyph
coverage of the available fonts.

In short: yes, there are various hacks that will allow you to push the
core fonts protocol further.  They are mere hacks, however, and in a
situation in which a proper solution is already available,
implementing them is counterproductive.

Juliusz

Re: [I18n]U+3000 limit

2002-01-09 Thread Juliusz Chroboczek


[Back to the list in case somebody knows for sure.]

>> >> xfd -fn '-misc-fixed-medium-r-semicondensed--13-*-75-75-c-*-iso8859-1[65_90]'
>> 
MK> Very nice, I hadn't seen that! Works fine for my XFree86 4.0.3
MK> installation here. Since which release exactly did this work?
>> 
>> For as long as I can remember.  I've got a 3.3.6 machine here, and it
>> works.

MK> Interesting. Various people including myself had all tried it in the
MK> early days of ISO10646-1 BDF fonts and it then certainly hadn't worked,
MK> leading to bdftruncate. So who fixed it in the mean time ...?

I've always assumed it was first implemented in the X11R6 SI.  I'm
positive it wasn't implemented in X11R5.

I am 100% positive that the Type 1 and Speedo backends did implement
subranges in the X11R6.3 SI, which is what I used when first working
on xfsft.  I didn't pay much attention to bitmap fonts then.

A common mistake is to use a dash rather than an underline in the
subset specification; are you sure you didn't do that when you
formerly tried?
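
To avoid the dash-versus-underline slip, the subset specification can
be generated rather than typed.  A minimal sketch follows; the helper
name is hypothetical, and it relies on the XLFD syntax of
space-separated entries, each a single code or first_last.

```python
def with_subrange(xlfd, ranges):
    """Append a subset specification such as [65_90] to an XLFD name.

    ranges is a list of (first, last) pairs, both ends inclusive.
    """
    parts = []
    for first, last in ranges:
        if first == last:
            parts.append(str(first))
        else:
            # An underline, not a dash, separates the two ends.
            parts.append("%d_%d" % (first, last))
    return "%s[%s]" % (xlfd, " ".join(parts))
```

Applied to the font name from the xfd example with the single range
(65, 90), this gives back the name ending in iso8859-1[65_90].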

Juliusz

Re: [I18n]U+3000 limit

2002-01-09 Thread Juliusz Chroboczek


MK> So I think, we can now drop bdftruncate from the ucs-fonts installation
MK> procedure, as people merely have to add [0_0x31ff] to an XLFD to achieve
MK> the same effect.

MK> Any opinions?

JC> I don't think it is a good idea.

Sorry, I was in a rush yesterday evening and didn't have time to be
more explicit.

Bdftruncate should stay.  All the current UTF-8 clients that I know do
not use the subrange XLFD hack, and so dropping bdftruncate would
strongly penalise them.  Writing new clients that split an iso10646-1
font into pieces using the XLFD subrange extension is a waste of time,
as new clients should just use Xft and forget about the core fonts
Xlib support (or only use it as a fallback).

An alternative would be to add a server option to castrate iso10646-1
fonts on the fly in the BDF and PCF backends.

Juliusz



Re: [I18n]U+3000 limit

2002-01-08 Thread Juliusz Chroboczek


>> Try
>> 
>> xfd -fn '-misc-fixed-medium-r-semicondensed--13-*-75-75-c-*-iso8859-1[65_90]'

MK> Very nice, I hadn't seen that! Works fine for my XFree86 4.0.3
MK> installation here. Since which release exactly did this work?

For as long as I can remember.  I've got a 3.3.6 machine here, and it
works.

MK> So I think, we can now drop bdftruncate from the ucs-fonts installation
MK> procedure, as people merely have to add [0_0x31ff] to an XLFD to achieve
MK> the same effect.

MK> Any opinions?

I don't think it is a good idea.  Most UTF-8 clients don't make use of
this feature.  New UTF-8 clients should use Xft instead.

I would therefore prefer bdftruncate to stay.

Juliusz


Re: [I18n]U+3000 limit

2002-01-08 Thread Juliusz Chroboczek


Aidan Kehoe <[EMAIL PROTECTED]>:

>> The correct solution (TM) is to use Xft rather than the core protocol
>> for Arabic.

AK> I think his original message implied that he was considering
AK> implementing font subsetting, as his first link
AK> (http://www.xfree86.org/pipermail/i18n/2001-September/002448.html)
AK> suggests. 

Font subsetting is fully implemented in the BDF, PCF, Type 1, Speedo
and freetype backends.  I haven't checked the SNF or X-TT backends.

Try

  xfd -fn '-misc-fixed-medium-r-semicondensed--13-*-75-75-c-*-iso8859-1[65_90]'

I still think that there is no point in working with core fonts any
more.  Core fonts are obsolescent in my opinion, and the OP really
should think about using Xft.  (Keith, how's Xft's support for bitmap
fonts?  Is Francesco Zappa's work on BDF support in FreeType complete
enough?)

Juliusz


Re: [I18n][Q] Latin-0 compose w/o locale

2002-01-08 Thread Juliusz Chroboczek


dv> Let me rephrase my question: how in X can I work in an english
dv> environment, but still have bindings for latin 15[1], as if I were
dv> working with a french locale ?

Set LC_CTYPE to fr_FR@euro but LC_MESSAGES to either C or en_US.  Set
the other LC_* variables to your liking.

(I am running with just such a setup.)
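
The reason this works is the usual precedence among the locale
variables: LC_ALL overrides everything, then the specific LC_*
variable, then LANG.  A small sketch of that lookup follows; the
function name is hypothetical, and the final fallback to "C" is an
assumption, as the default is left to the implementation.

```python
def effective_locale(env, category):
    """Return the locale name that applies to one category, e.g. LC_CTYPE."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:  # unset and empty are treated alike
            return value
    return "C"
```

With LC_CTYPE=fr_FR@euro and LC_MESSAGES=C, character handling follows
the French locale while messages stay untranslated; setting LC_ALL
would defeat the whole arrangement.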

Note, by the way, the difference between fr_FR@euro and
fr_FR.ISO8859-15: the former uses Euro rather than FF as a currency,
while the latter uses the ISO 8859-15 charset but still uses FF as the
currency.

dv> Would some kind of en_US@euro locale do it ? Such a beast doesn't
dv> exist on my Debian system.

There's no need for such a locale as you can mix&match the LC_*
variables.  However, if you're running a system with libc 2.2
(currently, testing or unstable), you can create an en_US.ISO8859-15
locale by putting the right incantations into /etc/locale.gen and
running locale-gen.  You will also need to modify XFree86's locale.dir
and locale.alias files.

(Of course, an en_US@euro locale wouldn't make sense, as the US have
not (yet) adopted the One True Currency.)

Juliusz

P.S. Tomohiro: sorry if I'm being pedantic, but you likely meant
``incorrect'' or perhaps ``unsupported'' rather than ``illegal''.


Re: [I18n]U+3000 limit

2002-01-08 Thread Juliusz Chroboczek


"Moe Elzubeir" <[EMAIL PROTECTED]>:

ME> Seeing that I know _nothing_ about the internals of XFree86, I would
ME> really appreciate all the pointers you can offer. Ranging from
ME> pieces of code that affect this to documents and papers that can help.

1. look into include/X11/Xlib.h, and search for the definition of
XFontStruct.  You will see that it contains a field called
``per_char'' which points at a linear vector of XCharStruct.  Thus,
the per_char vector uses 12 bytes for every codepoint between
first_char and last_char.

2. look into the protocol definition.  You will see that the QueryFont
request elicits a reply that contains a linear vector of XCharStruct.
Thus, the 12 * (last_char - first_char) overhead is not only memory
usage, but also protocol overhead.

Note, however, that (2) and, to a certain extent, (1) are solved by
Bruno's BIGFONT extension.  Thus, the above is only partly true.
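
A back-of-the-envelope sketch of the overhead described in (1) and (2).
An XCharStruct holds six 16-bit fields (five signed metrics and an
unsigned attributes word), hence the 12 bytes.  The function name is
hypothetical, and the count below includes both end points.

```python
import struct

# lbearing, rbearing, width, ascent, descent (short) + attributes (unsigned short)
XCHARSTRUCT_SIZE = struct.calcsize("=5hH")


def per_char_bytes(first_char, last_char):
    """Size of a linear per_char vector covering first_char..last_char."""
    return XCHARSTRUCT_SIZE * (last_char - first_char + 1)
```

A font covering 0 to 0xFFFF costs 786432 bytes of metrics per open
font, whereas one truncated at 0x31FF costs 153600 bytes.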

ME> This 'truncation' of the 10x20 for 'optimization' is seriously
ME> hampering our efforts to bring Arabic support on platforms where
ME> XFree86 runs.

The correct solution (TM) is to use Xft rather than the core protocol
for Arabic.  You may want to get in touch with Owen Taylor and the
Pango project.

Another solution (hack) would be to provide an alternate font with the
Arabic glyphs.  Call it e.g. 10x20ar.

Juliusz


Re: [I18n]Localized date strings in xclock -digital

2001-11-22 Thread Juliusz Chroboczek


CS> Who should I send the patch to?

You could for example check on www.xfree86.org what the address for
externally submitted patches is.  You'd find out it's fixes at xfree86.org.

Note, however, that this is not a good time for submitting a patch.  I
suggest you wait until 4.2.0 is out.  (Don't ask.)

Regards,

Juliusz


Re: [I18n]Call for testers: luit in XFree86 CVS

2001-11-14 Thread Juliusz Chroboczek


>> The question is whether it can be useful for a Unix shell session.

JS>   I'm not sure I understand your point...

Luit is designed to provide support for legacy encodings to UTF-8
terminal emulators.  It is not a general-purpose encoding converter.
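
To illustrate the direction of the conversion only (application output
in a legacy encoding, UTF-8 towards the terminal), here is a toy filter
using Python's codecs.  It is in no way luit's implementation, which
does its own ISO 2022 processing, and the function name is
hypothetical.  The incremental decoder matters because a multi-byte
character may be split across two reads from the pty.

```python
import codecs


def legacy_to_utf8(chunks, encoding):
    """Yield UTF-8 bytes for a stream of byte chunks in a legacy encoding."""
    decoder = codecs.getincrementaldecoder(encoding)(errors="replace")
    for chunk in chunks:
        # An incomplete trailing sequence is buffered until the next chunk.
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")
```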

Viewing mails or web pages is a different matter, because every mailer
or web browser worth its weight in peanuts should be able to do
conversion of e-mails internally.  The issue is whether you know of a
scenario in which terminal-based applications may want to use the
encoding for speaking with the terminal.


>> (On the other hand, I certainly have no ideological objection to
>> including Microsoft encodings.  Luit is by design the place to put all
>> the trash that we want to keep outside of XTerm.)

JS>   What I'm afraid of is that adding more support for legacy encodings
JS> will drag out the transition to UTF-8 forever.

My personal plan is to prevent people from implementing legacy charset
support in terminal emulators.  I think that the best way to do that
is to provide as much support as possible in luit, so that people
don't feel the need to implement it elsewhere.  This will also allow
writers of terminal emulators to concentrate on providing good UTF-8
support rather than spending their time implementing legacy encodings.

Thus, I feel that adding support to luit will facilitate rather than
hamper transition to Unicode.

JS>   An off-topic issue: did you get my new patch to ksc5601.1987-0.enc?
JS> I'm just wondering because my mail system had a temporary outage
JS> about the time I sent you that patch.

Don't send patches to me (unless you feel they may break my code).
Send them to the submission address.

Juliusz

Re: [I18n]Call for testers: luit in XFree86 CVS

2001-11-13 Thread Juliusz Chroboczek


Jungshik Shin:

JS> Much more useful is, although I hate to 'endorse' MS's
JS> proprietary extension, Windows-949/CP949/Unified Hangul
JS> Code. There are numerous web pages and emails in this encoding
JS> floating around the net disguised as EUC-KR (or a complete
JS> non-sense-name of ks_c_5601-1987).

The question is whether it can be useful for a Unix shell session.

IBM CP 437, for instance, is useful to run DOSEMU, or
because some older PC Unices (SCO) use it on the console.  On the
other hand, I don't see how a shell session could end up being encoded
in CP 949.

(On the other hand, I certainly have no ideological objection to
including Microsoft encodings.  Luit is by design the place to put all
the trash that we want to keep outside of XTerm.)

JS>   Anyway, if you want to support JOHAB, you have all you need
JS> since ksc5601.1992-3.enc is already in XF86 CVS.

Yep.  I remember ;-)

Juliusz

Re: [I18n]Call for testers: luit in XFree86 CVS

2001-11-13 Thread Juliusz Chroboczek


>>> When I proceed compilation, I met compilation errors of:
>>> 
>>> charset.o: In function `FontencCharsetRecode':
>>> charset.o(.text+0x146): undefined reference to `FontEncRecode'
>>> charset.o: In function `getFontencCharset':
>>> charset.o(.text+0x2f0): undefined reference to `FontEncMapFind'
>>> charset.o(.text+0x302): undefined reference to `FontMapReverse'
>>> 
>>> I think I need some libraries in XFree86 CVS tree...

IM>  Luit uses libfontenc in xc/lib/fontenc

Exactly.  But are the Imakefiles set up alright if you do a full build
of the tree?

Juliusz


Re: [I18n]Call for testers: luit in XFree86 CVS

2001-11-12 Thread Juliusz Chroboczek


Tomohiro KUBOTA:

>> Luit is now in the XFree86 CVS.  It is under xc/programs/luit.

TK> Why not support...

I don't want to extend luit for 4.2.0; bug fixes only in this version.
Much of what you're proposing will go into future releases of luit.
More precisely,

TK> ISO-8859-{10,13,14,16} and TIS-620?

These most definitely will go in (the parser needs to be fixed to
recognise a non-trivial intermediary byte in charset selection
sequences.)

TK> How do you think about Big5HKSCS?

I'll have a look.

TK> And I hope Shift_JIS and CP932 (Microsoft replacement of Shift_JIS),

Shift_JIS (yuck) will go in (we already agreed on that, if you
recall).  CP 932 only if somebody provides the needed mapping tables.

TK> I think JIS X 0213:2000 will be useful in future but it is not
TK> popular so far.

Only if users request it.  The people who designed this encoding
should have worked with their ISO 10646 national representative
instead.

TK> How about GBK and GB18030?  GBK is slightly more dirty than Big5
TK> because C1 region is used for the first byte for 2 byte characters.
TK> However, I think GBK is not so popular and GB18030 will be more
TK> important in future (though I imagine GB18030 support is difficult).

GBK is easy.  GB 18030 is slightly more tricky, but definitely doable.
Same comment as above.

TK> How about Johab?

Don't know.  We'll see.

TK> And, there is a difficult problem which we already started discussion:
TK> how to deal with character width problem, when luit works with XTerm?

TK> I think Markus' proposal
TK> http://www.cl.cam.ac.uk/~mgk25/ucs/scw-proposal.html can be a
TK> candidate for solution

As I've already mentioned, I strongly dislike the complexity of
Markus' proposal.  I want to use single shifts only.

TK> but I am afraid this solution can be too heavy, because luit will
TK> have to issue CSI 1 w for each doublewidth character and XTerm
TK> will have to parse it.

I don't think that will be much of a problem.  If it is, we'll see
what can be done.

Juliusz

Re: [I18n]autogenerated UTF-8 Compose file

2001-11-09 Thread Juliusz Chroboczek


David Monniaux:

DM> * Ancient greek is only partially treated due to the absence of dead 
DM> keys for some of the ancient greek diacritics. I think of adding a 
DM> few (virtual) keysyms for these, as well as Compose actions following 
DM> the greek-ibycus4 input method of GNU Emacs.

You should use Markus' embedding of the Unicode space in the keysym
space for that.  (Markus, you've got the reference handy?)
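
For readers without the reference at hand, a sketch of the embedding as
I understand it: a code point U+n becomes keysym 0x01000000 + n, while
the printable Latin-1 characters keep their traditional keysyms, which
equal their code points.  The function names are hypothetical.

```python
UCS_KEYSYM_BASE = 0x01000000


def is_latin1_printable(value):
    return 0x20 <= value <= 0x7E or 0xA0 <= value <= 0xFF


def ucs_to_keysym(codepoint):
    """Map a Unicode code point to a keysym under the embedding."""
    if is_latin1_printable(codepoint):
        return codepoint
    return UCS_KEYSYM_BASE | codepoint


def keysym_to_ucs(keysym):
    """Inverse mapping; None for keysyms outside the two covered ranges."""
    if keysym & 0xFF000000 == UCS_KEYSYM_BASE:
        return keysym & 0x00FFFFFF
    if is_latin1_printable(keysym):
        return keysym
    return None
```

A polytonic Greek letter such as U+1F00 would thus be bound as keysym
0x01001F00, without any new named keysym being needed.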

DM> * Some ligatures are not yet handled.

Please don't.

Regards,

Juliusz

P.S. Tu quoque, fili, Perlum usavisti?

[I18n]Call for testers: luit in XFree86 CVS

2001-11-07 Thread Juliusz Chroboczek


Hello,

Luit is now in the XFree86 CVS.  It is under xc/programs/luit.

I haven't yet had time to test the integration (I'll try to do it this
week-end).  If any of you have time, I'd be very grateful if you could
give me a hand.

(No changes to better integrate XTerm with luit have been done yet;
such changes are planned for 4.3.0, not earlier.)

Juliusz

Re: [I18n]Problem on JIS X 0208 -> Unicode

2001-10-26 Thread Juliusz Chroboczek


Bruno,

BH> This changes only one direction. Please update the other direction as
BH> well. Otherwise the roundtrip mapping is not the identity. I append
BH> the complete patch for jisx0208.h.

BH> And the JISX 0212  U+007E -> U+FF5E  should be treated similarly.

Sorry for that.  Could you please send the patch to patch@ yourself?
This way, you'll automatically become the contact person.

You should explicitly mention whether you're updating or overriding
patch 5002.

Thanks,

Juliusz


Re: [I18n]Problem on JIS X 0208 -> Unicode

2001-10-26 Thread Juliusz Chroboczek


JC> The attached patch changes a controversial mapping of the JIS X
JC> 0208 reverse solidus to Unicode.

Pro memoria, patch 5002.  (Just missed the millennium.)

Juliusz

Re: [I18n]Problem on JIS X 0208 -> Unicode

2001-10-26 Thread Juliusz Chroboczek


Hi David,

The attached patch changes a controversial mapping of the JIS X 0208
reverse solidus to Unicode.  The original tables (derived from tables
provided by the Unicode consortium) mapped this character to the good
old ASCII backslash (U+005C); according to Tomohiro Kubota, this is
not what Japanese users expect, preferring it to be mapped to the
compatibility character U+FF3C FULLWIDTH REVERSE SOLIDUS instead.

Jungshik Shin confirms that this is the desired behaviour.
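
When a mapping like this is changed, the reverse table has to follow,
otherwise converting to Unicode and back no longer yields the original
code.  A minimal sketch of such a round-trip check, using three entries
taken from the patched tables; the function names are hypothetical.

```python
# JIS X 0208 code -> Unicode, as in the patched tables.
JISX0208_TO_UCS = {0x213F: 0xFF0F, 0x2140: 0xFF3C, 0x2141: 0x301C}


def invert(mapping):
    """Build the Unicode -> JIS X 0208 table from the forward table."""
    return {ucs: code for code, ucs in mapping.items()}


def roundtrip_failures(to_ucs, from_ucs):
    """List the codes that do not survive a trip to Unicode and back."""
    return [code for code, ucs in sorted(to_ucs.items())
            if from_ucs.get(ucs) != code]
```

Checked against a reverse table that still maps U+005C to 0x2140, the
only failure reported is 0x2140 itself.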

Regards,

Juliusz



Index: xc/lib/X11/lcUniConv/jisx0208.h
===================================================================
RCS file: /cvs/xc/lib/X11/lcUniConv/jisx0208.h,v
retrieving revision 1.4
diff -c -r1.4 jisx0208.h
*** xc/lib/X11/lcUniConv/jisx0208.h	2001/08/09 19:14:08	1.4
--- xc/lib/X11/lcUniConv/jisx0208.h	2001/10/26 16:09:36
***************
*** 9,15 ****
    0x3000, 0x3001, 0x3002, 0xff0c, 0xff0e, 0x30fb, 0xff1a, 0xff1b,
    0xff1f, 0xff01, 0x309b, 0x309c, 0x00b4, 0xff40, 0x00a8, 0xff3e,
    0xffe3, 0xff3f, 0x30fd, 0x30fe, 0x309d, 0x309e, 0x3003, 0x4edd,
!   0x3005, 0x3006, 0x3007, 0x30fc, 0x2015, 0x2010, 0xff0f, 0x005c,
    0x301c, 0x2016, 0xff5c, 0x2026, 0x2025, 0x2018, 0x2019, 0x201c,
    0x201d, 0xff08, 0xff09, 0x3014, 0x3015, 0xff3b, 0xff3d, 0xff5b,
    0xff5d, 0x3008, 0x3009, 0x300a, 0x300b, 0x300c, 0x300d, 0x300e,
--- 9,15 ----
    0x3000, 0x3001, 0x3002, 0xff0c, 0xff0e, 0x30fb, 0xff1a, 0xff1b,
    0xff1f, 0xff01, 0x309b, 0x309c, 0x00b4, 0xff40, 0x00a8, 0xff3e,
    0xffe3, 0xff3f, 0x30fd, 0x30fe, 0x309d, 0x309e, 0x3003, 0x4edd,
!   0x3005, 0x3006, 0x3007, 0x30fc, 0x2015, 0x2010, 0xff0f, 0xff3c,
    0x301c, 0x2016, 0xff5c, 0x2026, 0x2025, 0x2018, 0x2019, 0x201c,
    0x201d, 0xff08, 0xff09, 0x3014, 0x3015, 0xff3b, 0xff3d, 0xff5b,
    0xff5d, 0x3008, 0x3009, 0x300a, 0x300b, 0x300c, 0x300d, 0x300e,
Index: xc/fonts/encodings/large/jisx0208.1983-0.enc
===================================================================
RCS file: /cvs/xc/fonts/encodings/large/jisx0208.1983-0.enc,v
retrieving revision 1.1
diff -c -r1.1 jisx0208.1983-0.enc
*** xc/fonts/encodings/large/jisx0208.1983-0.enc	1999/05/30 02:27:56	1.1
--- xc/fonts/encodings/large/jisx0208.1983-0.enc	2001/10/26 16:09:39
***************
*** 1,5 ****
--- 1,6 ----
  STARTENCODING jisx0208.1983-0
  SIZE 0x75 0x80
+ FIRSTINDEX 0x20 0x20
  STARTMAPPING unicode
  UNDEFINE 0x00 0x747F
  0x2121  0x2123  0x3000
***************
*** 25,31 ****
  0x213D  0x2014
  0x213E  0x2010
  0x213F  0xFF0F
! 0x2140  0x005C
  0x2141  0x301C
  0x2142  0x2016
  0x2143  0xFF5C
--- 26,32 ----
  0x213D  0x2014
  0x213E  0x2010
  0x213F  0xFF0F
! 0x2140  0xFF3C
  0x2141  0x301C
  0x2142  0x2016
  0x2143  0xFF5C
Index: xc/fonts/encodings/large/jisx0208.1990-0.enc
===================================================================
RCS file: /cvs/xc/fonts/encodings/large/jisx0208.1990-0.enc,v
retrieving revision 1.1
diff -c -r1.1 jisx0208.1990-0.enc
*** xc/fonts/encodings/large/jisx0208.1990-0.enc	1999/05/30 02:27:56	1.1
--- xc/fonts/encodings/large/jisx0208.1990-0.enc	2001/10/26 16:09:42
***************
*** 2,7 ****
--- 2,8 ----
  # This file is partly derived from data provided by the Unicode Consortium
  # Original data Copyright (c) 1991-1994 Unicode, Inc.
  SIZE 0x75 0x80
+ FIRSTINDEX 0x20 0x20
  STARTMAPPING unicode
  # override default identity mapping
  UNDEFINE 0x 0x747F
***************
*** 36,42 ****
  0x213D	0x2015	# HORIZONTAL BAR
  0x213E	0x2010	# HYPHEN
  0x213F	0xFF0F	# FULLWIDTH SOLIDUS
! 0x2140	0x005C	# REVERSE SOLIDUS
  0x2141	0x301C	# WAVE DASH
  0x2142	0x2016	# DOUBLE VERTICAL LINE
  0x2143	0xFF5C	# FULLWIDTH VERTICAL LINE
--- 37,43 ----
  0x213D	0x2015	# HORIZONTAL BAR
  0x213E	0x2010	# HYPHEN
  0x213F	0xFF0F	# FULLWIDTH SOLIDUS
! 0x2140	0xFF3C	# FULLWIDTH REVERSE SOLIDUS
  0x2141	0x301C	# WAVE DASH
  0x2142	0x2016	# DOUBLE VERTICAL LINE
  0x2143	0xFF5C	# FULLWIDTH VERTICAL LINE

Re: [I18n]Problem on JIS X 0208 -> Unicode

2001-10-26 Thread Juliusz Chroboczek


Same thing in the font encoding tables used by the freetype backend
and by Luit.  The two types of tables should be kept identical.

Can other people confirm that this is indeed the right thing to do?  I
can imagine how it may break things, e.g. by making it impossible to
select a JIS X 208 reverse solidus and paste it into an 8859-1 client.

Juliusz

Re: [I18n]US-ASCII part of CJK TTFs served by freetype and xtt backends

2001-10-25 Thread Juliusz Chroboczek


JS>  Are the following fonts
JS> (-watanabe-mincho-iso8859-1 and -watanabe-mincho-.jisx0208.1983-0)
JS> two separate fonts or a single font?

They are two separate core X11 fonts that come from a single font file.

(Actually, to be pedantic, they describe an infinity of X11 font
instances following a uniform pattern, but let's forget about it.)

JS> Does the XLFD definition of '-c-' font prevent them from having two
JS> different widths (*all* the glyphs in the second one are twice as wide
JS> as *all* the glyphs in the first one) ? 

No.

JS> If not, I think my point still stands.

Your point stands.

The problem is with the optimisation that the freetype backend does
for `-c-' fonts.  In order to quickly determine the (uniform) glyph
metrics, the freetype backend reads the global metrics provided in the
font file at font open time.  As the TTF format doesn't provide
metrics for subsets, there is nothing that can be done to provide
different metrics for different font instances.

I have thought of various ways to make the guessing smarter, and, if
you look carefully at the code of the freetype backend, there is a
certain amount of hairy infrastructure to allow other approaches.
However, all the ideas that I followed turned out to be worse than the
current scheme.

If you have any good ideas, I'll be glad to hear them, although, as
I've said, I don't believe that working on core fonts is a good idea
anymore.

Regards,

Juliusz


Re: [I18n]US-ASCII part of CJK TTFs served by freetype and xtt backends

2001-10-24 Thread Juliusz Chroboczek


JS> However, if the last two fields of XLFD is something like iso8859-1,
JS> wouldn't it better to follow some kind of 'wcswidth convention' per
JS> 'character set+encoding' basis instead of 'ind. chars basis'?

[...]

JS> watanabe-mincho.ttf -watanabe-mincho-medium-r-normal--0-0-0-0-c-0-iso8859-1
JS> watanabe-mincho.ttf -watanabe-mincho-medium-r-normal--0-0-0-0-c-0-jisx0208.1983-0
JS> watanabe-mincho.ttf -watanabe-mincho-medium-r-normal--0-0-0-0-c-0-iso10646-1

[...]

JS> Wouldn't it be natural to expect Latin characters to have half the width
JS> of Japanese characters?

The XLFD provides a strict definition for `-c-' fonts.  This
definition is implemented by the freetype backend, which will do
technically correct but slightly unintuitive things if you lie to it
about the nature of your fonts.

If you desire a different behaviour, you should either try to get your
applications to work with `-p-' fonts, or push for a ``biwidth'' `-b-'
spacing type to be included in a future version of the XLFD.

My personal opinion is that core fonts should be deprecated, and
applications move to RENDER client-side fonts instead; thus, I think
that working on revising the XLFD is a waste of time.  (In particular, I
do not intend to put any new code in the freetype backend myself,
although I'd be willing to include *correct* patches.)

(The word ``correct'' explains why the sbits patch has not yet been
included.)

Juliusz


Re: [I18n]Re: Sincronization with X.Org

2001-10-19 Thread Juliusz Chroboczek


>> Unicode characters useless.  X.Org have reacted positively to
>> Markus' work, although they haven't officially rubberstamped it.

PR> I assume you are talking about the proposal described in
PR> <http://www.cl.cam.ac.uk/~mgk25/ucs/xorg.html>.  Is there any
PR> indication if (when) this proposal will be included in X11?  Is it
PR> already part of XFree86?

The proposal is already supported by XFree86.  As to X.Org, ask them.

PR> Let me know if you need more information.

No, that's quite enough.  As far as I can tell, this means that under
Markus' proposal, there is no need to register new keysyms for the
missing characters -- you can simply use the ones in Unicode space.
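
As a sketch of what "using the ones in Unicode space" means, assuming
the rule from Markus' proposal that a UCS character U+XXXX outside
Latin-1 gets the keysym 0x01000000 + XXXX.  The function names are
illustrative and are not Xlib API; the sketch also ignores the legacy
keysyms that already exist for many characters.

```python
UNICODE_KEYSYM_BASE = 0x01000000  # offset used for UCS keysyms


def ucs_to_keysym(cp: int) -> int:
    """Map a Unicode code point to a keysym (legacy tables ignored)."""
    # Printable Latin-1 characters keep their historical 1:1 keysyms.
    if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
        return cp
    if 0x100 <= cp <= 0x10FFFF:
        return UNICODE_KEYSYM_BASE + cp
    raise ValueError(f"no keysym for code point {cp:#x}")


def keysym_to_ucs(keysym: int) -> int:
    """Inverse of ucs_to_keysym for the ranges handled above."""
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return keysym
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return keysym - UNICODE_KEYSYM_BASE
    raise ValueError(f"keysym {keysym:#x} is not in Unicode space")


# U+2200 FOR ALL, a mathematical symbol with no Latin-1 keysym.
for_all = ucs_to_keysym(0x2200)
print(hex(for_all))
```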

Juliusz


Re: [I18n]US-ASCII part of CJK TTFs served by freetype and xtt backends

2001-10-19 Thread Juliusz Chroboczek


Hi Jungshik, nice to meet you again,

JS>   When used under MS-Windows, glyphs for US-ASCII part (or ISO-8859-1
JS> part) are about twice as narrow as glyphs for Japanese
JS> characters. However, the width of glyphs for US-ASCII part is the same
JS> as that of glyphs for Japanese characters when it is presented to X11
JS> clients by freetype backend of XFree86 4.x. That is, the width of US-ASCII
JS> glyphs  is abnormally wide making text rendered with them very ugly.

This is the expected behaviour.

If you put a `-c-' entry in your fonts.dir, the freetype backend will
forcibly cast all your glyphs to the same width.

If you want to use a variable width font, you should be using `-p-'
instead.  Note, however, that this will lead to very long opening
times.
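
For reference, the spacing type is the eleventh field of an XLFD name.
The following Python sketch (the function and field-tuple names are
hypothetical, and the font name is the one quoted elsewhere in this
thread) shows how to pull that field out of a fonts.dir-style name:

```python
# The fourteen XLFD fields, in order.
XLFD_FIELDS = (
    "foundry", "family", "weight", "slant", "setwidth", "add_style",
    "pixel_size", "point_size", "resolution_x", "resolution_y",
    "spacing", "average_width", "registry", "encoding",
)


def parse_xlfd(name: str) -> dict:
    """Split an XLFD name into its fourteen named fields."""
    if not name.startswith("-"):
        raise ValueError("XLFD names start with '-'")
    parts = name[1:].split("-")
    if len(parts) != len(XLFD_FIELDS):
        raise ValueError(f"expected 14 fields, got {len(parts)}")
    return dict(zip(XLFD_FIELDS, parts))


charcell = parse_xlfd(
    "-watanabe-mincho-medium-r-normal--0-0-0-0-c-0-iso8859-1")
proportional = parse_xlfd(
    "-watanabe-mincho-medium-r-normal--0-0-0-0-p-0-iso8859-1")
print(charcell["spacing"], proportional["spacing"])
```

A spacing of `c' declares a character-cell font, `m' a monospaced font,
and `p' a proportional one.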

Regards,

Juliusz



Re: [I18n]Re: asking help for gb18030 support

2001-10-16 Thread Juliusz Chroboczek


YS> Thank you so much for your help, I have made some progress
YS> today. With the attached zh_CN.gb18030/XLC_LOCALE and
YS> GB18030-0.enc files, I can see all the characters except those
YS> from the four-byte part.

X11 core fonts only support 16-bit glyph indices.

YS> I guess, maybe still something wrong with locale definition?

You will need to use multiple fonts in your fontset.
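
To see why the four-byte part is out of reach for a single core font,
here is a small Python sketch.  It uses Python's own gb18030 codec and
is unrelated to the XFree86 code; the helper name is hypothetical.  A
core font glyph index is at most two bytes wide, while some GB 18030
characters need four.

```python
MAX_INDEX_BYTES = 2  # core X11 glyph indices are 16 bits wide


def gb18030_length(ch: str) -> int:
    """Number of bytes GB 18030 uses to encode a single character."""
    return len(ch.encode("gb18030"))


samples = {
    "A": gb18030_length("A"),                    # ASCII
    "U+4E2D": gb18030_length("\u4e2d"),          # common Han character
    "U+20000": gb18030_length("\U00020000"),     # supplementary plane
}
for label, size in samples.items():
    verdict = "fits" if size <= MAX_INDEX_BYTES else "does not fit"
    print(f"{label}: {size} byte(s), {verdict} in a 16-bit index")
```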

Juliusz

[I18n]Re: asking help for gb18030 support

2001-10-12 Thread Juliusz Chroboczek


[CC'd to [EMAIL PROTECTED]]

YS> This is Yu Shao, from Red Hat Asia-Pacific,

Pleased to meet you.  (Shakes hands.)

YS> at the moment, we are developing the Simplified Chinese version of
YS> Red Hat Linux, and we are having some problems with the Chinese
YS> new locale gb18030 support.

Currently, we at XFree86 have not decided whether to include native
support for GB 18030 or to promote Unicode-based locales instead and
rely on iconv and other code conversion tools to convert GB 18030
texts into UTF-8.  My personal opinion is that the latter approach is
preferable.
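
The second approach amounts to transcoding at the boundary.  As a
minimal sketch, here is the kind of conversion iconv would perform,
written with Python's standard codecs (illustrative only; the function
name is hypothetical):

```python
def gb18030_to_utf8(data: bytes) -> bytes:
    """Transcode GB 18030 bytes to UTF-8."""
    return data.decode("gb18030").encode("utf-8")


# Two common Han characters plus one that needs four bytes in GB 18030.
original = "\u4e2d\u6587\U00020000"
legacy_bytes = original.encode("gb18030")
utf8_bytes = gb18030_to_utf8(legacy_bytes)
print(len(legacy_bytes), len(utf8_bytes))
```

The conversion is lossless in both directions, since GB 18030 covers
the whole of Unicode.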

YS> I defined the zh_CN.gb18030 in the locale database, and also put
YS> encoding file gb18030-0.enc in, and I use FreeType module to
YS> render the chinese True Type fonts, but still no good luck. I am
YS> wondering if the encoding file gb18030-0.enc is enough for
YS> FreeType module to support this new locale, do i need to do
YS> something else?

It should be enough, as long as you include the right entries in your
fonts.scale files before generating the fonts.dir.  In released
versions of XFree86, you will also need to make sure that mkfontdir
generates an entry for your encoding in encodings.dir (this will no
longer be necessary in 4.2.0).

Please check whether xfd allows you to see the GB 18030-encoded fonts.

YS> and do I need to add conversion modules into /xc/lib/X11/lcUniConv?

These are only needed if you want UTF8_STRING to work properly in GB
18030 locales.  I wouldn't worry about it for now.

YS> I attached the patch file which is what I have done so far, can
YS> you help me to have a look?

First of all, you need a SIZE declaration in your encoding file, and
optionally a FIRSTINDEX entry.
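
As a rough sketch of the directives in question, modelled on the header
lines of the jisx0208 encoding file.  The parser below is a
hypothetical helper, not the fontenc implementation, and the sample
values are copied from the jisx0208 file; they are not the correct
values for GB 18030.

```python
def parse_enc_header(text: str) -> dict:
    """Collect SIZE and FIRSTINDEX directives from encoding-file text."""
    header = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if fields[0] in ("SIZE", "FIRSTINDEX"):
            # Values are written as hexadecimal row and column numbers.
            header[fields[0]] = tuple(int(v, 16) for v in fields[1:])
    return header


SAMPLE = """\
# illustrative header only, values taken from jisx0208.1990-0.enc
SIZE 0x75 0x80
FIRSTINDEX 0x20 0x20
STARTMAPPING unicode
"""

header = parse_enc_header(SAMPLE)
print(header)
```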

There's probably also something missing in your XLC_LOCALE file;
unfortunately, I don't have time to work it out right now, so I'll
append it to this message and let the list work it out.

Regards,

Juliusz
diff -uNr xc.orig/nls/XLC_LOCALE/zh_CN.gb18030 xc/nls/XLC_LOCALE/zh_CN.gb18030
--- xc.orig/nls/XLC_LOCALE/zh_CN.gb18030	Thu Jan  1 10:00:00 1970
+++ xc/nls/XLC_LOCALE/zh_CN.gb18030	Fri Oct 12 12:08:42 2001
@@ -0,0 +1,74 @@
+#
+#  XLC_FONTSET category
+#
+XLC_FONTSET
+
+on_demand_loading	True
+
+object_name		generic
+
+#  We leave the legacy encodings in for the moment, because we don't
+#  have that many ISO10646 fonts yet.
+#  fs0 class (7 bit ASCII)
+fs0	{
+	charset	{
+		name	ISO8859-1:GL
+	}
+	font	{
+		primary		ISO8859-1:GL
+		vertical_rotate	all
+	}
+}
+
+#   fs1 class (Chinese Han Character)
+fs1	{
+	charset	{
+		name	GB2312.1980-0:GL
+	}
+	font	{
+		primary	GB2312.1980-0:GL
+	}
+}
+
+#  fs2 class
+fs2	{
+	charset	{
+		name	ISO10646-1
+	}
+	font	{
+		primary		GB18030-0
+		substitute	GBK2K-0
+	}
+}
+END XLC_FONTSET
+
+#
+#  XLC_XLOCALE category
+#
+XLC_XLOCALE
+
+encoding_name		GB18030
+mb_cur_max		4
+state_depend_encoding	False
+
+#  cs0 class
+cs0	{
+	side		GL:Default
+	length		1
+	ct_encoding	ISO8859-1:GL
+}
+
+#  cs1 class
+cs1	{
+	side		GR
+	length		2
+	ct_encoding	GB2312.1980-0:GL; GB2312.1980-0:GR
+}
+
+#  cs2 class
+cs2	{
+	side		none
+	ct_encoding	ISO10646-1
+}
+
+END XLC_XLOCALE



Re: [I18n]ISO 10646 Fonts and XFontSet question

2001-09-28 Thread Juliusz Chroboczek


JS> There are many other writing systems and scripts for which there's
JS> NO 'legacy' encoding defined in and outside X11.

We're aware of that (Tifinagh is the example I like to give).  The
point Keith was making (if I understood him correctly) was that core X
fonts should only be used for glyphs covered by legacy encodings.  For
new glyphs, client-side fonts (Xft or otherwise) should be used
instead.

Of course, this is only reasonable if you believe (as I, for one,
certainly do) that RENDER is being deployed fast enough to make
exclusive use of client-side fonts a reasonable approach in the
foreseeable future.  Someone at Sun may want to comment.

JS> It's 'ksc5601.1992-3' with *almost* the same (but not as complete
JS> as) coverage of *modern* Korean Hangul as iso10646-1. The encoding
JS> file for this

Patch 4912, if anyone is interested.  Note, however, that Xlib does
not currently know how to effectively use fonts with this encoding.

JS> along with a patch to the encoding file for ksc5601.1987-0

Patch 4910, which fixes a bug I am guilty of.  Note that the problem
is *not* user-visible, due to the seventh generation autonomously
intelligent self-correcting design of the fontenc layer.  The fix just
saves a few microseconds the very first time a ksc5601 scalable font
is used (as well as a couple hundred KB of storage, but what's a
little disk space between friends).

Both patches are due to Jungshik, to whom I am very grateful.

Juliusz

Re: [I18n]big5.etenc-0.enc --- one more cmap 3 4

2001-09-28 Thread Juliusz Chroboczek


JC> It's a bug.  A patch has been submitted in late August (patch
JC> nr. 4911) and should be committed soon.

Sorry, checking the archives it looks like I didn't actually attach
the patch.  A new version is patch 4957.

Juliusz
