Re: Experiments with classical Greek keyboard input

2006-01-21 Thread Alexandros Diamantidis
[sorry for taking a few days to reply...]

* Jan Willem Stumpel [2006-01-18 14:41]:
> This does not work in my case. Also interchanging the entries (US first,
> then GR) did not work. I mean you can get the accents, but not the
> breathing signs. Strangely enough, even calling
> 
> LANG=el_GR.UTF-8 xterm
> 
> and then doing things in the new xterm, did not work! I don't understand
> why. I have the el_GR.UTF-8 locale installed.

I really wonder why... I thought if you had a ~/.XCompose file, your
locale didn't matter (except if you specifically used it in that file,
by doing 'include "%L"'). Maybe it's not used at all? You could try
strace on some X program and see if it is opened.

 $ strace -o foo xterm
 $ grep XCompose foo
open("/home/adia/.XCompose", O_RDONLY)  = 5
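
The same check can be scripted if the log gets long. This is only a rough sketch: the function name compose_opens is made up for illustration, and it assumes the log was written with "strace -o foo" as above. Newer strace versions may log openat() instead of open(), so the pattern accepts both.

```python
import re

# Matches strace lines such as:
#   open("/home/adia/.XCompose", O_RDONLY)  = 5
#   openat(AT_FDCWD, "/some/path/Compose", O_RDONLY) = -1 ENOENT (...)
# An optional leading PID column (as written by strace -f) is skipped.
OPEN_CALL = re.compile(
    r'^(?:\d+\s+)?open(?:at)?\((?:[^"]*,\s*)?"([^"]*)"[^)]*\)\s*=\s*(-?\d+)'
)


def compose_opens(lines):
    """Return (path, succeeded) for each open of a file with 'Compose' in its name."""
    hits = []
    for line in lines:
        m = OPEN_CALL.match(line)
        if m and "Compose" in m.group(1):
            # A negative return value means the open failed (e.g. ENOENT).
            hits.append((m.group(1), int(m.group(2)) >= 0))
    return hits


def compose_opens_in_file(log_path):
    """Convenience wrapper: scan a saved strace log, e.g. the file 'foo' above."""
    with open(log_path, encoding="utf-8", errors="replace") as log:
        return compose_opens(log)
```

If compose_opens_in_file("foo") comes back empty, the program never tried to read any Compose file at all; an entry with False means it tried but the open failed.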

> to use the xkb facilities). But in the true UTF-8 spirit, we should be 
> able to read/print/enter *anything* from *any* locale, as long as it is 
> a UTF-8 one.

I agree with this sentiment... I have had trouble in this department
as well :(

> So perhaps /etc/X11/xkb/symbols/pc/gr should really be changed to 
> include the UTF-8 'breathing' signs.

Yes, but which keysyms should be used? U0313 and U0314, which correspond
to U+0313 COMBINING COMMA ABOVE and U+0314 COMBINING REVERSED COMMA
ABOVE? The current hack which uses dead_horn and dead_ogonek? Or some
new keysyms?
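
To make the first option concrete, here is a small sketch using Python's standard unicodedata module. It only shows what the two code points are and that they combine canonically with a Greek letter; it says nothing about how X itself treats the keysyms. The helper names describe and compose are my own.

```python
import unicodedata

PSILI = "\u0313"  # COMBINING COMMA ABOVE (smooth breathing)
DASIA = "\u0314"  # COMBINING REVERSED COMMA ABOVE (rough breathing)
OMEGA = "\u03c9"  # GREEK SMALL LETTER OMEGA


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def compose(base, *marks):
    """Canonically compose a base letter with combining marks (NFC)."""
    return unicodedata.normalize("NFC", base + "".join(marks))


for mark in (PSILI, DASIA):
    print(describe(mark))
    print("  omega + mark ->", describe(compose(OMEGA, mark)))
```

So a layout that emits U0313 or U0314 would be sending characters that Unicode already knows how to combine with the Greek letters, which is one argument for that option.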

* Simos Xenitellis [2006-01-18 14:40]:
> There are clashes with the reusing of dead_acute, dead_ogonek and so on 
> in many different languages, causing trouble and conflicts when having a 
> single compose file for all languages. I did not see a compelling reason 
> against creating more symbol definitions. Are there any?

Well, I don't think there is a problem with reusing dead keys for many
languages. Can you think of an example where a dead key followed by a
letter key (or some other similar sequence) should produce different
results depending on the language?

X11 keysyms are supposed, I think, to correspond to keys that really
appear on keyboards. But in the case of polytonic Greek, for instance,
we never had computer keyboards with breathing signs, did we? So these
symbols were left out. You are right, a few more symbols for dead keys
would be useful. But I don't know who is responsible for defining new
ones - X.Org maybe? Perhaps a bug should be opened about this at
bugs.freedesktop.org...

-- 
Alexandros Diamantidis * [EMAIL PROTECTED]

--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/



Re: Experiments with classical Greek keyboard input

2006-01-21 Thread Jan Willem Stumpel

Alexandros Diamantidis wrote:

> [sorry for taking a few days to reply...]
>
> * Jan Willem Stumpel [2006-01-18 14:41]:
> > This does not work in my case. Also interchanging the entries (US
> > first, then GR) did not work. I mean you can get the accents, but
> > not the breathing signs. Strangely enough, even calling
> >
> > LANG=el_GR.UTF-8 xterm
> >
> > and then doing things in the new xterm, did not work! I don't
> > understand why. I have the el_GR.UTF-8 locale installed.
>
> I really wonder why... I thought if you had a ~/.XCompose file, your
> locale didn't matter (except if you specifically used it in that
> file, by doing 'include "%L"'). Maybe it's not used at all? You could
> try strace on some X program and see if it is opened.

I am going to investigate this further. Will reply when I get some results.

> [..]
> > So perhaps /etc/X11/xkb/symbols/pc/gr should really be changed to
> > include the UTF-8 'breathing' signs.
>
> Yes, but which keysyms should be used? U0313 and U0314, which
> correspond to U+0313 COMBINING COMMA ABOVE and U+0314 COMBINING
> REVERSED COMMA ABOVE? The current hack which uses dead_horn and
> dead_ogonek? Or some new keysyms?


I think it should be U0313 and U0314, because they are 'official': the
Unicode standard (http://www.unicode.org/charts/PDF/U0300.pdf) says that
U+0313 and U+0314 are used as Greek psili and Greek dasia, and the
common Compose file (/usr/X11R6/lib/X11/locale/en_US.UTF-8/Compose)
already has lots of compose sequences defined which use them, like

 : "ᾦ" U1FA6 # GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI

In fact there are 20 different compose sequences in the file for the ᾦ
character alone! Some of them involve 5 keystrokes, using ( and ) to
input the 'breathing' signs. I've no idea who put all these definitions in.
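
For anyone who wants to repeat the count, a rough Python sketch follows. It assumes the usual Compose rule layout (a run of <keysym> events, a colon, then the quoted result) and ignores everything else in the file, such as comments and include lines. The function name sequences_for and the SAMPLE rules are made up for illustration; the sample lines are written in Compose syntax but are not copied from the real file.

```python
import re

# One Compose rule: one or more <keysym> events, a colon, then the quoted result.
RULE = re.compile(r'^\s*((?:<[^>]+>\s*)+):\s*"((?:[^"\\]|\\.)*)"')
KEYSYM = re.compile(r"<([^>]+)>")


def sequences_for(lines, result):
    """Return every key sequence (a list of keysym names) that produces `result`."""
    found = []
    for line in lines:
        m = RULE.match(line)
        if m and m.group(2) == result:
            found.append(KEYSYM.findall(m.group(1)))
    return found


# Hypothetical rules, for illustration only.
SAMPLE = [
    "# a comment line",
    '<U0313> <dead_tilde> <dead_iota> <Greek_omega> : "ᾦ" U1FA6',
    '<Multi_key> <parenright> <asciitilde> <Greek_iota> <Greek_omega> : "ᾦ" U1FA6',
    '<U0313> <Greek_omega> : "ὠ" U1F60',
]

seqs = sequences_for(SAMPLE, "ᾦ")
print(len(seqs), "sequences; longest has", max(len(s) for s in seqs), "keystrokes")
```

Running sequences_for over the lines of the real Compose file (opened with encoding="utf-8") should give the number of sequences for any one character, and the longest list gives the maximum number of keystrokes.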

So polytonic Greek does not really need its own Compose file; everything
is already in the common file. Using the common file would mean that
polytonic Greek could be input from any (UTF-8) locale. It's just that
the /etc/X11/xkb/symbols/pc/gr file has to reflect this. The dead_horn
and dead_ogonek can then be left alone (for whatever really horn- and
ogonek-using languages want to do with them).

Regards, Jan







character cell fonts and combining characters?

2006-01-21 Thread Rich Felker
i'm facing a problem with the seeming lack of any way to encode
correct glyph-combining information in character-cell bdf bitmap fonts
for use in x. i've heard of people doing horrible hacks like just
making the combining characters zero-width and offset to the left of
the cell, so that they naturally overstrike, but this has several
major problems that cannot be solved:

- it only works for programs that render a whole string with x, not
  actual character-cell based programs that will draw and redraw
  characters in their respective cells, i.e. terminal emulators.

- it does not allow vertical stacking of combining marks, i.e. if
  there's more than one they'll just overstrike one another and become
  illegible.

- it does not account for the fact that the base character may have to
  change form to accommodate the combining characters (either due to
  conventions of the script itself or limitations of the font size,
  e.g. uppercase latin characters having to become shorter to make
  space for accents in small fonts).
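
to make the first point concrete: a character-cell program has to decide which characters share a cell before it draws anything. a minimal python sketch of that grouping step follows. it is only an assumption about how a terminal might do it (the name cell_clusters is made up), it treats unicode categories Mn and Me as "stays in the previous cell", and it ignores double-width characters entirely.

```python
import unicodedata


def is_zero_width_mark(ch):
    """True for nonspacing (Mn) and enclosing (Me) combining marks."""
    return unicodedata.category(ch) in ("Mn", "Me")


def cell_clusters(text):
    """Split text into (base, marks) pairs, one pair per character cell."""
    cells = []
    for ch in text:
        if is_zero_width_mark(ch) and cells:
            base, marks = cells[-1]
            cells[-1] = (base, marks + ch)  # mark joins the previous cell
        else:
            # A base character, or a stray mark with nothing before it.
            cells.append((ch, ""))
    return cells
```

each pair is then the unit that must be drawn and redrawn together, which is exactly what the zero-width overstrike hack cannot express.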

what i would like is a standardized system for (ab)using or extending
the x11 font metrics to store the data on how characters should
combine. unfortunately i don't yet understand the x font system well
enough; however i believe that the width/ascent/descent/etc
information, which are otherwise not very useful on combining
characters, could be used to store the offsets for rendering the
combined character in the correct position relative to the base
character or the previous combining character. this rendering could be
performed one-character-at-a-time by the application, and possibly
also by an extension to the x server.
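
a toy python sketch of the one-cell-at-a-time rendering i have in mind is below. everything in it is an assumption for illustration: glyphs are lists of strings with '#' for set pixels, the base sits at the bottom of the cell, and attach_row stands in for whatever offset the font metrics would carry. it is not the bdf format and not any existing x api. each further mark is stacked above the previous one by that mark's height, so marks do not overstrike.

```python
def overlay(cell, bitmap, top):
    """OR a bitmap into the cell with its first row at `top`; clip what falls outside."""
    for dy, row in enumerate(bitmap):
        y = top + dy
        if 0 <= y < len(cell):
            for x, ch in enumerate(row):
                if ch == "#" and x < len(cell[y]):
                    cell[y][x] = True


def render_cell(base, marks, height, attach_row):
    """Compose a base glyph and its combining marks into one cell bitmap.

    base       -- list of row strings, drawn flush with the bottom of the cell
    marks      -- list of mark bitmaps, innermost (closest to the base) first
    height     -- cell height in rows
    attach_row -- row index of the first row that marks must stay above
    """
    width = len(base[0])
    cell = [[False] * width for _ in range(height)]
    overlay(cell, base, height - len(base))
    top = attach_row
    for mark in marks:
        top -= len(mark)  # stack each mark above the previous one
        overlay(cell, mark, top)
    return ["".join("#" if px else "." for px in row) for row in cell]


# Made-up 4-pixel-wide glyphs for the example.
BASE = [".##.", "#..#", ".##."]
ACUTE = ["..#.", ".#.."]
DOT = [".#.."]

for row in render_cell(BASE, [ACUTE, DOT], height=7, attach_row=4):
    print(row)
```

marks that would land above the top of the cell are simply clipped here; a real font would need the shortened base glyph mentioned above instead.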

as for alternate glyphs for the base character when it's used in
combinations, i don't know what to do since x seems to have a shortage
of glyph numbers already.

the only alternative i know is for unicode terminals to ignore the
whole x font system and load their own pixmaps and data tables of
combining information. i don't think anyone particularly likes this
solution, but perhaps i'm mistaken.

other ideas...?

rich


