Re: Unicode Keyboard Input Linux

2004-06-15 Thread Elvis Presley
Hi All,

Thanks for your help. I'm still processing your input.

They just changed my Yahoo! mail; let's hope you get this.

Elvis

PS

User space text-mode only virtual terminals

On Mon, Jun 14, 2004 at 21:38:13 +0200, Pablo Saratxaga wrote:

 [...] It would be perfectly ok to provide only very minimalistic kernel
support (even simpler and lighter than the current one) and have a user space
'vc' loaded early in the boot process. 

Or none at all. Just move the VC mux out of the kernel and into user space:

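=============================================================
------+  :  +-------+                             :  +------+
...------:--|console|-----------------------------:--|      |
------+  :  +-------+          +---+              :  |VC mux|
         :          +----------|tcp|--------------:--|      |
         :          |          +---+              :  +------+
         :          |                             :
         :          |          +---+              :  +-----+
         :          | +--------|tcp|--------------:--|     |
         :          | |        +---+              :  |vterm|
         :          | | +-------------------------:--|     |
         :          | | |                         :  +-----+
         :          | | |     m+-----+s   +----+  :  +-----+
         :          | | +------|ptty0|----|tty0|--:--|shell|
         :          | |        +-----+    +----+  :  +-----+
         :          | |                           :
         :          | |        +---+              :  +-----+
         :          | | +------|tcp|--------------:--|     |
         :          | | |      +---+              :  |vterm|
         :          | | |   +---------------------:--|     |
         :          | | |   |                     :  +-----+
         :          | | |   | m+-----+s   +----+  :  +-----+
         :          | | |   +--|ptty1|----|tty1|--:--|login|
         :          | | |      +-----+    +----+  :  +-----+
         :          v v v                         :
         :         +------+    +---+              :  +-----+
         : ...-----|IP mux|----|tcp|--------------:--|     |
         :         +------+    +---+              :  |vterm|
         :                  +---------------------:--|     |
         :                  |                     :  +-----+
         :                  | m+-----+s   +----+  :  +-----+
         :                  +--|ptty2|----|tty2|--:--|getty|
         :                     +-----+    +----+  :  +-----+
=============================================================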

This is exactly the same situation which applies to xterms, only the VC mux
opens the console in character mode. It then forks a fixed number of 'vterms'
as child processes. Each vterm holds the character contents of its display as
well as the state of its keyboard.
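
The master/slave pairs in the diagram (ptty0/tty0 and so on) can be sketched in a few lines of Python. This is only an illustration, assuming a Unix-like system with pseudo-terminal support; the function name relay_once is invented for the example, and a real vterm would also keep its character buffer and keyboard state.

```python
import os


def relay_once(payload: bytes) -> bytes:
    """Write payload on the slave side of a pty and read it on the master side."""
    master_fd, slave_fd = os.openpty()       # the 'm' and 's' ends in the diagram
    try:
        os.write(slave_fd, payload)          # what a shell on ttyN would print
        return os.read(master_fd, 1024)      # what the vterm would receive
    finally:
        os.close(slave_fd)
        os.close(master_fd)


if __name__ == "__main__":
    print(relay_once(b"hello"))
```

The payload has no newline on purpose, because the terminal line discipline may rewrite newlines on output.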

Conclusion: vterms and xterms are redundant, so there is no good reason to run
them both at the same time. And xterms are more flexible.

Still, the keyboards are the same, so both could share the same, better(=X)
'keymaps' fsm.

512 (= 2**9) character glyphs in the vterm character buffer would be plenty for
my purposes: Latin (French, German, Spanish, Italian), Greek (mono- and
polytonic) and Cyrillic, but I'd have to be able to choose the Unicode
characters I want, and map them to glyphs in the console font.
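
A rough sketch of such a mapping, in Python. The code point ranges below are an assumed selection made up for the example, not a recommendation, and build_glyph_map is a hypothetical helper; it simply hands out glyph slots in order and refuses to go past 512.

```python
# Hypothetical selection of code point ranges (an assumption for illustration).
WANTED_RANGES = [
    (0x0020, 0x007E),  # printable ASCII
    (0x00A0, 0x00FF),  # Latin-1 supplement (French, German, Spanish, Italian)
    (0x0391, 0x03C9),  # rough cut of the main Greek letters
    (0x0410, 0x044F),  # basic Russian Cyrillic alphabet
]
GLYPH_SLOTS = 512  # 2**9


def build_glyph_map(ranges, slots=GLYPH_SLOTS):
    """Assign consecutive glyph slots to the chosen code points."""
    glyph_of = {}
    for first, last in ranges:
        for cp in range(first, last + 1):
            if len(glyph_of) >= slots:
                raise ValueError("console font is full")
            glyph_of[cp] = len(glyph_of)
    return glyph_of


glyph_of = build_glyph_map(WANTED_RANGES)
print(len(glyph_of), "slots used,", GLYPH_SLOTS - len(glyph_of), "free")
```

With this particular selection the Greek Extended block (U+1F00 to U+1FFF, 256 code positions) would not fit whole into the remaining slots, so the polytonic letters would need some pruning.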

(You couldn't pull the IP mux out as easily, relying on traditional Unix pipes
for IPC... that's another mailing list.)

  Unicode [...] is prejudiced against non-speakers.

 ??? [...]

I meant to say UTF-8. The irony is that UTF-8 also blew up the Latin-1
characters. Now everything (but English) is twice the size. (That's not quite
true: only the accented letters are.)
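
The size difference is easy to check. A small Python sketch; the sample characters and the helper name utf8_len are chosen only for illustration:

```python
def utf8_len(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = {
    "e": "ASCII letter",
    "\u00e9": "Latin-1 e with acute",
    "\u03c9": "Greek small omega",
    "\u1fa2": "polytonic omega with psili, varia and ypogegrammeni",
}

for ch, label in samples.items():
    print(f"U+{ord(ch):04X} {label}: {utf8_len(ch)} byte(s) in UTF-8")
```

Plain ASCII stays at one byte, Latin-1 accented letters and monotonic Greek take two, and the polytonic letters in the Greek Extended block take three.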

The Perseus Project does a nice job with Unicode. It has to, because there is
no national character set for polytonic Greek (well, there is, sort of: the
encoding schemes used in academia, but they are less well known, and the
Unicode font support is better).

  Why do Greek newspapers still use ISO 8859-7?

 For the same reason that a majority of English language web sites still use
windows-1252, I suppose.

I guess we'll have to ask them.

  it looks like these older character sets will be around for a long time.

 Yes, but not for that reason (to save space); they are around because there
is a lot of *OLD* data in those encodings [...]

http://www.dolnet.ta-nea.gr/ is still producing a lot of new material, and their
mix is text-oriented.

I thought it might be because they were using web authoring tools based on the
older, national character set.

Wide characters are easily compressed, by the file system, or the network. In
fact, there is a lot of network 

Re: Unicode Keyboard Input Linux

2004-06-15 Thread Elvis Presley
Windows Virtual Machines

Kalhmera kosme.

I'm at the library right now and our NT workstations do not have international
keyboard drivers installed.

So I have to write Greeklish.

Elvis

PS

 On Tue, 15 Jun 2004 at 16:44:48 +0200, Pablo Saratxaga wrote: 

  On Tue, Jun 15, 2004 at 05:55:18AM -0700, Elvis Presley wrote:

  Conclusion: vterms and xterms are redundant, so there is no good reason to
run them both at the same time. And xterms are more flexible.

 Yes, but there is a big difference: xterms need a running X server; vterms
don't.

Can you help me out? I don't have a Linux PC. Do vterms and xterms run together
on a real system, on your system?
 
  Still, the keyboards are the same, so both could share the same, better(=X)
'keymaps' fsm.

 The way the keyboards are handled is quite different (on X11 there is a high
level of hardware abstraction, while the Linux keyboard on the console
interacts directly with the kernel).

So, it looks like you get a console from the kernel whether you want one or
not. I'm thinking of those virtual terminal emulator processes. It's gotta be
possible to emulate an xterm in a vterm, then neanderthals like me can use
their stone tools.
 
  I meant to say utf-8. The irony is that utf-8 also blew up the Latin-1
characters. Now everything (but English) is twice the size. (That's not true,
only the accented vowels are.)

 And some are 3 bytes long, and some other are 4 bytes long,... But who cares?
What matters is the ability to type any letter used in any human written
language. That is a very huge improvement.

I agree. I'm on your side.

Why do Greek newspapers still use ISO 8859-7?
 
   For the same reason that a majority of English language web sites still
use windows-1252, I suppose.
 
  http://www.dolnet.ta-nea.gr/ is still producing alot of new material,

 [unknown address]

Sorry about that. The url is:

http://ta-nea.dolnet.gr/

They don't use the 'www' prefix as an alias, and I keep forgetting their parent
company name, 'dolnet'. (I asked them to register 'tanea.gr' but they haven't.)

The Communist Party newspaper is:

http://www.rizospastis.gr/

They have a much nicer name, and they also have a 'text-only' link which does
not download images, just the text. You can get the entire daily newspaper
through http. (Only I don't know if they are using unicode, but I assume not,
it's probably ISO 8859 too. You see? I've become skeptical.)

Is there a version of Linux which runs as a Microsoft Window (not
cygwin)?
 
   ?? What you say doesn't make sense. (you can on the other hand run an
 operating system inside of a virtual computer box inside another operating
 system)
 
  I should have asked, Is there a version of Microsoft Windows which will
run a copy of Linux?

 It doesn't make any more sense in the other way either :)

 Both MS-Windows and Linux are operating systems, you can run one, or the
other, not one inside the other; they are built in order to run at the very
bottom in direct interaction with the hardware.

 They can be run inside an emulated hardware box, but not as normal
programs.

  Microsoft describes Windows as a virtual-machine operating system, and
DOS does, indeed, run, as an operating system in a window.

 I never read of MS-Windows described as a virtual-machine... And what runs
in a window is in fact command.com, which is the equivalent (though much less
powerful) of /bin/bash

  I assume a VxD would map/share the PC hardware, controlled by Windows, to
the device drivers in the Linux kernel.

 No, the linux kernel needs direct access to the hardware. What you need is to
emulate an entire system, like vmware does.

I haven't been able to determine exactly what vmware does from their website,
too proprietary, too hush-hush, but I assume they write VxDs which map the
Linux kernel to the Windows VMM, and the real hardware. Someone once told me
their product ran on the NT platform, but not Windows 98, but it was quite
expensive. (All hearsay. No personal experience.)

The heart of the Windows operating system is called the VMM (= Virtual Machine
Manager). There are a lot of descriptions out there, like

http://win32assembly.online.fr/vxd-tut2.html

When the VMM is running an instance of DOS, you get direct access to the DOS
INT21 interface. Your program can even write directly into display memory, just
like the old days, when your program owned the console. The VMM manages to
control access to the real display, by remapping the real memory(=the virtual
memory address space) used by DOS, which still has that weird 20-bit memory
line.

Even 32-bit protected-mode programs designed to run under a DOS extension
called ???extenders??? --I forget the jargon-- still run in a DOS Window.
Microsoft has managed to recreate the entire DOS OS, not just command.com.

I think you could host Linux, if you had the right VxDs.

The VMM remaps the i486 ports used by the hosted OS's device drivers, so when
the Linux kernel writes to port addresses, the VMM traps them in a 

Re: Unicode Keyboard Input Linux

2004-06-15 Thread Danilo Segan
Today at 20:13, Elvis Presley wrote:

 I haven't been able to determine exactly what vmware does from their website,
 too proprietary, too hush-hush, but I assume they write VxDs which map the
 Linux kernel to the Windows VMM, and the real hardware. Someone once told me
 their product ran on the NT platform, but not Windows 98, but it was quite
 expensive. (All hearsay. No personal experience.)

Look at http://bochs.sf.net/, or at least do a better search of the
web.  This is not the list for such a discussion (whether Linux can
or cannot be emulated on Windows). 

 It's fascinating technology, but you'd need inside information to make it work.
 Google isn't enough. Given enough time, I'm sure these VxDs will appear out of
 nowhere, as freeware or shareware or whatever it's called.

Or you could go with Free Software[1] such as bochs running on a Free
platform, such as GNU/Linux (though I believe it runs even on some
proprietary platforms).  It does the complete emulation of Intel
architecture, and thus works even across incompatible architectures
(it also makes it slower, but you can't have it all). No VxD is
needed (and they're not so fascinating, it's just about dumping some
code in kernel-level, and using VM86 features of Intel CPUs).

[1] http://gnu.org/philosophy/free-sw.html

 I should have said, unicode text file support. Wordpad still does unicode,
 but only in Word format, not as a text file, so I can still edit a document in
 unicode, but I have to copy and paste it into a unicode editor to create a text
 file.

As far as I remember, Notepad on NT (New Technology ;) systems has
been doing Unicode for text files for as long as it has existed (or at
least since NT4, the first version I saw it on), if we count its so-so
UCS-2/UTF-16 support as Unicode support.

Cheers,
Danilo

--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/



Re: Unicode Keyboard Input Linux

2004-06-15 Thread Vasilis Vasaitis
On Mon, Jun 14, 2004 at 11:39:44PM +0200, Pablo Saratxaga wrote:
 Kaixo!
 
 On Sat, Jun 12, 2004 at 09:56:52AM -0700, Elvis Presley wrote:

..[snip]..

  This is about as complicated as it gets in polytonic
  Greek, three dead keys, two pre-position, one
  post-position, 'w' representing omega, and an 'i' for
  iota subscript. 
 
 No, dead keys cannot be post-position; they must always be typed
 *before* the key they modify; that is in fact the very definition
 of a dead_key: they modify the behaviour of what is typed after them.
 If it is typed after, it is not a dead key, but just a regular key.
 
 The ways already defined in the el_GR.UTF-8 X11 Compose file for U1fa2
 (ᾢ, omega with psili, varia and ypogegrammeni) are:
 
 Multi_key bar greater grave Greek_omega   :   U1fa2
 Multi_key bar grave greater Greek_omega   :   U1fa2
 Multi_key greater bar grave Greek_omega   :   U1fa2
 Multi_key greater grave bar Greek_omega   :   U1fa2
 Multi_key grave bar greater Greek_omega   :   U1fa2
 Multi_key grave greater bar Greek_omega   :   U1fa2
 dead_iota dead_horn dead_grave Greek_omega  :   U1fa2
 dead_iota dead_grave dead_horn Greek_omega  :   U1fa2
 dead_horn dead_iota dead_grave Greek_omega  :   U1fa2
 dead_horn dead_grave dead_iota Greek_omega  :   U1fa2
 dead_grave dead_iota dead_horn Greek_omega  :   U1fa2
 dead_grave dead_horn dead_iota Greek_omega  :   U1fa2
 
 6 ways to type it with dead keys (corresponding to the six
 possible orderings of the three dead keys; but the dead keys
 always come before the letter)
 and 6 ways to type it with Multi_key (you press Multi_key, then
 the following keys in the given order).

  Note that, even Multi_key combinations always have the letter last,
so that, when a letter arrives, it is certain that the sequence is
complete. See my comments below.

 What you would like would be in fact:
 
 dead_horn dead_grave Greek_omega U0345 :   U1fa2
 dead_grave dead_horn Greek_omega U0345 :   U1fa2
 
 (that is, two dead keys, followed by two normal keys: a key sending
 Greek_omega and a key sending U0345 (COMBINING GREEK YPOGEGRAMMENI))
 
 I haven't tested it but if it works, it could indeed be added for
 all the cases and a layout with U0345 instead of dead_iota, if
 that is more intuitive to type.
 
  The keyboard map is therefore more than a map, it is a
  fsm, a stateful-map.
 
 That is not supported at all.
 If you need that, you actually need to develop an input method
 (like Japanese or Vietnamese use), that is, a program that interprets
 what you type and produces a different input.
 
 Yes, there is something of that in the console (but very limited) and
 in X11 (more powerful), but it is always linear.
 
 (also, I'm not sure if it is possible to have, for example,
 dead_horn dead_grave Greek_omega U0345 and
 dead_horn dead_grave Greek_omega sequences (that is, sequences
 where one is a prefix of another))

  You can't. The problem with that is that, if you wanted to type the
second sequence, the composition engine wouldn't know whether to stop
there and emit the symbol, or to wait for another symbol to complete
the sequence. So it waits. This could probably be fixed (partly): when
a symbol comes that causes the sequence to become invalid, the engine
could check the compose sequence just before the arrival of that
symbol, and emit the result. But this is not the current behaviour.
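
The waiting behaviour can be shown with a toy model in Python. This is a sketch only, not how the X11 composition engine is actually implemented; the class ComposeEngine is invented for the example, and mapping the three-key sequence to U+1F62 (omega with psili and varia) is an assumption made to get two sequences where one is a prefix of the other.

```python
class ComposeEngine:
    """Toy compose table: key sequences map to an output string."""

    def __init__(self, table):
        self.table = {tuple(seq): out for seq, out in table.items()}
        # Every proper prefix of a sequence is a state in which the engine waits.
        self.prefixes = {seq[:i] for seq in self.table for i in range(1, len(seq))}
        self.pending = ()

    def feed(self, key):
        """Return the composed output, or None if the engine is still waiting
        or the sequence turned out to be invalid and was dropped."""
        self.pending += (key,)
        if self.pending in self.prefixes:
            return None                       # could still grow: keep waiting
        out = self.table.get(self.pending)    # None when the sequence is invalid
        self.pending = ()
        return out


SHORT = ("dead_horn", "dead_grave", "Greek_omega")
LONG = SHORT + ("U0345",)

both = ComposeEngine({SHORT: "\u1f62", LONG: "\u1fa2"})
print([both.feed(k) for k in SHORT])   # the short sequence never completes
print(both.feed("U0345"))              # only the long one is emitted
```

With both sequences in the table, typing the short one produces nothing, because the engine cannot tell whether U0345 is still to come. An invalid key also returns None and discards what was typed, which matches the "you simply lost what you typed" behaviour described in this thread.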

  If I change keyboards in
  midstream (using alt-a, for example), the fsm would
  output the components of an unaccepted character
  individually. How far will keymaps go?
 
 You can't.
 Pressing Alt-A (or any other key) means you broke the sequence.
 In such a case you simply lose what you typed in the incomplete sequence.

  Indeed.


-- 
Vasilis Vasaitis
A man is well or woe as he thinks himself so.






Re: Unicode Keyboard Input Linux

2004-06-15 Thread Vasilis Vasaitis
On Mon, Jun 14, 2004 at 08:43:38AM -0700, Elvis Presley wrote:

..[snip]..

 Comparing characters would be easy, they compare as
 unsigned integers, but sorting them would be a
 problem, because you'd want to group all the
 (accented) vowels together, according to language
 specific rules. In Greek, this wouldn't be a problem,
 because monotonic vowels and polytonic vowels, though
 occupying different code ranges, are not mixed in the
 same word: they are essentially different languages. A
 'tonos' is not a 'oxia' or a 'varia'.

  Actually, tonos and oxia are treated as equivalent in Unicode.
Nevertheless, sorting wouldn't be a problem indeed, because it is done
according to the base letter only; accentuation is irrelevant.
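
Both points can be checked with Python's unicodedata module. The sketch below is a simplification: base_letters is a hypothetical helper that just strips combining marks, whereas a full collation algorithm would typically still use the accents as a tie-breaker.

```python
import unicodedata

# U+1F71 (alpha with oxia) normalizes to U+03AC (alpha with tonos).
oxia_alpha = "\u1f71"
tonos_alpha = "\u03ac"
print(unicodedata.normalize("NFC", oxia_alpha) == tonos_alpha)


def base_letters(word: str) -> str:
    """Strip combining marks so that only the base letters remain."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


# "alfa" with an accented first letter, and "avgo" with an accented last letter
words = ["\u03ac\u03bb\u03c6\u03b1", "\u03b1\u03b2\u03b3\u03cc"]
by_code_point = sorted(words)
by_base_letter = sorted(words, key=base_letters)
print(by_code_point)
print(by_base_letter)
```

Sorting by raw code point puts the accented alpha ahead of the plain one, so the two orders differ; sorting on the stripped key compares beta with lambda instead.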

 Why do Greek newspapers still use ISO 8859-7?

  If it ain't broke, don't fix it.

 nightmare), but if you're only working in Greek, why
 not stick with what you know?

  Exactly. Nothing to do with size issues, and everything to do with
that. Plus, a major operating system doesn't really support UTF-8, and
instead concentrates on UTF-16, which is unusable in UNIX/GNU systems
for most practical purposes.
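
One commonly cited reason, offered here as an assumption about what is meant rather than something stated above, is that UTF-16 puts zero bytes into ordinary ASCII text, which breaks NUL-terminated strings and byte-oriented tools. A short Python check:

```python
text = "ls -l"
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")   # little-endian, no byte order mark

print(utf8)
print(utf16)
print("NUL in UTF-8: ", b"\x00" in utf8)
print("NUL in UTF-16:", b"\x00" in utf16)
```

UTF-8 leaves ASCII bytes untouched, so existing file names, pipes and C string functions keep working.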

 My Microsoft browser(=IE) has problems with ISO Greek
 and Windows Greek, especially capital Alpha with
 tonos: it gets confused, and displays a box.

  Well actually, this particular letter is the only incompatibility
between the two character sets. In ISO-8859-7, this letter occupies
the code point that MS Word once had hardcoded as representing the
paragraph symbol. So for Windows-1253, Microsoft put the paragraph
symbol there and moved capital Alpha with tonos elsewhere.
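
The difference at that code point can be seen with Python's built-in codecs (iso8859_7 and cp1253 are the standard codec names; the variable names are just for the example):

```python
byte = b"\xb6"

as_iso = byte.decode("iso8859_7")     # ISO 8859-7 reading of the byte
as_win = byte.decode("cp1253")        # Windows-1253 reading of the same byte
alpha_tonos_win = "\u0386".encode("cp1253")   # where Windows-1253 keeps the letter

print(f"ISO 8859-7   0xB6 -> U+{ord(as_iso):04X}")
print(f"Windows-1253 0xB6 -> U+{ord(as_win):04X}")
print(f"Windows-1253 encodes U+0386 as 0x{alpha_tonos_win[0]:02X}")
```

A page labelled with one charset but encoded in the other will therefore show the wrong character, or a box, exactly at capital Alpha with tonos.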



-- 
Vasilis Vasaitis
A man is well or woe as he thinks himself so.


