Re: Unicode Keyboard Input Linux
Hi All,

Thanks for your help. I'm still processing your input. They just changed my Yahoo! mail; let's hope you get this.

Elvis

PS User space text-mode only virtual terminals

On Mon, Jun 14, 2004 at 21:38:13 +0200, Pablo Saratxaga wrote:

[...] It would be perfectly ok to provide only very minimalistic kernel support (even simpler and lighter than the current one) and have a user space 'vc' loaded early in the boot process.

Or none at all. Just move the VC mux out of the kernel and into user space:

[Diagram garbled in the archive. Its surviving labels: console, VC mux, IP mux, tcp, vterm, and the pseudo-terminal pairs ptty0/tty0 (shell), ptty1/tty1 (login) and ptty2/tty2 (getty).]

This is exactly the same situation which applies to xterms, only the VC mux opens the console in character mode. It then forks a fixed number of 'vterms' as child processes. Each vterm holds the character contents of its display as well as the state of its keyboard.

Conclusion: vterms and xterms are redundant, so there is no good reason to run them both at the same time. And xterms are more flexible. Still, the keyboards are the same, so both could share the same, better(=X) 'keymaps' fsm.

512(=2**9) character glyphs in the vterm character buffer would be plenty for my purposes: Latin (French, German, Spanish, Italian), Greek (mono- and polytonic) and Cyrillic, but I'd have to be able to choose the unicode characters I want, and map them to glyphs in the console-font.

(You couldn't pull the IP mux out as easily, relying on traditional Unix pipes for IPC... that's another mailing list.)

Unicode [...] is prejudiced against non-speakers.

??? [...]

I meant to say utf-8. The irony is that utf-8 also blew up the Latin-1 characters. Now everything (but English) is twice the size. (That's not true, only the accented vowels are.) The Perseus Project does a nice job with unicode; it has to, because there is no national character set for poly Greek (well there is, sort of, the encoding schemes used in academia, but they are less well known, and the unicode font support is better).

Why do Greek newspapers still use ISO 8859-7?

For the same reason that a majority of English language web sites still use windows-1252, I suppose.

I guess we'll have to ask them.

It looks like these older character sets will be around for a long time.

Yes, but not for that reason (to save space); they are around because there is a lot of *OLD* data in those encodings [...]

http://www.dolnet.ta-nea.gr/ is still producing a lot of new material, and their mix is text-oriented. I thought it might be because they were using web authoring tools based on the older, national character set. Wide characters are easily compressed, by the file system, or the network. In fact, there is a lot of network
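The size question raised in the message above (which characters actually grow when text moves from an 8-bit national character set to UTF-8) can be checked directly. The following is a minimal sketch using only Python's standard codecs; the helper name `encoded_sizes` and the sample words are illustrative assumptions, not something taken from the thread.

```python
def encoded_sizes(text: str, legacy_codec: str) -> tuple:
    """Return (bytes in the legacy 8-bit codec, bytes in UTF-8) for text."""
    return len(text.encode(legacy_codec)), len(text.encode("utf-8"))


# Sample words are assumptions chosen for illustration.
samples = [
    # French 'cafe' with e-acute: only the accented vowel grows.
    ("caf\u00e9", "latin-1"),
    # Greek 'kosme': every letter lies outside ASCII, so every letter grows.
    ("\u03ba\u03cc\u03c3\u03bc\u03b5", "iso8859_7"),
]

for text, codec in samples:
    legacy, utf8 = encoded_sizes(text, codec)
    print(f"{text!r}: {legacy} bytes in {codec}, {utf8} bytes in UTF-8")

# A polytonic letter from the Greek Extended block (U+1FA2) has no
# ISO 8859-7 encoding at all and takes three bytes in UTF-8.
polytonic_omega_bytes = len("\u1fa2".encode("utf-8"))
print(f"U+1FA2: {polytonic_omega_bytes} bytes in UTF-8")
```

Under these assumptions, Latin-1 text grows only by its accented letters, while Greek text roughly doubles, which matches the correction made in the message itself.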
Re: Unicode Keyboard Input Linux
Windows Virtual Machines

Kalhmera kosme. I'm at the library right now and our NT workstations do not have international keyboard drivers installed. So I have to write Greeklish.

Elvis

PS

On Tue, 15 Jun 2004 at 16:44:48 +0200, Pablo Saratxaga wrote:

On Tue, Jun 15, 2004 at 05:55:18AM -0700, Elvis Presley wrote:

Conclusion: vterms and xterms are redundant, so there is no good reason to run them both at the same time. And xterms are more flexible.

Yes, but there is a big difference: xterms need a running X terminal; vterms don't.

Can you help me out? I don't have a Linux PC. Do vterms and xterms run together on a real system, on your system?

Still, the keyboards are the same, so both could share the same, better(=X) 'keymaps' fsm.

The way the keyboards are handled is quite different (on X11 there is a high hardware abstraction, while the linux keyboard on console interacts directly with the kernel).

So, it looks like you get a console from the kernel whether you want one or not. I'm thinking of those virtual terminal emulator processes. It's gotta be possible to emulate an xterm in a vterm; then neanderthals like me can use their stone tools.

I meant to say utf-8. The irony is that utf-8 also blew up the Latin-1 characters. Now everything (but English) is twice the size. (That's not true, only the accented vowels are.)

And some are 3 bytes long, and some others are 4 bytes long,... But who cares? What matters is the ability to type any letter used in any human written language. That is a huge improvement.

I agree. I'm on your side.

Why do Greek newspapers still use ISO 8859-7?

For the same reason that a majority of English language web sites still use windows-1252, I suppose.

http://www.dolnet.ta-nea.gr/ is still producing a lot of new material,

[unknown address]

Sorry about that. The url is: http://ta-nea.dolnet.gr/ They don't use the 'www' prefix as an alias, and I keep forgetting their parent company name, 'dolnet'.
(I asked them to register 'tanea.gr' but they haven't.) The Communist Party newspaper is: http://www.rizospastis.gr/ They have a much nicer name, and they also have a 'text-only' link which does not download images, just the text. You can get the entire daily newspaper through http. (Only I don't know if they are using unicode, but I assume not, it's probably ISO 8859 too. You see? I've become skeptical.)

Is there a version of Linux which runs as a Microsoft Window (not cygwin)?

?? What you say doesn't make sense. (You can on the other hand run an operating system inside of a virtual computer box inside another operating system.)

I should have asked, Is there a version of Microsoft Windows which will run a copy of Linux?

It doesn't make any more sense the other way either :) Both MS-Windows and Linux are operating systems; you can run one or the other, not one inside the other; they are built to run at the very bottom, in direct interaction with the hardware. They can be run inside an emulated hardware box, but not as normal programs.

Microsoft describes Windows as a virtual-machine operating system, and DOS does, indeed, run as an operating system in a window.

I never read of MS-Windows described as a virtual-machine... And what runs in a window is in fact command.com, which is the (much less powerful) equivalent of /bin/bash.

I assume a VxD would map/share the PC hardware, controlled by Windows, to the device drivers in the Linux kernel.

No, the linux kernel needs direct access to the hardware. What you need is to emulate an entire system, like vmware does.

I haven't been able to determine exactly what vmware does from their website, too proprietary, too hush-hush, but I assume they write VxDs which map the Linux kernel to the Windows VMM, and the real hardware. Someone once told me their product ran on the NT platform, but not Windows 98, but it was quite expensive. (All hearsay. No personal experience.)
The heart of the Windows operating system is called the VMM(=Virtual Machine Manager). There are a lot of descriptions out there, like http://win32assembly.online.fr/vxd-tut2.html

When the VMM is running an instance of DOS, you get direct access to the DOS INT21 interface. Your program can even write directly into display memory, just like the old days, when your program owned the console. The VMM manages to control access to the real display by remapping the real memory(=the virtual memory address space) used by DOS, which still has that weird 20-bit memory line. Even 32-bit protected-mode programs designed to run under a DOS extension called ???extenders??? --I forget the jargon-- still run in a DOS Window. Microsoft has managed to recreate the entire DOS OS, not just command.com.

I think you could host Linux, if you had the right VxDs. The VMM remaps the i486 ports used by the hosted OS's device drivers, so when the Linux kernel writes to port addresses, the VMM traps them in a
Re: Unicode Keyboard Input Linux
Today at 20:13, Elvis Presley wrote:

I haven't been able to determine exactly what vmware does from their website, too proprietary, too hush-hush, but I assume they write VxDs which map the Linux kernel to the Windows VMM, and the real hardware. Someone once told me their product ran on the NT platform, but not Windows 98, but it was quite expensive. (All hearsay. No personal experience.)

Look at http://bochs.sf.net/, or at least do a better search of the web. This is not the list for such a discussion (whether Linux can or cannot be emulated on Windows).

It's fascinating technology, but you'd need inside information to make it work. Google isn't enough. Given enough time, I'm sure these VxDs will appear out of nowhere, as freeware or shareware or whatever it's called.

Or you could go with Free Software[1] such as bochs running on a Free platform, such as GNU/Linux (though I believe it runs even on some proprietary platforms). It does a complete emulation of the Intel architecture, and thus works even across incompatible architectures (that also makes it slower, but you can't have it all). No VxD is needed (and they're not so fascinating; it's just about dumping some code at kernel level, and using the VM86 features of Intel CPUs).

[1] http://gnu.org/philosophy/free-sw.html

I should have said, unicode text file support. Wordpad still does unicode, but only in Word format, not as a text file, so I can still edit a document in unicode, but I have to copy and paste it into a unicode editor to create a text file.

As far as I remember, Notepad on NT (New Technology ;) systems has been doing Unicode for text files for as long as it has existed (or at least since NT4, that's the first I saw it on), if we consider so-and-so UCS-2/UTF-16 support as Unicode support.

Cheers,
Danilo

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
Re: Unicode Keyboard Input Linux
On Mon, Jun 14, 2004 at 11:39:44PM +0200, Pablo Saratxaga wrote:

Kaixo!

On Sat, Jun 12, 2004 at 09:56:52AM -0700, Elvis Presley wrote:

..[snip]..

This is about as complicated as it gets in polytonic Greek, three dead keys, two pre-position, one post-position, 'w' representing omega, and an 'i' for iota subscript.

No, dead keys cannot be post-position; they must always be typed *before* the key they modify; that is in fact the very definition of a dead_key: they modify the behaviour of what is typed after them. If it is typed after, it is not a dead key, but just a regular key.

The ways already defined in the el_GR.UTF-8 X11 Compose file for U1fa2 (ᾢ, omega with psili, varia and ypogegrammeni) are:

Multi_key bar greater grave Greek_omega : U1fa2
Multi_key bar grave greater Greek_omega : U1fa2
Multi_key greater bar grave Greek_omega : U1fa2
Multi_key greater grave bar Greek_omega : U1fa2
Multi_key grave bar greater Greek_omega : U1fa2
Multi_key grave greater bar Greek_omega : U1fa2
dead_iota dead_horn dead_grave Greek_omega : U1fa2
dead_iota dead_grave dead_horn Greek_omega : U1fa2
dead_horn dead_iota dead_grave Greek_omega : U1fa2
dead_horn dead_grave dead_iota Greek_omega : U1fa2
dead_grave dead_iota dead_horn Greek_omega : U1fa2
dead_grave dead_horn dead_iota Greek_omega : U1fa2

6 ways to type it with dead keys (corresponding to the six possible orderings of the three dead keys, with the dead keys always before the letter) and 6 ways to type it with Multi_key (you press Multi_key, then the following keys in the given order).

Note that even Multi_key combinations always have the letter last, so that, when a letter arrives, it is certain that the sequence is complete. See my comments below.
What you would like would be in fact:

dead_horn dead_grave Greek_omega U0345 : U1fa2
dead_grave dead_horn Greek_omega U0345 : U1fa2

(that is, two dead keys, followed by two normal keys: a key sending Greek_omega and a key sending U0345 (COMBINING GREEK YPOGEGRAMMENI)). I haven't tested it, but if it works, it could indeed be added for all the cases, along with a layout with U0345 instead of dead_iota, if that is more intuitive to type.

The keyboard map is therefore more than a map, it is a fsm, a stateful-map.

That is not supported at all. If you need that, you actually need to develop an input method (like Japanese or Vietnamese use), that is, a program that interprets what you type and produces a different input. Yes, there is something of that in console (but very limited) and in X11 (more powerful), but it is always linear. (Also, I'm not sure if it is possible to have, for example, dead_horn dead_grave Greek_omega U0345 and dead_horn dead_grave Greek_omega sequences (that is, sequences where one is a subset of another).)

You can't. The problem with that is that, if you wanted to type the second sequence, the composition engine wouldn't know whether to stop there and emit the symbol, or to wait for another symbol to complete the sequence. So it waits. This could probably be fixed (partly): when a symbol comes that causes the sequence to become invalid, the engine could check the compose sequence just before the arrival of that symbol, and emit the result. But this is not the current behaviour.

If I change keyboards in midstream (using alt-a, for example), the fsm would output the components of an unaccepted character individually. How far will keymaps go?

You can't. Pressing Alt-A (or any other key) means you broke the sequence. In such a case you simply lose what you typed in the incomplete sequence.

Indeed.

--
Vasilis Vasaitis
A man is well or woe as he thinks himself so.
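The waiting behaviour described in the message above, where one compose sequence is a prefix of another, can be illustrated with a toy matcher. This is a minimal sketch under stated assumptions: the class `ComposeEngine` is hypothetical and is not how Xlib or the console implement composition; only the keysym names and the U+1FA2 result come from the thread, and the U+1F62 result for the shorter sequence is an assumption based on the Unicode code charts.

```python
from typing import Dict, List, Tuple

Sequence = Tuple[str, ...]


class ComposeEngine:
    """Toy compose matcher (hypothetical, for illustration only)."""

    def __init__(self, sequences: Dict[Sequence, str]) -> None:
        self.sequences = sequences
        # Every proper prefix of a defined sequence means "keep waiting".
        self.prefixes = {seq[:i] for seq in sequences for i in range(1, len(seq))}
        self.pending: List[str] = []

    def feed(self, keysym: str) -> str:
        """Return the text emitted for this key ('' if nothing is emitted)."""
        start = (keysym,)
        if not self.pending and start not in self.prefixes and start not in self.sequences:
            return keysym  # ordinary key outside any sequence passes through
        self.pending.append(keysym)
        current = tuple(self.pending)
        if current in self.prefixes:
            return ""  # a longer sequence could still follow, so wait
        self.pending.clear()
        # Either a complete sequence, or a broken one whose keys are lost.
        return self.sequences.get(current, "")


short = ("dead_horn", "dead_grave", "Greek_omega")
long_ = short + ("U0345",)

# With only the short sequence defined, the letter completes it at once.
only_short = ComposeEngine({short: "\u1f62"})
out_short = [only_short.feed(k) for k in short]

# With both defined, the engine cannot emit after Greek_omega: it waits.
both = ComposeEngine({short: "\u1f62", long_: "\u1fa2"})
out_long = [both.feed(k) for k in long_]

# Breaking the pending sequence with an unrelated key loses what was typed.
broken = ComposeEngine({short: "\u1f62", long_: "\u1fa2"})
out_broken = [broken.feed(k) for k in short + ("a",)]
```

In this sketch the prefix check deliberately takes priority over a complete match, which is what makes the shorter sequence unreachable once the longer one exists. The partial fix suggested in the message would amount to remembering the last complete match and emitting it when the next key breaks the sequence.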
Re: Unicode Keyboard Input Linux
On Mon, Jun 14, 2004 at 08:43:38AM -0700, Elvis Presley wrote:

..[snip]..

Comparing characters would be easy, they compare as unsigned integers, but sorting them would be a problem, because you'd want to group all the (accented) vowels together, according to language specific rules. In Greek, this wouldn't be a problem, because monotonic vowels and polytonic vowels, though occupying different code ranges, are not mixed in the same word: they are essentially different languages. A 'tonos' is not an 'oxia' or a 'varia'.

Actually, tonos and oxia are treated as equivalents in Unicode. Nevertheless, sorting wouldn't be a problem indeed, because it is done according to the base letter only; the accents are irrelevant.

Why do Greek newspapers still use ISO 8859-7? If it ain't broke, don't fix it. [...] nightmare), but if you're only working in Greek, why not stick with what you know?

Exactly. Nothing to do with size issues, and everything to do with that. Plus, a major operating system doesn't really support UTF-8, and instead concentrates on UTF-16, which is unusable in UNIX/GNU systems for most practical purposes.

My Microsoft browser(=IE) has problems with ISO Greek and Windows Greek, especially capital Alpha with tonos: it gets confused, and displays a box.

Well actually, this particular letter is the only incompatibility between the two character sets. In ISO-8859-7, this letter occupies the code point that MS Word once had hardcoded as representing the paragraph symbol. So for Windows-1253, Microsoft put the paragraph symbol there and moved capital Alpha with tonos elsewhere.

--
Vasilis Vasaitis
A man is well or woe as he thinks himself so.
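Three claims in the message above can be checked with Python's standard library: that oxia and tonos are canonically equivalent, that sorting can be done on base letters alone, and that capital Alpha with tonos sits at different bytes in ISO-8859-7 and Windows-1253. The following is a rough sketch; the helper `base_letters` and the sample words are illustrative assumptions, and stripping combining marks is a simplification, not a full language-aware collation.

```python
import unicodedata


def base_letters(word: str) -> str:
    """Sort key: decompose, then drop accents, breathings and iota subscript."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Alpha with oxia (U+1F71) normalizes to alpha with tonos (U+03AC).
oxia_as_tonos = unicodedata.normalize("NFC", "\u1f71")

# Sample words (assumed for illustration): 'ora', 'aeras', 'etos',
# each with a monotonic accent.
words = ["\u03ce\u03c1\u03b1", "\u03b1\u03ad\u03c1\u03b1\u03c2", "\u03ad\u03c4\u03bf\u03c2"]
by_code_point = sorted(words)                # accented vowels sort out of place
by_base_letter = sorted(words, key=base_letters)

# Capital Alpha with tonos (U+0386) in the two 8-bit Greek character sets.
alpha_tonos = "\u0386"
iso_byte = alpha_tonos.encode("iso8859_7")   # byte used by ISO-8859-7
win_byte = alpha_tonos.encode("cp1253")      # byte used by Windows-1253
same_byte_in_windows = iso_byte.decode("cp1253")  # what Windows-1253 puts there

print(by_code_point, by_base_letter)
print(iso_byte, win_byte, repr(same_byte_in_windows))
```

Reading an ISO-8859-7 page as Windows-1253 (or the reverse) would therefore garble exactly this letter, which is consistent with the browser symptom reported in the message.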