Re: mbscmp
On Mon, Feb 25, 2002 at 02:38:13PM -0500, Michael B Allen wrote:
> What's the ultimate goal here? Are any of these functions *supposed*
> to work on multi-byte characters, or will there be mbs* functions?

I haven't tested this, nor really done anything relating to programming with i18n, but based on looking at man pages, you can use one of three functions (mbstowcs, mbsrtowcs, or mbsnrtowcs) to convert your multibyte string to a wide character string (an array of type wchar_t, one wchar_t per *character*), and then use the many wcs* functions to do various tests.

My recollection of the consensus on this list is that for internal purposes, wchar_t is the way to go, and conversion to multibyte strings of char is necessary only for I/O, and there only when you can't use functions like fwprintf. However, wchar_t is only guaranteed to be Unicode (which encoding?) when the macro __STDC_ISO_10646__ is defined, as is done with glibc 2.2.

Please, i18n/utf8 gurus, correct the last few sentences! :-)

- Jimmy Kaplowitz
[EMAIL PROTECTED]

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
Re: Two questions about console utf8 support
On Sat, Dec 15, 2001 at 02:39:46AM +0330, Behdad Esfahbod wrote:
>
> Also if you are thinking out of distribution, you can simply put your
> tty independent material in /etc/rc.d/rc.local

Again you're thinking about Red Hat; Debian has no /etc/rc.d directory (it just uses /etc/init.d and /etc/rc?.d directly), and it doesn't have the equivalent of an rc.local script. The thing to do would be to add a script in /etc/init.d and then run something along the lines of

    update-rc.d scriptname defaults 99 01

to activate it.

- Jimmy Kaplowitz
[EMAIL PROTECTED] (also, @debian.org :)
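The two steps described above (copy a script into /etc/init.d, then register it with update-rc.d) might be wrapped up roughly as follows. This is only a sketch: the function name install_init_script and the INITDIR and UPDATE_RC variables are hypothetical, introduced here so the steps can be dry-run without root; only update-rc.d and its "defaults 99 01" arguments come from the message.

```shell
#!/bin/sh
# Sketch: install a boot script the Debian way.
# INITDIR and UPDATE_RC are hypothetical overrides for dry runs;
# left unset, the real /etc/init.d and update-rc.d are used.

install_init_script() {
    script=$1
    name=$(basename "$script")
    initdir=${INITDIR:-/etc/init.d}

    # Copy the script into place and make it executable.
    cp "$script" "$initdir/$name" &&
    chmod 755 "$initdir/$name" &&
    # Register it: start sequence 99, stop sequence 01, default runlevels.
    ${UPDATE_RC:-update-rc.d} "$name" defaults 99 01
}

# Usage (as root):
#   install_init_script ./scriptname
```

Setting UPDATE_RC=echo and pointing INITDIR at a scratch directory shows what would be run without touching the system.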
Re: Two questions about console utf8 support
On Fri, Dec 14, 2001 at 06:18:34PM +, [EMAIL PROTECTED] wrote:
> > > Maybe putting a ESC%G sequence in your /etc/issue (?).
>
> > This does almost everything necessary; however, it does not do the
> > equivalent of 'kbd_mode -u'
> > I could write a wrapper
>
> You know about the existence of unicode_start ?

Yes, but I didn't want to mention it in case it was Debian-specific. Anyway, I still would have to write a wrapper around that, because I can't invoke that directly from the boot process - it doesn't accept the "start" / "stop" arguments that are used by that process. Maybe something like:

-- begin file /etc/init.d/unicode --
#!/bin/sh
# Map the init system's start/stop arguments onto unicode_start/unicode_stop.
case "$1" in
  start) unicode_start ;;
  stop) unicode_stop ;;
  restart|reload|force-reload) unicode_stop; unicode_start ;;
  *) echo "Usage: $0 [start|stop|restart|reload|force-reload]" 1>&2; exit 1 ;;
esac
# exit rather than return: this file is executed, not sourced.
exit 0
-- end file /etc/init.d/unicode --

How does that look to you? The only question is, would this affect all the virtual consoles (if run from rcS.d) or only the boot console (i.e., tty1)? If the latter, how would it be best to adapt this to affect all of the consoles? (I think this is different depending on whether you use the framebuffer or vga text mode - I use the framebuffer.)

- Jimmy Kaplowitz
[EMAIL PROTECTED]
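On the open question of reaching every virtual console rather than only the boot one, one possible approach is to address each console device explicitly. The sketch below is untested on real consoles and rests on two assumptions: that writing the ESC%G sequence to a console's device node switches that console's output to UTF-8, and that the installed kbd_mode accepts "-C <device>" to pick the console (current kbd releases do; older ones may not). The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: put each listed virtual console into UTF-8 mode.
# Assumption: kbd_mode supports "-C <device>".

utf8_console() {
    dev=$1
    # ESC % G selects UTF-8 for output on that console.
    printf '\033%%G' > "$dev" || return 1
    # Put that console's keyboard into Unicode mode.
    kbd_mode -u -C "$dev"
}

utf8_all_consoles() {
    for dev in "$@"; do
        utf8_console "$dev" || return 1
    done
}

# Usage (as root), e.g. from the start) branch of an init script:
#   utf8_all_consoles /dev/tty1 /dev/tty2 /dev/tty3 /dev/tty4 /dev/tty5 /dev/tty6
```

Passing the device list as arguments keeps the number of consoles out of the script, so it can follow whatever /etc/inittab starts gettys on.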
Re: Two questions about console utf8 support
On Fri, Dec 14, 2001 at 06:23:16PM +0330, Behdad Esfahbod wrote:
> > - What is the cleanest possible way to get the console (all of the
> > virtual consoles, not just the bootup one) into UTF-8 mode as early as
> > possible? If it matters, I'm using the Rage 128 framebuffer driver on
> > x86 Debian sid.
>
> Maybe putting a ESC%G sequence in your /etc/issue (?).

This does almost everything necessary; however, it does not do the equivalent of 'kbd_mode -u', which sets the keyboard to UTF-8 mode. I could write a wrapper around getty that I call from /etc/inittab, but that seems ugly... I may end up doing that though.

It might also be nice to get Unicode going from early in the bootup process. Should I put something in the rcS.d directory that sets everything up unicodely (kbd_mode -u plus ESC%G)? I wonder if that would work on all my consoles, or if I'd have to wrap getty as above. I could also just put something in /etc/profile, which would satisfy me since I don't use tcsh. I guess I was just asking what the best way was. Thank you for your suggestion, though; it's a good one.

> > - I remember hearing about a tool that takes a bdf or some other full
> > font format and can extract any arbitrary set of 512
> > characters from it and put that set into a console font. Can any of
> > you point me to it? It would be nice to make various Unicode console
> > fonts tailored to my tastes.
>
> It comes with Unicode VGA font, by Dmitry Bolkhovityanov <[EMAIL PROTECTED]>
> available at http://www.inp.nsk.su/~bolkhov/files/fonts/univga/

Thanks for the pointer. I'll investigate it when I get a chance.

- Jimmy Kaplowitz
[EMAIL PROTECTED]
Two questions about console utf8 support
Hi all. I'm wondering two things about using UTF-8 on the console, where I spend a significant portion, though by no means the majority, of my computer time:

- What is the cleanest possible way to get the console (all of the virtual consoles, not just the bootup one) into UTF-8 mode as early as possible? If it matters, I'm using the Rage 128 framebuffer driver on x86 Debian sid.

- I remember hearing about a tool that takes a bdf or some other full font format and can extract any arbitrary set of 512 characters from it and put that set into a console font. Can any of you point me to it? It would be nice to make various Unicode console fonts tailored to my tastes.

Have a good day, and keep up the high-quality i18n hackery.

- Jimmy Kaplowitz
[EMAIL PROTECTED]
Re: UTF8 Terminal Detection
On Tue, Nov 13, 2001 at 06:37:38PM -0600, David Starner wrote:
> On Wed, Nov 14, 2001 at 10:00:02AM +1100, George W Gerrity wrote:
> > My interest (and my interest in monitoring this e-mail group) lies in
> > the possibility of getting involved in an open WYSIWYG document
> > editor based on XML and UTF-8, so I (and others) can get out of the
> > thrall of Word. To be successful, such an application will HAVE to be
> > a) WYSIWYG; b) multi platform; c) able to read and dump rtf format,
> > even if the result is crippled; d) be modular and open (source and
> > APIs), both to spread the development effort and to encourage its use.
>
> Have you considered looking at existing systems? What about KWord? Why
> does it have to be XML-based?

To the best of my knowledge, the KWord format is XML, just compressed with gzip (and maybe tar)... anyway, it's some standard unix transformation of XML.

- Jimmy Kaplowitz
jimmy@{kaplowitz,debian}.org
Re: unicode in emacs 21
On Sun, Oct 28, 2001 at 05:04:22PM +, Dave Love wrote:
> >>>>> "OD" == Oliver Doepner <[EMAIL PROTECTED]> writes:
>
>  OD> There is vim 6.x now with full utf-8 support on the xterm.
>
> [Does `full utf-8 support' mean level 3?]

Well, it handles double-width characters as well as up to two combining characters. It's the only editor I've used (including Yudit) that could display the sequence U+0283 U+034D correctly.

> Emacs can do utf-8 i/o under ttys that support it, though you don't
> _need_ such support -- either input or output -- to edit utf-8 text.
>
>  OD> It is much faster than emacs on x11 of course.
>
> I'm surprised that's much of an issue. I assume Emacs under X is much
> more capable.

Well, Emacs does have more features (including some that are less essential, such as doctor mode :), but vim has quite enough for most purposes.

>  OD> I was happy to see Emacs 21 announced. but the unicode support
>  OD> does not seem to have moved forward very much
>
> It's moved from zero to the state where it's perfectly fine for
> editing at least the Western technical text that interests me. E.g.,
> Kuhn's UTF-8-demo.utf works modulo the level 2 text, for which one can
> add support straightforwardly at the Lisp level. It also allowed
> producing coding systems for all the 8-bit charsets for GNUish
> locales, which perhaps matters more in the wide world than utf-8 per
> se. With some customization, I can also at least _display_
> utf-8-encoded CJK text. I can send and receive utf-8-encoded mail and
> browse utf-8-encoded web sites (with the development W3 package).

Vim can display the UTF-8-demo file perfectly, with no exceptions. Also, although I haven't tested this, I am told it can write as well as display utf-8 CJK text.

- Jimmy Kaplowitz
[EMAIL PROTECTED] / [EMAIL PROTECTED]
Re: Vim 6.0 has been released (debian info)
On Thu, Oct 04, 2001 at 12:16:12PM +0900, Tomohiro KUBOTA wrote:
> Strange... I think the newest Debian "vim" package is about 4Mbytes
> and has locale support including UTF-8.

That's not true on my up-to-date Debian system, running sid/unstable. The current release, 6.0.011-2 (which corresponds to upstream vim 6.0.11), is compiled with multi_byte disabled. The alpha and beta packages had it enabled, and I hereby put in my vote for it to be re-enabled.

Wichert, a number of us think UTF-8 support is essential to the system of the future. If you want a minimalist version of vim without UTF-8, reintroduce vim-tiny.

- Jimmy Kaplowitz
jimmy@{debian,kaplowitz}.org
Linux Distributions (was Re: Upgrading to glibc 2.2)
(warning: long, only-slightly-i18n-related email)

On Sun, Sep 09, 2001 at 09:48:45AM +0100, Markus Kuhn wrote:
> I've been using SuSE 7.2 with kernel 2.4.4 since mid-May and never had any
> problems. I find SuSE 7 in general far more comprehensive than Red Hat 7,
> which I have to use at work and which lacks a number of packages that I
> got very used to on my home machine, starting with everyday trivialities
> like xautolock and xearth. (By mere lines of source code, SuSE is around
> 60% larger than Windows 2000 if you install everything!)

Of course, then there's Debian, which has 7 or 8 thousand packages in its archive (which takes up around 35 GB on each Debian server if I remember correctly) and a very active and well-archived set of mailing lists. It has xautolock, xearth, and much more. All of this is very easy to install with apt and dselect, two essential Debian packaging tools. They fetch and install all dependencies automatically, and, thanks to debconf, they (usually) prompt you for any answers that are needed.

Many if not most Debian developers live outside the US, and many live in countries where English is not the primary language. A few examples of where Debian developers live are Sweden, France, Japan, Austria, and Croatia. Therefore there is much attention paid to i18n. There is even a project to translate the _descriptions_ of packages into other languages.

The downside to Debian is that it has a horrible install process, which would frighten off almost any new user. This is even worse on non-x86 platforms. (However, Debian supports an amazing number of platforms; in the upcoming release, Debian is planning to support x86, m68k, ppc, alpha, mips big-endian, mips little-endian, ia64, hppa, and s390 at least.) It also doesn't have great documentation. On the other hand, the mailing lists are very friendly and helpful.
SuSE is good, in the sense that it has an easy install process, and lots of graphical tools (the closest Debian comes to that is lots of character-cell GUIs, but if you want a graphical Debian-based system try Progeny), and there's a company to get help from if you feel more comfortable doing things that way. It also has a printed book of documentation if you buy the official version.

However, it uses RPMs, which don't really have a good way to deal with dependencies other than yelling at you until you install them. Also, there are SuSE RPMs, Mandrake RPMs, Red Hat RPMs, etc., all of which are subtly incompatible. One nice feature of RPMs, though, is the fact that there is checking done when installing them to protect against tampering or file corruption. Debian can use RPMs as well as its native format, but one rarely needs to since so many things are packaged.

Conclusion: Debian is not for the newbie who wants everything to be nice, pretty, easy, and Windows-like (although the upcoming release will include very nicely integrated KDE support, so you can have that too to some degree), but once you gain a certain level of proficiency, or if you want to gain that proficiency, it's very easy to maintain and use, and very well thought-out.

Full disclosure: I am a Debian developer; i.e., I'm part of the Debian project. But I'm not getting paid one penny. I joined this past May because I liked it before I joined, not for some sort of compensation.

- Jimmy Kaplowitz
[EMAIL PROTECTED] / [EMAIL PROTECTED]

P. S. - Mandrake seems to be the most popular choice for new Linux users. Two great advantages: wonderful install system including nondestructive partition resizer, and lots of graphical tools that work well if you use them exclusively. One big disadvantage: The graphical tools break and become useless if you modify the configuration files by hand.
Re: Encoding conversions
On Sat, Sep 08, 2001 at 05:44:58PM -0400, Michael B. Allen wrote:
> Yeah, I know. I just have never upgraded glibc on a kernel 2.2 system. The
> chicken and egg thing scares me :)

Well, I don't think there is any such problem; you can use either version of glibc (2.1 or 2.2) with either version of the kernel (2.2 or 2.4). I am not certain about glibc 2.1 with kernel 2.4; there might be a problem there if the glibc 2.1 package was compiled against pre-kernel-2.4 kernel headers. The glibc 2.2 package is almost certainly compiled against 2.4, so you should be fine; the Debian package of glibc 2.2 that I use on my system definitely works with both versions of the kernel.

So, you could upgrade glibc and then the kernel. Or, if it makes you feel better, you could just download the new kernel and glibc RPMs and all their dependencies and install them all in one "rpm -Uvh" line.

> I guess I'm really waiting for RH 7.2 to come out.

That would make it easier :)

- Jimmy Kaplowitz
[EMAIL PROTECTED]

P. S. - If I said anything inaccurate or misleading or made any glaring omissions, please correct me. But I am not aware of any such.
Re: Encoding conversions
On Sat, Sep 08, 2001 at 02:51:54PM -0400, Michael B. Allen wrote:
> On Sat, Sep 08, 2001 at 12:02:55PM +0100, Markus Kuhn wrote:
> > Any Linux user/developer interested in locales and character sets
> > is today *strongly* recommended to upgrade to a glibc 2.2 based
> > distribution. There have been huge improvements between 2.1 and 2.2!
>
> Yeah but all the new distros use kernel 2.4 which seems to be the
> development kernel masquerading as the stable release. I'd like to see
> some VM stability before I throw this 2.2 rock away. They only recently
> discovered that page aging didn't work at all the last 9 releases (err,
> something fundamentally wrong there).

Markus meant _glibc_ version 2.2, which is different than _kernel_ version 2.2. Most very-up-to-date distributions such as the unstable or testing flavors of Debian use glibc 2.2 with kernel 2.4. I use this myself. There are some rough edges around the 2.4 kernels, but nothing that insurmountably interferes with use. I am quite happy with it (I have 2.4.9 on my home machine and 2.4.7 on my work machine).

- Jimmy Kaplowitz
[EMAIL PROTECTED]
Re: How to use Unicode
On Fri, Aug 31, 2001 at 02:46:11PM +0100, Markus Kuhn wrote:
> On Thu, 30 Aug 2001, Julien ÉLIE wrote:
> > I'm a French who uses Linux Red Hat 7.1.
> >
> > I wish I could write greek characters. However, I don't
> > know what I should install in order to have Unicode compatibility
> > and then what I should do if I want to write greek polytonic characters.
>
> LANG=fr_FR.UTF-8 xterm \
> -fn '-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'
>
> will give you an xterm to display polytonic Greek characters such as
> those you find in the UTF-8 files in
>
> http://www.cl.cam.ac.uk/~mgk25/ucs/examples/ UTF-8-demo.txt

Don't you need the -u8 option for xterm, in the absence of an equivalent line in .Xresources? Forgive me if I'm wrong; this is my first post on this list.

- Jimmy Kaplowitz
[EMAIL PROTECTED]
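For reference, the quoted command with -u8 added would look roughly like the sketch below. Whether -u8 is actually required depends on the xterm version and how it treats the locale, so this shows the variant being asked about, not a confirmed requirement. The wrapper name utf8_xterm is made up; the font name and locale are taken from the quoted message.

```shell
#!/bin/sh
# Sketch: start xterm in UTF-8 mode with the ISO 10646-1 fixed font
# from the quoted message.
FONT='-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'

utf8_xterm() {
    # -u8 switches xterm itself to UTF-8; LANG sets the locale for
    # the programs started inside it.
    LANG=fr_FR.UTF-8 xterm -u8 -fn "$FONT" "$@"
}

# Rough .Xresources counterpart of -u8 (an X resource line, not shell):
#   XTerm*utf8: 1

# Usage:
#   utf8_xterm &
```

Extra arguments are passed through, so something like "utf8_xterm -e vim" would open an editor directly in the UTF-8 terminal.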