Re: mbscmp

2002-02-25 Thread Jimmy Kaplowitz

On Mon, Feb 25, 2002 at 02:38:13PM -0500, Michael B Allen wrote:
> What's the ultimate goal here? Are any of these functions *supposed*
> to work on multi-byte characters, or will there be mbs* functions?

I haven't tested this, nor really done anything relating to programming
with i18n, but based on looking at man pages, you can use one of three
functions (mbstowcs, mbsrtowcs, or mbsnrtowcs) to convert your multibyte
string to a wide character string (an array of type wchar_t, one wchar_t
per *character*), and then use the many wcs* functions to do various
tests. My recollection of the consensus on this list is that for
internal purposes, wchar_t is the way to go, and conversion to multibyte
strings of char is necessary only for I/O, and there only when you can't
use functions like fwprintf. However, wchar_t is only guaranteed to be
Unicode (which encoding?) when the macro __STDC_ISO_10646__ is defined,
as is done with glibc 2.2. i18n/UTF-8 gurus, please correct the
last few sentences if I've got them wrong! :-)

- Jimmy Kaplowitz
[EMAIL PROTECTED]
--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/




Re: Two questions about console utf8 support

2001-12-16 Thread Jimmy Kaplowitz

On Sat, Dec 15, 2001 at 02:39:46AM +0330, Behdad Esfahbod wrote:
> 
> Also if you are thinking out of distribution, you can simply put your 
> tty independent material in /etc/rc.d/rc.local

Again you're thinking about Red Hat; Debian has no /etc/rc.d directory
(it just uses /etc/init.d and /etc/rc?.d directly), and it doesn't have
the equivalent of an rc.local script. The thing to do would be to add a
script in /etc/init.d and then run something along the lines of
'update-rc.d scriptname defaults 99 01' to activate it.
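
A minimal sketch of those two steps, assuming a script file named
"unicode" in the current directory (the name is only an illustration).
It prints the commands rather than running them, since the real ones
need root:

```shell
#!/bin/sh
# Dry run: print the commands that would install an init script into
# /etc/init.d and register it with update-rc.d (start 99, stop 01).
register_cmds() {
    name=$1
    printf 'install -m 755 %s /etc/init.d/%s\n' "$name" "$name"
    printf 'update-rc.d %s defaults 99 01\n' "$name"
}

register_cmds unicode
```

Dropping the printf wrappers and running the two commands as root would
do the real thing; I have only checked the dry-run form.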

- Jimmy Kaplowitz
[EMAIL PROTECTED] (also, @debian.org :)




Re: Two questions about console utf8 support

2001-12-14 Thread Jimmy Kaplowitz

On Fri, Dec 14, 2001 at 06:18:34PM +, [EMAIL PROTECTED] wrote:
> > > Maybe putting a ESC%G sequence in your /etc/issue (?).
> 
> > This does almost everything necessary; however, it does not do the
> > equivalent of 'kbd_mode -u'
> > I could write a wrapper
> 
> You know about the existence of unicode_start ?

Yes, but I didn't want to mention it in case it was Debian-specific.
Anyway, I still would have to write a wrapper around that, because I
can't invoke that directly from the boot process - it doesn't accept the
"start" / "stop" arguments that are used by that process.

Maybe something like:

-- begin file /etc/init.d/unicode --
#!/bin/sh

# Called by the boot process with one of the usual init.d arguments.
case "$1" in
  start) unicode_start ;;
  stop)  unicode_stop  ;;
  restart|reload|force-reload) unicode_stop; unicode_start ;;
  *) echo "Usage: $0 {start|stop|restart|reload|force-reload}" 1>&2
     exit 1 ;;
esac

exit 0
-- end file /etc/init.d/unicode --

How does that look to you? The only question is, would this affect all
the virtual consoles (if run from rcS.d) or only the boot console (i.e.,
tty1)? If the latter, how would it be best to adapt this to affect all
of the consoles? (I think this is different depending on whether you use
the framebuffer or vga text mode - I use the framebuffer.)

- Jimmy Kaplowitz
[EMAIL PROTECTED]




Re: Two questions about console utf8 support

2001-12-14 Thread Jimmy Kaplowitz

On Fri, Dec 14, 2001 at 06:23:16PM +0330, Behdad Esfahbod wrote:
> > - What is the cleanest possible way to get the console (all of the
> >   virtual consoles, not just the bootup one) into UTF-8 mode as early as
> >   possible? If it matters, I'm using the Rage 128 framebuffer driver on
> >   x86 Debian sid.
> 
> Maybe putting a ESC%G sequence in your /etc/issue (?).

This does almost everything necessary; however, it does not do the
equivalent of 'kbd_mode -u', which sets the keyboard to UTF-8 mode. I
could write a wrapper around getty that I call from /etc/inittab, but that
seems ugly...I may end up doing that though.

It might also be nice to get Unicode going from early in the bootup
process. Should I put something in the rcS.d directory that sets
everything up unicodely (kbd_mode -u plus ESC%G)? I wonder if that
would work on all my consoles, or if I'd have to wrap getty as above. I
could also just put something in the /etc/profile, which would satisfy
me since I don't use tcsh..
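
For reference, a small sketch of the console half of that setup. ESC%G
is the three bytes ESC, '%', 'G'. The matching "off" sequence ESC%@
comes from the Linux console_codes documentation, not from this thread,
and the function names and the tty1 to tty6 range are my own
assumptions:

```shell
#!/bin/sh
# Emit ESC % G, the sequence that switches a Linux console to UTF-8.
utf8_on_seq()  { printf '\033%%G'; }

# Emit ESC % @, the sequence that switches back to the default mode.
utf8_off_seq() { printf '\033%%@'; }

# Hypothetical use, run as root, one console at a time:
#   for n in 1 2 3 4 5 6; do utf8_on_seq > "/dev/tty$n"; done
# This covers only the display side; the keyboard side (kbd_mode -u)
# still has to be set separately for each console.
```

Piping utf8_on_seq through "od -An -tx1" is a quick way to confirm the
bytes are 1b 25 47 before pointing it at a real tty.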

I guess I was just asking what the best approach was.

Thank you for your suggestion, though; it's a good one.

> > - I remember hearing about a tool that takes a bdf or some other full
> >   font format and being able to extract any arbitrary set of 512
> >   characters from it and put that set into a console font. Can any of
> >   you point me to it? It would be nice to make various Unicode console
> >   fonts tailored to my tastes.
> 
> It comes with the Unicode VGA font, by Dmitry Bolkhovityanov <[EMAIL PROTECTED]>
> available at http://www.inp.nsk.su/~bolkhov/files/fonts/univga/

Thanks for the pointer. I'll investigate it when I get a chance.

- Jimmy Kaplowitz
[EMAIL PROTECTED]




Two questions about console utf8 support

2001-12-13 Thread Jimmy Kaplowitz

Hi all. I'm wondering two things about using UTF-8 on the console, where
I spend a significant portion, though by no means the majority, of my
computer time:

- What is the cleanest possible way to get the console (all of the
  virtual consoles, not just the bootup one) into UTF-8 mode as early as
  possible? If it matters, I'm using the Rage 128 framebuffer driver on
  x86 Debian sid.

- I remember hearing about a tool that takes a bdf or some other full
  font format and being able to extract any arbitrary set of 512
  characters from it and put that set into a console font. Can any of
  you point me to it? It would be nice to make various Unicode console
  fonts tailored to my tastes.

Have a good day, and keep up the high-quality i18n hackery.

- Jimmy Kaplowitz
[EMAIL PROTECTED]




Re: UTF8 Terminal Detection

2001-11-13 Thread Jimmy Kaplowitz

On Tue, Nov 13, 2001 at 06:37:38PM -0600, David Starner wrote:
> On Wed, Nov 14, 2001 at 10:00:02AM +1100, George W Gerrity wrote:
> > My interest (and my interest in monitoring this e-mail group) lies in 
> > the possibility of getting involved in an open WYSIWYG document 
> > editor based on XML and UTF-8, so I (and others) can get out of the 
> > thrall of Word. To be successful, such an application will HAVE to be 
> > a) WYSIWYG; b) multi platform; c) able to read and dump rtf format, 
> > even if the result is crippled; d) be modular and open (source and 
> > APIs), both to spread the development effort and to encourage its use.
> 
> Have you considered looking at existing systems? What about KWord? Why
> does it have to be XML-based?

To the best of my knowledge, kword is XML, just compressed with gzip
(and maybe tar)... anyway, it's some standard unix transformation of
XML.

- Jimmy Kaplowitz
jimmy@{kaplowitz,debian}.org




Re: unicode in emacs 21

2001-10-28 Thread Jimmy Kaplowitz

On Sun, Oct 28, 2001 at 05:04:22PM +, Dave Love wrote:
> >>>>> "OD" == Oliver Doepner <[EMAIL PROTECTED]> writes:
> 
>  OD> There is vim 6.x now with full utf-8 support on the xterm.
> 
> [Does `full utf-8 support' mean level 3?]

Well, it handles double-width characters as well as up to two combining
characters. It's the only editor I've used (including Yudit) that could
display the sequence U+0283 U+034D correctly.

> Emacs can do utf-8 i/o under ttys that support it, though you don't
> _need_ such support -- either input or output -- to edit utf-8 text.
> 
>  OD> It is much faster than emacs on x11 of course.
> 
> I'm surprised that's much of an issue.  I assume Emacs under X is much
> more capable.

Well, Emacs does have more features (including some that are less
essential, such as doctor mode :), but vim has quite enough for most
purposes.

>  OD> I was happy to see Emacs 21 announced. but the unicode support
>  OD> does not seem to have moved forward very much
> 
> It's moved from zero to the state where it's perfectly fine for
> editing at least the Western technical text that interests me.  E.g.,
> Kuhn's UTF-8-demo.utf works modulo the level 2 text, for which one can
> add support straightforwardly at the Lisp level.  It also allowed
> producing coding systems for all the 8-bit charsets for GNUish
> locales, which perhaps matters more in the wide world than utf-8 per
> se.  With some customization, I can also at least _display_
> utf-8-encoded CJK text.  I can send and receive utf-8-encoded mail and
> browse utf-8-encoded web sites (with the development W3 package).

Vim can display the UTF-8-demo file perfectly, with no exceptions. Also,
although I haven't tested this, I am told it can write as well as
display utf-8 CJK text.

- Jimmy Kaplowitz
[EMAIL PROTECTED] / [EMAIL PROTECTED]



Re: Vim 6.0 has been released (debian info)

2001-10-03 Thread Jimmy Kaplowitz

On Thu, Oct 04, 2001 at 12:16:12PM +0900, Tomohiro KUBOTA wrote:
> Strange...  I think the newest Debian "vim" package is about 4Mbytes
> and has locale support including UTF-8.

That's not true on my up-to-date Debian system, running sid/unstable.
The current release, 6.0.011-2 (which corresponds to upstream vim
6.0.11), is compiled with multi_byte disabled. The alpha and beta
packages had it enabled, and I hereby put in my vote for it to be
re-enabled. Wichert, a number of us think UTF-8 support is essential to
the system of the future. If you want a minimalist version of vim
without UTF-8, reintroduce vim-tiny.

- Jimmy Kaplowitz
jimmy@{debian,kaplowitz}.org



Linux Distributions (was Re: Upgrading to glibc 2.2)

2001-09-09 Thread Jimmy Kaplowitz

(warning: long, only-slightly-i18n-related email)

On Sun, Sep 09, 2001 at 09:48:45AM +0100, Markus Kuhn wrote:
> I've been using SuSE 7.2 with kernel 2.4.4 since mid-May and never had any
> problems. I find SuSE 7 in general far more comprehensive than Red Hat 7,
> which I have to use at work and which lacks a number of packages that I
> got very used to on my home machine, starting with everyday trivialities
> like xautolock and xearth. (By mere lines of source code, SuSE is around
> 60% larger than Windows 2000 if you install everything!)

Of course, then there's Debian, which has 7 or 8 thousand packages in
its archive (which takes up around 35 GB on each Debian server if I remember
correctly) and a very active and well-archived set of mailing lists. It has
xautolock, xearth, and much more. All of this is very easy to install with apt
and dselect, two essential Debian packaging tools. They fetch and
install all dependencies automatically, and, thanks to debconf, they
(usually) prompt you for any answers that are needed.

Many if not most Debian developers live outside the US, and many live in
countries where English is not the primary language. A few examples of
where Debian developers live are Sweden, France, Japan, Austria, and
Croatia. Therefore there is much attention paid to i18n. There is even a
project to translate the _descriptions_ of packages into other languages.

The downside to Debian is that it has a horrible install process, which
would frighten off almost any new user. This is even worse on non-x86
platforms. (However, Debian supports an amazing number of platforms; in
the upcoming release, Debian is planning to support x86, m68k, ppc,
alpha, mips big-endian, mips little-endian, ia64, hppa, and s390 at
least.) It also doesn't have great documentation. On the other hand,
the mailing lists are very friendly and helpful.

SuSE is good, in the sense that it has an easy install process, and lots
of graphical tools (the closest Debian comes to that is lots of
character-cell GUIs, but if you want a graphical Debian-based system try
Progeny), and there's a company to get help from if you feel more
comfortable doing things that way. It also has a printed book of
documentation if you buy the official version. However, it uses RPMs, which
don't really have a good way to deal with dependencies other than yelling at
you until you install them. Also, there are SuSE RPMs, Mandrake RPMs,
Red Hat RPMs, etc., all of which are subtly incompatible. One nice
feature of RPMs, though, is the fact that there is checking done when
installing them to protect against tampering or file corruption. Debian
can use RPMs as well as its native format, but one rarely needs to since
so many things are packaged.

Conclusion: Debian is not for the newbie who wants everything to be
nice, pretty, easy, and Windows-like (although the upcoming release will
include very nicely integrated KDE support, so you can have that too to some
degree), but once you gain a certain level of proficiency, or if you
want to gain that proficiency, it's very easy to maintain and use, and
very well thought-out.

Full disclosure: I am a Debian developer; i.e., I'm part of the Debian
project. But I'm not getting paid one penny. I joined this past May because
I liked it before I joined, not for some sort of compensation.

- Jimmy Kaplowitz
[EMAIL PROTECTED] / [EMAIL PROTECTED]

P. S. - Mandrake seems to be the most popular choice for new Linux
users. Two great advantages: wonderful install system including
nondestructive partition resizer, and lots of graphical tools that work
well if you use them exclusively. One big disadvantage: The graphical
tools break and become useless if you modify the configuration files by
hand.



Re: Encoding conversions

2001-09-08 Thread Jimmy Kaplowitz

On Sat, Sep 08, 2001 at 05:44:58PM -0400, Michael B. Allen wrote:
> Yeah, I know. I just have never upgraded glibc on a kernel 2.2 system. The
> chicken and egg thing scares me :)

Well, I don't think there is any such problem; you can use either
version of glibc (2.1 or 2.2) with either version of the kernel (2.2 or
2.4). I am not certain about glibc 2.1 with kernel 2.4; there might be a
problem there if the glibc 2.1 package was compiled against
pre-kernel-2.4 kernel headers. The glibc 2.2 package is almost certainly
compiled against 2.4, so you should be fine; the Debian
package of glibc 2.2 that I use on my system definitely works with both
versions of the kernel. So, you could upgrade glibc and then the kernel. Or,
if it makes you feel better, you could just download the new kernel and
glibc RPMs and all their dependencies and install them all in one "rpm -Uvh"
line.

> I guess I'm really waiting for RH 7.2 to come out.

That would make it easier :)

- Jimmy Kaplowitz
[EMAIL PROTECTED]

P. S. - If I said anything inaccurate or misleading or made any glaring
omissions, please correct me. But I am not aware of any such.



Re: Encoding conversions

2001-09-08 Thread Jimmy Kaplowitz

On Sat, Sep 08, 2001 at 02:51:54PM -0400, Michael B. Allen wrote:
> On Sat, Sep 08, 2001 at 12:02:55PM +0100, Markus Kuhn wrote:
> > Any Linux user/developer interested in locales and character sets
> > is today *strongly* recommended to upgrade to a glibc 2.2 based
> > distribution. There have been huge improvements between 2.1 and 2.2!
> 
> Yeah but all the new distros use kernel 2.4 which seems to be the
> development kernel masquerading as the stable release. I'd like to see
> some VM stability before I throw this 2.2 rock away. They only recently
> discovered that page aging didn't work at all the last 9 releases (err,
> something fundamentally wrong there).

Markus meant _glibc_ version 2.2, which is different from _kernel_
version 2.2. Most very-up-to-date distributions such as the unstable or
testing flavors of Debian use glibc 2.2 with kernel 2.4. I use this
myself.

There are some rough edges around the 2.4 kernels, but nothing that
insurmountably interferes with use. I am quite happy with it (I have
2.4.9 on my home machine and 2.4.7 on my work machine).

- Jimmy Kaplowitz
[EMAIL PROTECTED]



Re: How to use Unicode

2001-08-31 Thread Jimmy Kaplowitz

On Fri, Aug 31, 2001 at 02:46:11PM +0100, Markus Kuhn wrote:
> On Thu, 30 Aug 2001, Julien =?ISO-8859-1?Q?=C9LIE ?= wrote:
> > I'm French and use Red Hat Linux 7.1.
> >
> > I wish I could write Greek characters. However, I don't
> > know what I should install in order to have Unicode compatibility,
> > and then what I should do if I want to write polytonic Greek characters.
> 
> LANG=fr_FR.UTF-8 xterm \
>   -fn '-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'
> 
> will give you an xterm to display polytonic Greek characters such as
> those you find in the UTF-8 files in
> 
>   http://www.cl.cam.ac.uk/~mgk25/ucs/examples/  UTF-8-demo.txt

Don't you need the -u8 option for xterm, in the absence of an equivalent
line in .Xresources?
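
To make the question concrete, here is a sketch of the two variants I
have in mind. It only prints the command line and the resource line
instead of launching anything. The font name is copied from the quoted
command; the function names are made up for illustration, and whether
-u8 is actually required may depend on the xterm version, which I have
not checked:

```shell
#!/bin/sh
# Font name copied from the quoted xterm command.
FONT='-Misc-Fixed-Medium-R-SemiCondensed--13-120-75-75-C-60-ISO10646-1'

# Print (rather than run) the quoted xterm invocation with -u8 added.
xterm_utf8_cmd() {
    printf "LANG=fr_FR.UTF-8 xterm -u8 -fn '%s'\n" "$FONT"
}

# Print the X resource line that would stand in for -u8.
xterm_utf8_resource() {
    printf 'XTerm*utf8: 1\n'
}

xterm_utf8_cmd
xterm_utf8_resource
```

The resource line would go in .Xresources and be loaded with xrdb
before starting xterm.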

Forgive me if I'm wrong, this is my first post on this list.

- Jimmy Kaplowitz
[EMAIL PROTECTED]