Re: Bug#436923: psili/dasia problem (Aarghhh!)

2007-10-09 Thread William J Poser
Jan Willem Stumpel wrote:
>I suspect that there are two different groups of
>Greeks working on those files, independently
>of one another. This has to stop. Somebody has
>to knock some heads together.
Judging from the lessons of history, the most effective approach is a Persian threat. Perhaps

Re: Cuneiform, and how to make the fonts "work"

2007-03-29 Thread William J Poser
For re-encoding a font to Unicode using FontForge (formerly called pfaedit), I have a little tutorial here: http://billposer.org/Linguistics/Computation/Reencoding/HowTo.html. If you do do this, please make the re-encoding available as I'm sure other people would like to use it. Bill -- Linux-

Re: Cuneiform, and how to make the fonts "work"

2007-03-28 Thread William J Poser
This font looks like it has a custom encoding. I used pfaedit to inspect the font - it will show you what glyph is at what codepoint. How to work with it depends on what you want to do. If you are accustomed to working with Unicode tools, you could re-encode the font to Unicode using pfaedit or so

orthographic imperialism

2007-03-28 Thread William J Poser
[EMAIL PROTECTED] has made several claims about writing systems for indigenous languages that I, as a linguist with a strong interest in writing systems and substantial experience working with indigenous people, not only as a linguist studying their languages but as a staff member of indigenous org

Re: How to enter accented UTF-8 character on GNOME terminal

2007-03-24 Thread William J Poser
For entering non-ascii characters, I use three techniques: (a) when the characters are part of a set used routinely, e.g. the alphabet of French, install a keyboard map specifically for that language (or, e.g., for ISO-8859-1, which includes it); (b) at the other extreme, when the charact

Re: High-Speed UTF-8 to UTF-16 Conversion

2007-03-15 Thread William J Poser
>For example, in SAX processing, Psaila finds that transcoding
>takes > 50% of XML processing time.
Granting that this means that UTF8 <-> UTF16 conversion is the place to look if one wants to speed up SAX, is SAX processing too slow? Another interpretation of such data is that both transcoding an

Re: c++ strings and UTF-8 (other charsets)

2007-02-28 Thread William J Poser
Although a zero byte may not be part of a C string, it may be part of a "character string literal". See section 6.4.5, p. 62, of the C99 standard. "character string literals" need not be strings. Bill -- Linux-UTF8: i18n of Linux on all levels Archive: http://mail.nl.linux.org/linux-utf8/
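The distinction can be illustrated from Python with the standard `ctypes` module. This is only a sketch by analogy: the byte sequence `a\0b` is made-up example data, and `create_string_buffer` stands in for the array a C compiler would create for the literal `"a\0b"`. The array holds four bytes, yet read as a C string it ends at the first zero byte.

```python
import ctypes

# Hypothetical example data: the bytes of the C literal "a\0b".
literal = b"a\0b"

# create_string_buffer copies the bytes and appends a terminating NUL,
# much as a C compiler does when it stores a string literal in an array.
buf = ctypes.create_string_buffer(literal)

array_size = ctypes.sizeof(buf)  # bytes in the whole array, terminator included
as_c_string = buf.value          # read as a C string: stops at the first NUL
whole_array = buf.raw            # every byte actually stored

print(array_size)
print(as_c_string)
print(whole_array)
```

Here `buf.value` and `buf.raw` differ, which is the point of the message: the literal's contents are not the same thing as the string that C library functions would see.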

Re: [Maybe OT]Missing data at the beginning of a output message

2006-08-08 Thread William J Poser
Different parts of a localized message coming out in different languages isn't too surprising. It just means that the two parts are looked up separately and that one of them is missing from a message catalog and so ends up defaulting to another language. Offhand I don't know what is going on with

Re: matching base character

2006-07-14 Thread William J Poser
Another way of getting a non-standard sort order is to use a sort utility that allows you to specify the sort order explicitly. My own contribution is msort (http://billposer.org/Software/msort.html). Bill Poser
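As a rough illustration of what "specifying the sort order explicitly" can mean, here is a minimal Python sketch. It is not msort and does not reflect how msort works or is configured; the alphabet in `ORDER` and the names `RANK`, `sort_key` and `explicit_sort` are hypothetical, chosen so that each accented letter sorts directly after its base letter.

```python
# Hypothetical explicit alphabet: each accented letter follows its base letter.
ORDER = "aäbcdeéfghijklmnoöpqrstuüvwxyz"
RANK = {ch: i for i, ch in enumerate(ORDER)}


def sort_key(word):
    """Map a word to a list of ranks taken from ORDER.

    Characters missing from ORDER sort after all listed characters,
    ordered among themselves by code point.
    """
    return [
        (RANK[ch], 0) if ch in RANK else (len(ORDER), ord(ch))
        for ch in word.lower()
    ]


def explicit_sort(words):
    """Return the words sorted according to ORDER rather than code points."""
    return sorted(words, key=sort_key)


words = ["zebra", "äpfel", "apfel", "bär", "bar"]
print(explicit_sort(words))
```

With plain code-point sorting, "äpfel" would land after "zebra"; with the explicit order it stays next to "apfel".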

Re: [i18n] grep framework.

2005-10-28 Thread William J Poser
Well, yes, but you just have to consider the study of the manual to be a form of spiritual practice, from which eventually enlightenment will follow. The following, which I received from a friend a long time ago, may be helpful in understanding the appropriate approach. The following are some exc

Re: [i18n] grep framework.

2005-10-28 Thread William J Poser
There is a lot of information on the GNU approach to i18n at: http://www.gnu.org/software/gettext/manual/gettext.html. Bill

entering Unicode via escapes

2005-10-23 Thread William J Poser
For those who for whatever reason can't readily insert UTF-8 directly or who have already got material with escapes like \x{B1}, let me note that I have written a utility that understands a wide range of such escapes and translates them into Unicode. It's free in both the lunch and beer senses, and
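To make the idea concrete, the following is a minimal Python sketch of translating one escape form, `\x{HEX}`, into the corresponding Unicode character. It is not the utility mentioned in the message, which is described as handling a much wider range of escapes; the names `ESCAPE` and `expand_escapes` are hypothetical.

```python
import re

# Matches escapes of the form \x{HEX}, e.g. \x{B1}.
ESCAPE = re.compile(r"\\x\{([0-9A-Fa-f]+)\}")


def expand_escapes(text):
    """Replace each \\x{HEX} escape in text with the character it names."""

    def replace(match):
        code_point = int(match.group(1), 16)
        # Leave the escape untouched if it is beyond the Unicode range.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return ESCAPE.sub(replace, text)


print(expand_escapes(r"5 \x{B1} 2"))
```

Text without escapes passes through unchanged, so the function can be applied to a whole file's contents before writing it back out as UTF-8.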