Re: Switching to UTF-8
Kaixo!

On Mon, May 06, 2002 at 10:11:34AM +0900, Tomohiro KUBOTA wrote:

> Note for xkb experts who don't know Hiragana/Katakana/Hangul:
> input methods of these scripts need backtracking. For example,
> in Hangul, imagine I hit keys in the c-v-c-v (c: consonant,
> v: vowel) sequence. When I hit c-v-c, it should represent one
> Hangul syllable "c-v-c". However, when I hit the next v, it
> should be two Hangul syllables of "c-v c-v".

That is only the case with a 2-set keyboard; with a 3-set keyboard there is
no ambiguity, as there are three groups of keys, V, C1 and C2, allowing for
all the possible combinations: V-C2, C1-V-C2. That is, there are two keys for
each consonant: one for the syllable-leading consonant and one for the
syllable-final consonant.

(I think the small round glyph that fills an empty place in a syllable is
always at position C2; that is, c-v is always written C1-V-C2 with a special
C2 that is not written in Latin transliteration.)

> In Hiragana/Katakana, processing of "n" is complex (though
> it may be less complex than Hangul).

No. The "N" is just a kana like any other; no complexity is involved at all.
Complexity only arises when typing in Latin letters. That is why
transliteration typing will always require an input method anyway; it cannot
be handled with Xkb alone.

--
Ki ça vos våye bén,
Pablo Saratxaga
http://www.srtxg.easynet.be/
PGP Key available, key ID: 0x8F0E4975

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
Re: Switching to UTF-8
Hi,

At Sun, 5 May 2002 19:12:31 -0400 (EDT),
Jungshik Shin wrote:

> > I believe that you are kidding to say about such a limitation.
> > Japanese language has much less vowels and consonants than Korean,
> > which results in much more homonyms than Korean. Thus, I think
>
> Well, actually it's due to not so much the difference in
> the number of consonants and vowels as the fact that Korean has
> both closed and open syllables while Japanese has only open syllables
> that makes Japanese have a lot more homonyms than Korean.

You may be right. Anyway, the main reason is that the Japanese language has
many words borrowed from old Chinese. Words that are not homonyms in Chinese
become homonyms in Japanese. (They may or may not be homonyms in Korean; I
believe Korean also has many words of Chinese origin.) Since new words are
coined on the basis of the Kanji system, the Japanese language would lose
vitality without Kanji.

> I don't think Japanese will ever do, either. However, I'm afraid
> having too many homonyms is a little too 'feeble' a 'rationale' for
> not being able to convert to all phonetic scripts like Hiragana and
> Katakana.
> ...

Since I don't represent the Japanese people, I won't say whether having many
homonyms is good or bad. You are right that there are many other reasons for
and against using Kanji, and I cannot explain everything.

Spoken Japanese does have trouble with homonyms, though accent and rhythm
help a great deal. In some cases, however, neither accent nor context can
help. For example, both "science" and "chemistry" are "kagaku" in Japanese,
so we sometimes call chemistry "bakegaku", where "bake" is another reading of
the Kanji read "ka" in "chemistry". Another well-known confusing pair is
"private (organization)" and "municipal (organization)", both of which are
"shiritu". Thus "private" is sometimes called "watakushiritu" and "municipal"
is called "ichiritu"; again, these aliases come from different readings of
the Kanji.

If you listen to Japanese news programs every day, you will come across these
examples sooner or later. These days more and more Japanese people want to
learn more Kanji in order to use its rich expressive power, though I am not
one of these Kanji learners.

> I also like to know whether it's possible with Xkb. BTW, if
> we use three-set keyboards (where leading consonants and trailing
> consonants are assigned separate keys) and use U+1100 Hangul Conjoining
> Jamos, Korean Hangul input is entirely possible with Xkb alone.

Note for xkb experts who don't know Hiragana/Katakana/Hangul: input methods
for these scripts need backtracking. For example, in Hangul, imagine I hit
keys in the sequence c-v-c-v (c: consonant, v: vowel). When I have hit c-v-c,
it should be shown as one Hangul syllable "c-v-c". However, when I hit the
next v, it should become two Hangul syllables, "c-v c-v".

In Hiragana/Katakana, processing of "n" is complex (though it may be less
complex than Hangul).

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/
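The backtracking described in the note above can be illustrated with a small sketch. It is not taken from any real input method: the key names, the tiny key map and the function names are all made up for the example, and only plain consonant-vowel(-consonant) syllables are handled. It relies on the fact that Unicode NFC normalization composes conjoining jamo into precomposed syllables.

```python
import unicodedata

# Hypothetical two-set key map. A consonant key is ambiguous: it carries both
# a leading and a trailing conjoining jamo, and only context decides which
# one is used.
CONSONANTS = {
    "g": ("\u1100", "\u11a8"),  # KIYEOK as (leading, trailing)
    "n": ("\u1102", "\u11ab"),  # NIEUN
    "h": ("\u1112", "\u11c2"),  # HIEUH
}
VOWELS = {"a": "\u1161", "u": "\u116e"}  # A, U


def render(cur):
    """Turn the keys of the syllable in progress into displayable text."""
    if not cur:
        return ""
    jamo = CONSONANTS[cur[0]][0]
    if len(cur) >= 2:
        jamo += VOWELS[cur[1]]
    if len(cur) == 3:
        jamo += CONSONANTS[cur[2]][1]
    return unicodedata.normalize("NFC", jamo)


def two_set_snapshots(keys):
    """Return the text that would be displayed after each keystroke."""
    done, cur, shown = [], [], []
    for key in keys:
        if key in VOWELS:
            if len(cur) == 1:
                cur.append(key)              # c + v
            elif len(cur) == 3:
                # Backtrack: the consonant shown as a trailing consonant
                # turns out to be the lead of the next syllable.
                moved = cur.pop()
                done.append(render(cur))
                cur = [moved, key]
            else:
                raise ValueError("vowel position not handled in this sketch")
        elif key in CONSONANTS:
            if len(cur) == 2:
                cur.append(key)              # tentatively a trailing consonant
            else:
                if cur:
                    done.append(render(cur))
                cur = [key]
        else:
            raise ValueError(f"unknown key: {key!r}")
        shown.append("".join(done) + render(cur))
    return shown


for step in two_set_snapshots("hana"):
    print(step)
```

After the third key the display shows a single c-v-c syllable; the fourth key rewrites it into two c-v syllables. A stateless key-to-symbol mapping cannot express that rewrite, which is the point the note makes.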
Re: Switching to UTF-8
On Sun, 5 May 2002, Tomohiro KUBOTA wrote:

> At 02 May 2002 23:54:37 +1000,
> Roger So wrote:

> > I _do_ think xkb is sufficient for Japanese though, if you limit
> > "Japanese" to only hiragana and katakana. ;)
>
> I believe that you are kidding to say about such a limitation.
> Japanese language has much less vowels and consonants than Korean,
> which results in much more homonyms than Korean. Thus, I think

Well, actually, it's not so much the difference in the number of consonants
and vowels as the fact that Korean has both closed and open syllables, while
Japanese has only open syllables, that makes Japanese have a lot more
homonyms than Korean.

> native Japanese speakers won't decide to abolish Kanji.

I don't think the Japanese ever will, either. However, I'm afraid having too
many homonyms is a little too 'feeble' a 'rationale' for not being able to
convert to all-phonetic scripts like Hiragana and Katakana. The easiest
counterargument is to ask how Japanese speakers can tell which homonym is
meant in oral communication if Kanji is so important for disambiguating
homonyms. They have no Kanji to help them there (well, sometimes you may have
to write down Kanji in the middle of a conversation to resolve an ambiguity,
but I guess that's mostly limited to proper nouns). I heard that they don't
have much trouble because the context helps a listener a lot in figuring out
which of many homonyms the speaker means. This is true in any language.
Arguably, the same thing could help readers in written communication.

Of course, using logographic/ideographic characters like Kanji certainly
helps readers very much, and that should be a very good reason for the
Japanese to keep Kanji in their writing system. The English writing system is
also 'logographic' in a sense (so is modern Korean orthography in pure
Hangul, as it departs from strict agreement between pronunciation and
spelling), and a spelling reform (giving English a degree of agreement
between spelling and pronunciation similar to that of Spanish) would make
written text harder to read by depriving it of its 'logographic' nature. On
the other hand, it would help learners and writers. It has always been a
struggle between readers and writers, and between listeners and speakers.

> xkb can be used. However, more than half of Japanese computer
> users use Romaji-kana conversion, two-keys-one-hiragana/katakana
> method. The complexity of the algorithm is like two or three-key
> input method of Hangul, I think. Do you think such an algorithm
> can be implemented as xkb? If yes, I think Romaji-kana conversion
> (whose complexity is like Hangul input method) can be implemented
> as xkb.

I would also like to know whether it's possible with Xkb. BTW, if we use
three-set keyboards (where leading consonants and trailing consonants are
assigned separate keys) and use the Hangul conjoining jamo (the block
starting at U+1100), Korean Hangul input is entirely possible with Xkb alone.

Jungshik Shin
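The three-set remark above can be illustrated with a small sketch: if every key yields exactly one conjoining jamo, typing is a stateless key-to-character mapping, which is the kind of thing a keyboard map can express. The key names and the tiny table below are invented for the example and do not correspond to any real three-set layout. The NFC step only shows that the jamo sequence is canonically equivalent to precomposed syllables; whether unnormalized jamo are displayed as syllable blocks depends on the renderer and font.

```python
import unicodedata

# Hypothetical three-set key map: every key yields exactly one conjoining
# jamo. Upper-case keys stand for leading consonants and lower-case consonant
# keys for trailing consonants; the key names are made up for this example.
THREE_SET = {
    "H": "\u1112",  # leading HIEUH
    "G": "\u1100",  # leading KIYEOK
    "a": "\u1161",  # vowel A
    "u": "\u1173",  # vowel EU
    "n": "\u11ab",  # trailing NIEUN
    "l": "\u11af",  # trailing RIEUL
}


def type_three_set(keys):
    """Map each key to its jamo independently: no state, no backtracking."""
    return "".join(THREE_SET[k] for k in keys)


jamo = type_three_set("HanGul")
print([f"U+{ord(ch):04X}" for ch in jamo])   # six conjoining jamo
print(unicodedata.normalize("NFC", jamo))    # the same text as two syllables
```

No keystroke ever changes what an earlier keystroke produced, in contrast to the two-set case.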
Re: Switching to UTF-8
On Sun, 2002-05-05 at 21:00, Tomohiro KUBOTA wrote:

> At 02 May 2002 23:54:37 +1000,
> Roger So wrote:
> > Note that the source from Li18nux will try to use its own encoding
> > conversion mechanisms on Linux, which is broken. You need to tell it to
> > use iconv instead.
>
> I didn't know that because I am not a user of IIIMF nor other Li18nux
> products. How it is broken?

The csconv library that comes with IIIMF doesn't work properly (at least I
couldn't get it to work), possibly because of endianness issues. csconv is
meant to be a cross-platform replacement for iconv.

> > Maybe I should attempt to package it for Debian again, now that woody is
> > almost out of the way. (I have the full IIIMF stuff working well on my
> > development machine.)
>
> I found that Debian has "iiimecf" package. Do you know what it is?

It's the IIIM Emacs Client Framework. As the name implies, it's an
implementation of an IIIM client in Emacs. I've never tried it out, as I
don't use Emacs. :)

Is it used by anyone? Last time I checked, popularity-contest said nobody
was using it...

> > I _do_ think xkb is sufficient for Japanese though, if you limit
> > "Japanese" to only hiragana and katakana. ;)
>
> I believe that you are kidding to say about such a limitation.
> Japanese language has much less vowels and consonants than Korean,
> which results in much more homonyms than Korean. Thus, I think
> native Japanese speakers won't decide to abolish Kanji.
> (Please don't be kidding in international mailing list, because
> people who don't know about Japanese may think you are talking
> about serious story.)

Sorry, it wasn't meant to be a serious comment. :)

Cheers
Roger

--
Roger So, Debian Developer
Sun Wah Linux Limited, i18n/L10n Project Leader
Tel: +852 2250 0230
Fax: +852 2259 9112
[EMAIL PROTECTED]
http://www.sw-linux.com/
Re: Switching to UTF-8
Hi,

At 02 May 2002 23:54:37 +1000,
Roger So wrote:

> Note that the source from Li18nux will try to use its own encoding
> conversion mechanisms on Linux, which is broken. You need to tell it to
> use iconv instead.

I didn't know that, because I am not a user of IIIMF or other Li18nux
products. How is it broken?

> Maybe I should attempt to package it for Debian again, now that woody is
> almost out of the way. (I have the full IIIMF stuff working well on my
> development machine.)

I found that Debian has an "iiimecf" package. Do you know what it is?

> I don't think xkb is sufficient because (1) there's a large number of
> different Chinese input methods out there, and (2) most of the input
> methods require the user to choose from a list of candidates after
> preedit.
>
> I _do_ think xkb is sufficient for Japanese though, if you limit
> "Japanese" to only hiragana and katakana. ;)

I believe you are joking about such a limitation. The Japanese language has
far fewer vowels and consonants than Korean, which results in many more
homonyms than Korean has. Thus, I think native Japanese speakers won't
decide to abolish Kanji. (Please don't joke like this on an international
mailing list, because people who don't know Japanese may think you are being
serious.)

Even if we limit ourselves to hiragana/katakana input, xkb may not be
sufficient. For the one-key-one-hiragana/katakana method, I think xkb can be
used. However, more than half of Japanese computer users use Romaji-kana
conversion, a two-keys-one-hiragana/katakana method. I think the complexity
of the algorithm is similar to that of the two-key or three-key input
methods for Hangul. Do you think such an algorithm can be implemented in
xkb? If so, I think Romaji-kana conversion (whose complexity is similar to
that of a Hangul input method) can be implemented in xkb as well.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/
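To make the Romaji-kana question above more concrete, here is a minimal sketch of such a conversion. The table is a tiny hypothetical subset and the rule for "n" is only one of several conventions in use (it treats an "n" that cannot start a syllable as the kana "ん"); real input methods differ in the details. It converts a whole string at once rather than key by key, but it shows why "n" needs lookahead: its meaning is not known until the following letter has been seen.

```python
# Tiny illustrative subset of a romaji table; a real table is much larger.
KANA = {
    "a": "あ", "i": "い", "u": "う", "e": "え", "o": "お",
    "ka": "か", "ki": "き", "ku": "く", "ke": "け", "ko": "こ",
    "ta": "た", "chi": "ち", "te": "て", "to": "と",
    "na": "な", "ni": "に", "nu": "ぬ", "ne": "ね", "no": "の",
    "ha": "は",
}


def romaji_to_kana(text):
    """Convert romaji to hiragana using longest-match lookup."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest spelling first: "chi" before "ch..." before "c".
        for size in (3, 2, 1):
            chunk = text[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += len(chunk)
                break
        else:
            if text[i] == "n":
                # An "n" that does not start a syllable: emit the kana "n".
                out.append("ん")
                i += 1
            else:
                raise ValueError(f"cannot convert {text[i:]!r}")
    return "".join(out)


for word in ("kana", "kanta", "konnichiha"):
    print(word, romaji_to_kana(word))
```

In "kana" the "n" starts the syllable "na", while in "kanta" the same key ends up as a kana of its own. An interactive converter has to keep the "n" pending until the next key arrives, which is the kind of state that a plain key-to-symbol mapping lacks.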