Tomohiro KUBOTA wrote:
> [Bram Moolenaar]
> > Vim now supports UTF-8 character encoding internally. One problem I ran
> > into is that there doesn't seem to be any X input method for UTF-8 in
> > Japan. Someone suggested to me that an existing euc-jp input method
> > should be used and its output then converted to UTF-8. That would
> > require the use of iconv().
>
> I recommend supporting the locale-dependent encoding, i.e., EUC-JP in
> an EUC-JP locale.
OK, that was the idea for using iconv().
> The reasons are as follows:
> - As far as I know, there is no UTF-8-enabled XIM server for Japanese
> now.
> - Since a Japanese input engine is very complex and difficult to implement,
> there are many Japanese input engines, both free and proprietary.
> Japanese people have their own preferences for input engines. Thus,
> developing a new UTF-8 XIM cannot be a solution for Japanese users
> because Japanese users will want to use their own input engines.
> - In the future, new versions of all current XIM software may be able
> to use UTF-8. However, some Japanese users love a certain version
> of a certain input engine, for various reasons: the new one is too
> heavy, its behavior differs from what they expect, and so on.
Wouldn't it be possible to take an existing euc-jp XIM and convert the
resulting character to UTF-8 at the very end?
> - Using a UTF-8 XIM for one particular program and a conventional XIM for
> other programs is annoying; it is just like using a QWERTY keyboard for
> one program and a DVORAK keyboard for another.
Not if both input methods use the same key sequences. After all, you are
typing the same character, only it's encoded differently.
> I recommend that the input string be encoded as specified by the LC_CTYPE
> locale. I.e., just use nl_langinfo(CODESET) and iconv(). It can also
> support UTF-8, since a user is expected to set a UTF-8 locale to declare
> that he/she likes UTF-8 and wants all software to assume UTF-8 as
> the I/O encoding.
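
For reference, a minimal sketch of that suggestion in C. It assumes the
XIM delivers text in the codeset of the LC_CTYPE locale, which is exactly
the open question in this thread. The helper names locale_codeset() and
xim_to_utf8() are made up for illustration and are not Vim functions; the
caller supplies a fixed-size output buffer.

```c
#include <assert.h>
#include <iconv.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: name of the codeset of the user's LC_CTYPE locale. */
const char *locale_codeset(void)
{
    setlocale(LC_CTYPE, "");        /* adopt LC_ALL / LC_CTYPE / LANG */
    return nl_langinfo(CODESET);    /* e.g. "EUC-JP" or "UTF-8" */
}

/* Hypothetical helper: convert in[0..in_len) from from_codeset to UTF-8.
 * Returns the number of bytes written to out, or -1 on any failure
 * (unknown codeset, invalid input, or output buffer too small). */
long xim_to_utf8(const char *from_codeset, const char *in, size_t in_len,
                 char *out, size_t out_cap)
{
    assert(from_codeset != NULL && in != NULL && out != NULL);

    iconv_t cd = iconv_open("UTF-8", from_codeset);   /* (to, from) */
    if (cd == (iconv_t)-1)
        return -1;

    char *src = (char *)in;   /* iconv() takes char **, input is not modified */
    char *dst = out;
    size_t src_left = in_len;
    size_t dst_left = out_cap;

    size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    iconv_close(cd);

    if (rc == (size_t)-1)
        return -1;
    return (long)(out_cap - dst_left);
}
```

A caller would do something like
xim_to_utf8(locale_codeset(), buf, len, out, sizeof out) on each string
received from the input method. On some systems iconv() lives in a
separate library and needs -liconv at link time.
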
There is a conflict here: Vim needs to obtain the preferred internal encoding
and the encoding that the XIM uses. I think only one can come from an
environment variable.
It's easy to solve this by letting the user set a couple of options, but it
would be nice if this happened automatically for most people.
I already added code to switch to using UTF-8 internally when $LANG contains
"utf-8" or "utf8". The input functions then assume that typed characters also
arrive as UTF-8. How do I find out what encoding the XIM produces?
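
The $LANG check mentioned above amounts to roughly the following. This is
a sketch, not the actual Vim code: the function names are made up for
illustration, and matching case-insensitively (so that "ja_JP.UTF-8" is
also recognized) is an assumption.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Case-insensitive substring search: returns 1 if needle occurs in hay. */
int contains_nocase(const char *hay, const char *needle)
{
    for (; *hay != '\0'; hay++) {
        size_t i = 0;
        /* The comparison stops at the first mismatch, which is reached at
         * the latest when hay hits its terminating NUL. */
        while (needle[i] != '\0' &&
               tolower((unsigned char)hay[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (needle[i] == '\0')
            return 1;
    }
    return 0;
}

/* Hypothetical helper: 1 if the given $LANG value asks for UTF-8. */
int lang_wants_utf8(const char *lang)
{
    if (lang == NULL)               /* $LANG not set */
        return 0;
    return contains_nocase(lang, "utf-8") || contains_nocase(lang, "utf8");
}
```

It would be called as lang_wants_utf8(getenv("LANG")). Note that this only
tells what the user wants internally; it says nothing about what the XIM
produces.
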
--
Due knot trussed yore spell chequer two fined awl miss steaks.
/// Bram Moolenaar -- [EMAIL PROTECTED] -- http://www.moolenaar.net \\\
((( Creator of Vim - http://www.vim.org -- ftp://ftp.vim.org/pub/vim )))
\\\ Help me helping AIDS orphans in Uganda - http://iccf-holland.org ///
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/