Hi,
I got a personal mail (a reply to an earlier mail of mine) from Eli
Zaretskii, one of the maintainers of GNU Emacs. The team seems to be in
need of contributors to improve the Unicode support.
See below.
cheers
oliver
-- Forwarded message --
Date: Fri, 26 Oct 2001 10:39:50 +0200
From: Eli Zaretskii <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: Re: Unicode in Emacs
[..]
> > We need people who are prepared to work on implementing the Unicode
> > features in Emacs. Ideas are welcome, but right now we have more ideas
> > than we can manage.
>
> Maybe after my exams. But my elisp skills are poor. :-(
Don't worry, there are lots of ways you could contribute.
How's your C? Some of the work (a large part of it, actually) needs
to be done on the C level.
Even if you can only read the code (in C or Lisp) and suggest how to
modify it to implement various Unicode features, it's still useful, as
others could write the code given your advice.
> But I have done some superficial "research" on m17n, i18n and l10n ...
> see http://www.coli.uni-sb.de/~oldo/PUB/mls-reqspec.html ...
Impressive. We certainly need that kind of overall view of the issues
when discussing and coding Unicode support in Emacs. Would you like
to be subscribed to the emacs-unicode mailing list?
> > mappings, bidirectional editing, Arabic presentation forms, etc. Does
> > vim really support those?
> combining characters ... hm ahem ... http://vim.sourceforge.net/whyvim.php
Thanks for the pointer. However, I cannot see anything there besides
support for UTF-8 and other Unicode encodings. Nothing about
combining characters, and the Hebrew text in the snapshot is in the
wrong direction (which means bidirectional behavior specified by the
Unicode Technical Report #9 isn't supported). So I guess Unicode
support in VIM is still very preliminary, although better than
Emacs's.
> Are you on linux-utf8 ??
No. I don't have enough time to read another mailing list, sorry.
> Who knows the design that has been made for the unicode support?
I attach it below.
> Maybe these people could join the linux-utf8 list and
> advertise their concept, and hopefully some others might "step in" to
> contribute some code.
I'd prefer that people who want to work on adding Unicode to Emacs
subscribe to emacs-unicode, and that the work be coordinated there.
Please feel free to forward this suggestion to linux-utf8.
> Where is the emacs-unicode mailing list??
It's hosted on gnu.org machines. I can subscribe anyone who wants to
be part of this effort.
> thank you for all the valuable work you do!
You are welcome. And thanks for raising this important issue: it
looks like a few people are interested enough that they wrote to me
and offered help.
Emacs-Unicode-990824
--
Internal Character code:
00 Unicode U+ - U+
00 Unicode 20bit (via surrogate pair)
01 Unicode 20bit (via surrogate pair)
01 0ppp 7 64kByte planes reserved for Emacs
01 1ppp 8 64kByte planes for private use
1x for private use, CNS 3-16, and CCCII
Private area is 18h - 3087FFh
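
For readers unfamiliar with the "20bit (via surrogate pair)" entries above:
they appear to refer to the standard UTF-16 arithmetic, in which a high and a
low surrogate each carry 10 bits of a supplementary code point. The sketch
below shows that arithmetic only. It is not taken from the Emacs sources, and
the function names are hypothetical, chosen for illustration.

```python
# Illustrative sketch of standard UTF-16 surrogate arithmetic.
# Function names are hypothetical; this is not Emacs code.

HIGH_START, HIGH_END = 0xD800, 0xDBFF   # high (leading) surrogates
LOW_START, LOW_END = 0xDC00, 0xDFFF     # low (trailing) surrogates
SUPPLEMENTARY_BASE = 0x10000            # first code point beyond the BMP


def surrogates_to_code_point(high, low):
    """Combine a surrogate pair into one code point (U+10000..U+10FFFF)."""
    if not HIGH_START <= high <= HIGH_END:
        raise ValueError("not a high surrogate: %#x" % high)
    if not LOW_START <= low <= LOW_END:
        raise ValueError("not a low surrogate: %#x" % low)
    # Each surrogate contributes 10 bits, giving a 20-bit offset.
    offset = ((high - HIGH_START) << 10) | (low - LOW_START)
    return SUPPLEMENTARY_BASE + offset


def code_point_to_surrogates(code_point):
    """Split a supplementary code point into a (high, low) surrogate pair."""
    if not SUPPLEMENTARY_BASE <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point: %#x" % code_point)
    offset = code_point - SUPPLEMENTARY_BASE
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


if __name__ == "__main__":
    high, low = code_point_to_surrogates(0x10000)
    print("%04X %04X" % (high, low))
    print("%X" % surrogates_to_code_point(high, low))
```

The 20-bit offset is why the range reachable through surrogate pairs is
exactly 16 planes of 64K code points each (U+10000 through U+10FFFF).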
--
Multibyte sequence in buffer/string:
1 byte:
0xxx
ASCII
1xxx
not used
2 bytes: 110x 10xx where x... are:
0 00 - 1 11 (0h - 7Fh)
7 bits not used
(or we may be able to use this area for holding 8-bit raw data
in multibyte buffer/string)
00010 00 - 1 11 (80h - 7FFh)
Unicode U+0080 - U+07FF
3 bytes: 1110 10xx 10xx where x... are:
00 00 - 01 11 (0h - 7FFh)
11 bits not used
10 00 - 11 11 (800h - h)
Unicode U+0800 - U+
4 bytes: 0xxx 10xx 10xx 10xx where x... are:
000 00 00 00 - 000 00 11 11 (0h - h)
16 bits not used
000 01 00 00 - 100 00 11 11 (1h - 10h)
20 bits Unicode via surrogate pair
100 01 00 00 - 101 11 11 11 (11h - 17h)
7 64kByte planes reserved for Emacs
We may map Japanese Han characters here.
110 00 00 00 - 111 11 11 11 (18h - 1Fh)
8 64kByte planes reserved for private use
5 bytes: 10xx 10xx 10xx 10xx 10xx where x... are:
00 00 00 00 00 - 00 000111 11 11 11
0h - 1Fh
21 bits not used
00 001000 00 00 00 - 00 001100 001000 01 11
20h - 3087FFh
1083391 (almost 1M) character code points for private use
00 0011