Re: Input under RH8

2002-12-07 Thread Jungshik Shin



On Fri, 6 Dec 2002, Maiorana, Jason wrote:

 First, thanks to Jungshik Shin & Mike FABIAN for your
 replies.

 You're welcome :-)

 I surmise that the current state of RH8 is that it is not
 yet suitable for entry of all languages simultaneously.
 (flaws in XIM itself being part of the problem)

 You're right. You can't yet do MS Windows/MacOS-style IME
switching in all applications.


 I can probably set up some scripts to pop up a gedit in a
 given mode, but, with the exception of VIQR and Korean,
 I cannot yet graphically switch around to any input method
 with the version of gtk2 that comes with rh8.

   Gtk2 as shipped in RH8 has Thai (broken?), Tamil,
Cyrillic (transliterated), Inuktitut, IPA, Tigrigna-Ethiopian,
Tigrigna-Eritrean, and Amharic input modules in addition to XIM,
Vietnamese, and a *broken* Korean (KSC5601) input module. For Korean, you'd
better install the 'imhangul' input module from http://imhangul.kldp.net. You
can download the source by clicking 'download' in red and install it by
following the instructions in the gray box below the download link.
If this is the first time you install 'imhangul', you have to run 'make
install' twice (due to a bug that is yet to be fixed).

  You can also make use of Xkb. With its support of multiple
levels, you can add yet another 'input method' to your repertoire of input
methods accessible in gedit (a gtk2 application). As for Xkb, refer to
the XFree86 I18N archive.

 Hopefully, in the near future, RH will ship all utf-8
 locales by default, and gtk2 will have a XIM wrapper
 that allows access to any input method on the system
 from any language locale.

  Alternatively, a 'meta XIM server' (as implemented at the client level
by Yudit and mlterm) that lets users switch between multiple XIMs would
be handy. Then, it can be used for non-gtk2 applications as well as
gtk2 applications.

 BTW, has anybody heard of gtk2 input modules for Chinese and Japanese?
A quick googling didn't turn up anything.

   Jungshik


--
Linux-UTF8:   i18n of Linux on all levels
Archive:  http://mail.nl.linux.org/linux-utf8/




Re: UTF-8 wakeup call

2002-12-07 Thread Keld Jørn Simonsen
On Fri, Dec 06, 2002 at 11:39:43AM -0500, Henry Spencer wrote:
 On Fri, 6 Dec 2002, Keld Jørn Simonsen wrote:
  Actually it is funny that you call it Unicode. UTF-8 clearly comes from
  the 10646 side of UCS, Unicode did not invent it at all...
 
 It did not come from 10646 either; it came from the *Unix* side of the
 house, specifically from X/Open.  And my understanding is that it was
 originally specifically an encoding for Unicode (although the distinction
 quickly became academic because of the conversion of 10646 into a Unicode
 clone).

UTF-8 then came through the 10646 side. Unicode was strictly 16-bit from
the outset. Pre-merger 10646 was 32-bit, with an 8-bit encoding made possible.
After the merger 10646 had an 8-bit encoding called UTF-1.

 Nobody except some standards zombies cared about encoding 10646, or indeed
 about any aspect of 10646; Unicode was the standard that the real world
 was clearly signing up for.  Which is why the 10646 committee, seeing the
 writing on the wall, abandoned its own efforts and aligned its standard
 with Unicode. 

Hmm, what is implemented in Linux is really not Unicode but 10646, and
has been from the outset, e.g. the 32-bit wchar_t. But then you may call Linux people
standards zombies, that is your call, and yes, we use standards like
POSIX and ISO C etc. Anyway, I was writing about Linux and UTF-8.

 Some of the Unicode standards guys were dead-set against any encoding
 except plain 16-bit (but which byte order? :-)), but potential *users* of
 Unicode were much more pragmatic.  UTF-8 originally came out of the desire
 for a backward-compatible encoding for use in Unix filenames. 

Yes, true. And that is then why I find it funny that the people
that were dead-set against anything other than 16 bit now get all
the glory for the stuff they fought so hard against. The irony of history:-)
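
To illustrate the point about backward compatibility with Unix filenames, here is a minimal Python sketch. It is not from the original thread, and the function name utf8_is_filename_safe is made up for illustration. It checks the property that matters for filenames: ASCII characters keep their single-byte values, and no byte of a multi-byte sequence falls in the ASCII range, so the bytes 0x00 and 0x2F ('/') can never appear by accident.

```python
def utf8_is_filename_safe(text: str) -> bool:
    """Return True if UTF-8 never produces a stray ASCII byte for `text`.

    Unix filenames treat only 0x00 (NUL) and 0x2F ('/') specially, so an
    encoding is safe when those byte values never show up inside the
    encoding of some other character.
    """
    for ch in text:
        encoded = ch.encode("utf-8")
        if ord(ch) < 0x80:
            # ASCII characters are encoded as themselves, one byte each.
            if encoded != bytes([ord(ch)]):
                return False
        else:
            # Every byte of a multi-byte sequence has the high bit set.
            if any(b < 0x80 for b in encoded):
                return False
    return True


# U+012F (LATIN SMALL LETTER I WITH OGONEK) has 0x2F as its low byte.
sample = "\u012f"
utf16_bytes = sample.encode("utf-16-be")  # contains the byte for '/'
utf8_bytes = sample.encode("utf-8")       # contains no ASCII byte at all
```

With a plain 16-bit encoding the same character carries a '/' byte, which is exactly what a byte-oriented filesystem cannot tolerate.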

 In any case, Unicode is much the more widely-known name, and much the more
 readily available standard, and (as others have noted) also comes with a
 lot of relevant supplementary information that 10646 lacks. 

Much of the supplementary information is covered in 14651 and 14652.
And the specifications in Linux are then built on these standards,
not on Unicode tables.

  The way 10646 is coming to Linux is also much
  with the support from the ISO 14651 sorting standard and the ISO
  TR 14652 locale standard. 
 
 My understanding is that an ISO TR, by definition, is not a standard.

The ISO definitions of what a standard is in general encompass
ISO TRs as standards. It is not an ISO standard, but it is a standard
in the generic sense of the word.

  I think the proper way to characterize what we do now in Linux is
  to say ISO 10646, and probably mention Unicode in parenthesis the first
  time it appears.
 
 The pragmatic, and historically correct, way is the reverse.  ISO 10646
 delivers the ISO stamp (stomp? :-)) of approval for Unicode... but the
 standard you will find on the shelves of the people who do the work is
 labelled Unicode.

I then think you are mislabelling the use of UTF-8 in Linux, as the
Unicode standards are not adhered to. Linux UTF-8 is not Unicode
conformant, but follows ISO 10646, 14651 and TR 14652.

Kind regards
keld




RE: UTF-8 wakeup call

2002-12-07 Thread Kent Karlsson


Frank T. Pohlmann:
 Actually, I tried to get people to realize the scale
 of the coming changes.
 
 
http://www.linuxuser.co.uk/articles/issue22/lu22-All_you_need_to_know_about-Unicode.pdf
 
 -Frank Pohlmann

Apart from the overly pretentious title, that article contains
a number of errors.  I will mention just one: the notion of
implementation levels (a 10646 thing; Unicode does not formalise
that) has been scarily confused with the notion of planes.
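
To make the distinction concrete, here is a small Python sketch. It is not from the article or the thread, and the helper names are hypothetical. A plane is purely a matter of where a code point sits in the code space, while implementation levels concern whether combining characters are allowed. The second helper is only a rough stand-in for "would not fit level 1": it uses the Unicode general category as an approximation, not the actual list of combining characters in 10646.

```python
import unicodedata


def plane_of(code_point: int) -> int:
    """Return the plane number (0..16) of a code point in 0..0x10FFFF."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the range handled by this sketch")
    # Each plane holds 0x10000 code points, so the plane is the high part.
    return code_point >> 16


def uses_combining_marks(text: str) -> bool:
    """Approximation: True if any character has a Mark (M*) category."""
    return any(unicodedata.category(ch).startswith("M") for ch in text)


# "e" followed by U+0301 COMBINING ACUTE ACCENT: both code points are in
# plane 0, yet the string relies on a combining character.
decomposed = "e\u0301"
planes = [plane_of(ord(ch)) for ch in decomposed]
```

The two properties are independent: text can stay entirely within plane 0 and still need combining characters, and text from a higher plane may need none.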

Kind regards
/kent k





Re: UTF-8 gnroff mangles up syntax samples

2002-12-07 Thread Henry Spencer
On Fri, 6 Dec 2002, Larry Wall wrote:
 ...Of course, if you make a way
 to translate the old format to something resembling the new format,
 the transition can happen more quickly.

Also, for a quick hack that's likely to give good results:  if the man
macros merely render all explicitly-requested boldface as the verbatim
font with verbatim processing, that will go a long way toward doing the
right thing.  Bold does not see much other use in traditional manpages.

  Henry Spencer
   [EMAIL PROTECTED]






Re: UTF-8 wakeup call

2002-12-07 Thread Henry Spencer
On Fri, 6 Dec 2002, Antoine Leca wrote:
  [UTF-8] did not come from 10646 either; it came from the *Unix* side of the
  house, specifically from X/Open. 
 
 I thought it came from Plan 9 (Rune) then passed to X-Open (FSS-UTF?).
 Did I miss something? Note I was not there at this time.

Markus has helpfully explained this (especially helpful since I didn't
have much detail on the earliest history):  Plan 9 was the earliest major
implementation but didn't actually originate it.  Plan 9 in fact started
with a different Unicode encoding, but switched when UTF-8 appeared.

  Henry Spencer
   [EMAIL PROTECTED]





RE: UTF-8 wakeup call

2002-12-07 Thread Kent Karlsson
Keld,

 Maybe there are flaws in 14651, but it is ISO 14651 which is 
 used in Linux.

That is a problem, not a feature.  While UAX 10 conforms to
14651, it specifies a number of requirements in addition to 14651,
specifically for Thai, Lao, and combining character support.

   and the ISO TR 14652 locale standard. 
  
  14652 is NOT a standard.  It is also very unlikely to ever develop into one.
  Keld, please stop promoting it as a standard, when you very well know
  that it is NOT a standard.
 
 It is as much a standard as Unicode in the generic sense of the word
 standard, but it is not an ISO standard. Please understand that.

It's an ISO TR that became a TR because it FAILED to become an ISO standard.
Please understand that.

...
 The mappings used are at least also from the RFC 1345 (recode uses that) 
 or the IS 15897 which uses many of the same names and mappings.
 Specifically I have seen that Linux is *not* using the Unicode data
 because of copyright issues. 

Hmmm.  From http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html:

Limitations on Rights to Redistribute This Data

Recipient is granted the right to make copies in any 
form for internal distribution and to freely use the 
information supplied in the creation of products supporting 
the Unicode(TM) Standard. The files in the Unicode Character 
Database can be redistributed to third parties or other 
organizations (whether for profit or not) as long as 
this notice and the disclaimer notice are retained. 
Information can be extracted from these files and used 
in documentation or programs, as long as there is an 
accompanying notice indicating the source.

I don't see this as restrictive for use in Linux.  I'm sure the Unicode
Consortium would like to see its data being used also in open source
projects, like Linux.  Note that IBM has its own open source project
on Unicode support (ICU).


Kind regards
/kent k





Re: Input under RH8

2002-12-07 Thread Owen Taylor

Jungshik Shin [EMAIL PROTECTED] writes:

  Hopefully, in the near future, RH will ship all utf-8
  locales by default, and gtk2 will have a XIM wrapper
  that allows access to any input method on the system
  from any language locale.
 
   Alternatively, a 'meta XIM server' (as implemented at the client level
 by Yudit and mlterm) that lets users switch between multiple XIMs would
 be handy. Then, it can be used for non-gtk2 applications as well as
 gtk2 applications.
 
  BTW, has anybody heard of gtk2 input modules for Chinese and Japanese?
 A quick googling didn't turn up anything.

Off the top of my head...

Japanese:

 http://bonobo.gnome.gr.jp/~nakai/immodule/

Chinese:

  http://sourceforge.net/projects/wenju/

[ Actually, a generic table based method, but contains tables for
  various methods of inputting Chinese ]

There may possibly be others.

Regards,
Owen





RE: UTF-8 wakeup call

2002-12-07 Thread Jungshik Shin
On Sat, 7 Dec 2002, Kent Karlsson wrote:

  The mappings used are at least also from the RFC 1345 (recode uses that)
  or the IS 15897 which uses many of the same names and mappings.
  Specifically I have seen that Linux is *not* using the Unicode data
  because of copyright issues.

 Hmmm.  From http://www.unicode.org/Public/UNIDATA/UnicodeCharacterDatabase.html:

   Limitations on Rights to Redistribute This Data

   Recipient is granted the right to make copies in any
   form for internal distribution and to freely use the

 I don't see this as restrictive for use in Linux.  I'm sure Unicode
 consortium would like to see its data being used also in open source

   glibc 2.x may not use them yet. However, glib (and other libraries
built on top of it) indeed makes extensive use of Unicode data files.
So do Perl, Yudit, Mozilla and other free/open source programs/projects
that run on Linux.

  Jungshik





Emacs automatic UTF-8 setup

2002-12-07 Thread Vasilis Vasaitis
  Hello,

  Lately, I've started to slowly migrate my environment to
UTF-8. Since I don't feel ready to do a complete switch yet, as the
environment doesn't seem to be mature enough, I like to have a
transition period, when I can run applications in either a UTF-8 or
a non-UTF-8 locale, as needed. More specifically, I want to be able to
constantly switch back and forth between el_GR.ISO8859-7 and
el_GR.UTF-8.

  Most programs don't need any particular setup for this. Emacs [0],
however, is a notable exception. In its default setup, it doesn't
completely support a UTF-8 environment, the main problem being that it
doesn't recognise UTF-8 keyboard input. So I set out to discover the
minimum configuration that would make it fully support the
UTF-8 locale without creating any problems under the ISO8859-7 locale at
the same time. In addition, it would have to work both in X11 and
terminal mode, and in the latter, both on the Linux console and inside
an xterm. The result isn't the most obvious setup, so I thought I'd
post it here, in the hope that others find it useful as well
(esp. Emacs developers).

  First of all, I wanted to make sure that Emacs automatically sets
the language environment to Greek in all cases, without actually
configuring it to be the default. This is accomplished with the
following line in .emacs:

(setq locale-language-names (cdr locale-language-names))

  The variable locale-language-names is a list of patterns that match
locale names to names of language environments. In my version of
Emacs, the first entry inhibits all UTF-8 locales from setting any
language environment. In my case, this seems to cause more harm than
good, so I eliminate that entry with the above command.

  In addition, I want to set the various coding systems for each
locale to sane values. This is achieved with the following piece of
code:

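;; Prefer the utf-8 coding system for any locale whose name matches ".utf-8".
(setq locale-preferred-coding-systems
  (cons (cons ".*\\.utf-8" 'utf-8) locale-preferred-coding-systems))
;; Re-read the locale, then apply the value returned by
;; set-locale-environment (assumed here to be the locale's coding
;; system, or nil) to keyboard input and terminal output.
((lambda (cs)
   (set-keyboard-coding-system cs)
   (if cs (set-terminal-coding-system cs)))
 (set-locale-environment nil))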

  This makes UTF-8 the preferred coding system for UTF-8 locales, and
sets the various coding systems according to the current locale
settings. Now Emacs behaves just like most other applications: it assumes
an 8-bit, ISO8859-7 environment under the el_GR.ISO8859-7 locale, and
a multi-byte, UTF-8 environment when run under el_GR.UTF-8.
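
As a side illustration of why the keyboard and terminal coding systems have to follow the locale, here is a minimal Python sketch. It is not part of the Emacs setup, and the function name describe is made up for this example. It shows the bytes that a single Greek letter produces under each of the two encodings, and what happens when UTF-8 bytes are read as if they were ISO 8859-7.

```python
def describe(text: str) -> dict:
    """Return the bytes `text` produces under the two encodings in question."""
    return {
        "iso8859-7": text.encode("iso8859_7"),
        "utf-8": text.encode("utf-8"),
    }


# GREEK SMALL LETTER ALPHA: one byte in ISO 8859-7, two bytes in UTF-8.
alpha = describe("\u03b1")

# Reading the UTF-8 bytes with the wrong (8-bit) decoder yields two
# unrelated characters instead of one alpha.
misread = alpha["utf-8"].decode("iso8859_7")
```

This is the kind of mismatch an application shows when it keeps assuming an 8-bit environment while the terminal sends UTF-8.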


[0] I use GNU Emacs 21.2-5, the latest version in Debian unstable.

-- 
Vasilis Vasaitis
[EMAIL PROTECTED]
+30976604701






Re: UTF-8 wakeup call

2002-12-07 Thread Larry Wall
On Sat, Dec 07, 2002 at 03:21:44PM +0100, Keld Jørn Simonsen wrote:
: Yes, true. And that is then why I find it funny that the people
: that were dead-set against anything other than 16 bit now get all
: the glory for the stuff they fought so hard against. The irony of history:-)

Yes, it's ironic, but the reason they get the glory has very little
to do with history--except the part of history in which they were
clever enough to pick the snappier name.  All other things being
equal, had the 10646 and Unicode folks swapped names from the start,
it would still be called Unicode today, because that's the right
name for it, culturally speaking.  Most people don't give a rip about
history, but they do care about sounding cool.

Larry