Re: [Encode] new README.cjk available

2002-04-21 Thread Shigeki Moro
> I think "ディーヴァナガーリ" should be "デーヴァナーガリー".

I'm sorry that was too short. This is a correction for /00README.jp/.

Shigeki Moro

Hanazono University
[EMAIL PROTECTED]
http://www.ya.sakura.ne.jp/~moro/


Re: [Encode] new README.cjk available

2002-04-21 Thread moro
On Thu, 18 Apr 2002 02:35:49 +0900
Dan Kogai <[EMAIL PROTECTED]> san wrote:

> http://www.dan.co.jp/~dankogai/bleedperl/

I think "ディーヴァナガーリ" should be "デーヴァナーガリー".

Otsukaresama-desu.

Shigeki Moro

Hanazono University
[EMAIL PROTECTED]
http://www.ya.sakura.ne.jp/~moro/


Handling a utf8 string.

2000-04-29 Thread Shigeki Moro

Dear all,

# This mail is in UTF-8.

I have a question about handling a string of UTF-8 characters in
Perl 5.6. I wrote the script quoted below:

#!perl -w
use utf8;
$a = '摩訶吠室&M004651;末那野提婆喝&M004651;闍陀羅尼儀軌';
$a =~ s{&M(\d\d\d)(\d\d\d);}
{<img src="http://www.mojikyo.gr.jp/gif/$1/$1$2.gif">}g;
print "$a\n";
__END__

This script prints:

摩訶吠室<img src="http://www.mojikyo.gr.jp/gif/004/004651.gif">末
那野提婆喝<img src="http://www.mojikyo.gr.jp/gif/004/004651.gif">闍陀羅
尼儀軌

It seems to me that the string "摩訶吠室" has been changed into a
mysterious "摩訶吠室", although "末那野提婆喝" and "闍陀羅尼儀軌"
have been handled correctly.
Is this because of a limitation of Perl 5.6, or a gap in my
understanding? Any suggestions or information would be helpful.
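
For comparison, here is a minimal sketch of the same substitution with the
output encoding declared explicitly. It is not a diagnosis of the 5.6.0
behaviour. It assumes Perl 5.8 or later (for the ':encoding(UTF-8)' layer),
assumes the replacement was meant to be an <img> tag, and uses a
hypothetical helper name, expand_mojikyo, that is not part of the script
above.

```perl
#!perl -w
use strict;
use utf8;    # the string literals in this file are UTF-8

# Hypothetical helper (not from the original script): replace each
# Mojikyo entity such as &M004651; with an <img> tag for its GIF.
# The <img src="..."> form is an assumption about the intended markup.
sub expand_mojikyo {
    my ($text) = @_;
    $text =~ s{&M(\d\d\d)(\d\d\d);}
              {<img src="http://www.mojikyo.gr.jp/gif/$1/$1$2.gif">}g;
    return $text;
}

# Assumption: Perl 5.8 or later, where I/O layers are available.
binmode(STDOUT, ':encoding(UTF-8)');

my $line = expand_mojikyo('摩訶吠室&M004651;末那野提婆喝&M004651;闍陀羅尼儀軌');
print "$line\n";
```

Declaring the layer on STDOUT keeps the decoded characters and the bytes
written to the terminal or file consistent, which makes it easier to tell a
display problem from a problem in the substitution itself.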

The version of Perl is:

  This is perl, v5.6.0 built for MSWin32-x86-multi-thread
  Binary build 613 provided by ActiveState Tool Corp.
  Built 12:36:25 Mar 24 2000

Thanks in advance,

Shigeki Moro
[EMAIL PROTECTED]
http://www.ya.sakura.ne.jp/~moro/




splitting devanagari characters

2000-04-04 Thread Shigeki Moro

Dear subscribers,

I have written a report, in Japanese, on handling Devanagari
(one of the Indic scripts) characters in Perl 5.6.

http://www.ya.sakura.ne.jp/~moro/resources/indic_on_perl5.6/index.html

For example, with "use utf8" in effect, splitting the Devanagari word
'vij~naana' under character semantics yields 'va + (i) + ja + (viraama)
+ ~na + (aa) + na'.

It seems to me that Perl divides a combined character into the base
character and the combining character(s), and doesn't regard a combined
character as one character.
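
The difference can be sketched as follows, comparing a split into code
points with a split on \X, the regex escape Perl provides for matching a
base character together with its combining marks. This is only an
illustration under assumptions: it is written for a later Perl (5.8 or
newer, for the output layer), and the exact grouping \X produces depends
on the Unicode rules built into the Perl in use, so no particular cluster
count is claimed.

```perl
#!perl -w
use strict;

# 'vij~naana' as explicit code points:
# va, vowel sign i, ja, viraama, ~na, vowel sign aa, na
my $word = "\x{0935}\x{093F}\x{091C}\x{094D}\x{091E}\x{093E}\x{0928}";

my @chars    = split //, $word;      # one element per code point
my @clusters = $word =~ /(\X)/g;     # one element per \X match

# Assumption: Perl 5.8 or later, where I/O layers are available.
binmode(STDOUT, ':encoding(UTF-8)');

printf "%d code points, %d clusters\n", scalar @chars, scalar @clusters;
print join(' + ', @clusters), "\n";
```

Splitting on \X keeps the vowel signs and the viraama attached to the
preceding consonant, so the word comes apart into fewer pieces than the
seven code points it contains.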

Any comments would be appreciated.

Regards,

Shigeki Moro
[EMAIL PROTECTED]
http://www.ya.sakura.ne.jp/~moro/