Re: Re[2]: converting Japanese chars into their Unicode values using 5.8's Encode

2002-09-19 Thread Nick Ing-Simmons
Robert Allerstorfer <[EMAIL PROTECTED]> writes: >Hi Nick, > >thank you so much for solving that problem! I didn't know that >"Unicode" is a valid canonical name of an available encoding, since > >use Encode; >my @all_encodings = Encode->encodings(":all"); >print join("\n", @all_encodings); > >does

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Autrijus Tang
On Fri, Sep 20, 2002 at 08:08:23AM +0300, Jarkko Hietaniemi wrote: > > Bug uncovered, though. > Could you please perlbug this so that it doesn't get lost? Sure, already done. http://www.autrijus.org/Unicode-EastAsianWidth-1.00.tar.gz just hit CPAN; SYNOPSIS follows: use Unicode::EastAsianWi

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Jarkko Hietaniemi
> Bug uncovered, though. Could you please perlbug this so that it doesn't get lost? > Line 88 in utf8_heavy.pl tests for user-definedness: > > if ($type =~ /^I[ns](\w+)$/) { > > But it couldn't contain 'Is' anyway, because line 53 removed it: > > $type =~ s/^Is(?:\s+|[-_])?//i >
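The interaction between the two quoted lines can be reproduced in isolation. This is only a sketch: the two regexes are copied from the message above, but the property names fed to them are hypothetical, and the surrounding utf8_heavy.pl logic is not modelled.

```perl
use strict;
use warnings;

my %result;
for my $name (qw(IsEastAsianWide InEastAsianWide)) {   # hypothetical names
    my $type = $name;

    # Substitution quoted from line 53: strips a leading "Is".
    $type =~ s/^Is(?:\s+|[-_])?//i;

    # Test quoted from line 88: looks for a leading "In" or "Is".
    my $matches = $type =~ /^I[ns](\w+)$/ ? 1 : 0;

    $result{$name} = [ $type, $matches ];
    print "$name -> $type (passes the line-88 test: $matches)\n";
}
```

With these inputs, the "Is" name has already lost its prefix by the time the second test runs, so only the "In" name can pass it.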

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Autrijus Tang
On Thu, Sep 19, 2002 at 08:35:23PM +0300, Jarkko Hietaniemi wrote: > > I'll be happy to oblige and make a U::EAW with that, then. > If it is found to work fine, we can certainly merge that later back > into the core (maybe when Unicode 3.2.1 comes out). Okay, I've put it together now. Bug uncove

Re: converting Japanese chars into their Unicode values using 5.8's Encode

2002-09-19 Thread Autrijus Tang
On Thu, Sep 19, 2002 at 12:35:46AM +0200, Robert Allerstorfer wrote: > use Encode::JP; > my $string = "検"; > Encode::from_to($string, "shiftjis", "utf8"); > my $ord = join("\n", unpack('U*', $string)); > print "$string\n$ord"; > > But, this gives a 3-chara
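The reply is cut off here, so the following is an assumption about what it goes on to explain rather than a quotation. One plausible reading of the three-value result is that from_to() converts octets to octets, leaving three UTF-8 bytes, whereas decode() yields a one-character string whose code point ord() can read. A minimal sketch, using the Shift_JIS octets 0x8C 0x9F for the U+691C example in this thread:

```perl
use strict;
use warnings;
use Encode qw(decode from_to);

my $sjis = "\x8C\x9F";             # Shift_JIS octets for the kanji U+691C

# from_to() rewrites the octets in place; the result is UTF-8 *bytes*.
my $octets = $sjis;
from_to($octets, 'shiftjis', 'utf8');
my $octet_count = length $octets;  # counts bytes, not characters

# decode() returns a character string, so ord() sees the code point itself.
my $chars       = decode('shiftjis', $sjis);
my @code_points = map { sprintf 'U+%04X', ord } split //, $chars;

print "$octet_count octets; code points: @code_points\n";
# prints "3 octets; code points: U+691C"
```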

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Jarkko Hietaniemi
> > Uhhh, why? Not that I have anything in particular against East Asian > > widths, but why it has to be included in core Perl? > > Well, why add EastAsianWidth.txt in the unicore/ in the first place? :-) Because it comes with the standard Unicode data files... > > User-defined Unicode charac
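The message breaks off as it turns to user-defined Unicode character properties. For readers who have not used that mechanism, here is a minimal sketch of how perlunicode describes it: a sub whose name starts with "In" or "Is" returns tab-separated hexadecimal ranges, and `\p{...}` looks it up by name. The property name `InFullwidthBlock` and its single range are hypothetical; a real East Asian width property would be derived from EastAsianWidth.txt.

```perl
use strict;
use warnings;

# Hypothetical property: just the Halfwidth and Fullwidth Forms block,
# U+FF00..U+FFEF, given as "start<TAB>end" in hex, one range per line.
sub InFullwidthBlock {
    return "FF00\tFFEF\n";
}

# LATIN CAPITAL LETTER A followed by FULLWIDTH LATIN CAPITAL LETTER A.
my $text  = "A\x{FF21}";
my $count = () = $text =~ /\p{InFullwidthBlock}/g;

print "$count character(s) matched\n";
```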

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Autrijus Tang
On Thu, Sep 19, 2002 at 08:14:22PM +0300, Jarkko Hietaniemi wrote: > > But as it overrides core modules' behaviours, I'd hesitate to release it > > as a CPAN module (Unicode::EastAsianWidth), but rather suggest it to > > be included in core perl. > Uhhh, why? Not that I have anything in particul

Re: Is \p{EastAsianFullwidth} worth implementing?

2002-09-19 Thread Jarkko Hietaniemi
> But as it overrides core modules' behaviours, I'd hesitate to release it > as a CPAN module (Unicode::EastAsianWidth), but rather suggest it to > be included in core perl. Uhhh, why? Not that I have anything in particular against East Asian widths, but why does it have to be included in core Perl?

Re[2]: converting Japanese chars into their Unicode values using 5.8's Encode

2002-09-19 Thread Robert Allerstorfer
Hi Nick, thank you so much for solving that problem! I didn't know that "Unicode" is a valid canonical name of an available encoding, since use Encode; my @all_encodings = Encode->encodings(":all"); print join("\n", @all_encodings); does not include it on my machine. best, rob -- On Thu, 19 S
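A name can also be checked directly instead of being searched for in the list. The sketch below uses Encode's find_encoding(), which resolves aliases as well as canonical names and returns undef when it cannot resolve a name. The two names tested are illustrative; whether a particular name such as "Unicode" resolves depends on the installed Encode version and is not asserted here.

```perl
use strict;
use warnings;
use Encode ();

# Canonical names of every encoding Encode knows how to load.
my %listed = map { $_ => 1 } Encode->encodings(':all');

# Ask Encode to resolve each name; record the canonical name or undef.
my %resolved;
for my $name (qw(shiftjis no-such-encoding)) {
    my $enc = Encode::find_encoding($name);
    $resolved{$name} = $enc ? $enc->name : undef;
    printf "%-18s listed=%s resolves_to=%s\n",
        $name,
        ($listed{$name} ? 'yes' : 'no'),
        $resolved{$name} // '(none)';
}
```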

Re: converting Japanese chars into their Unicode values using 5.8's Encode

2002-09-19 Thread Nick Ing-Simmons
Robert Allerstorfer <[EMAIL PROTECTED]> writes: >Hello, > >I want to convert source code written in the Japanese shift_jis >character set, into their Unicode numbers. For instance, "検" should >result in "U+691C" (which is 26908 in decimal). I tried using the >Encode module of Perl 5.8 with someth

converting Japanese chars into their Unicode values using 5.8's Encode

2002-09-19 Thread Robert Allerstorfer
Hello, I want to convert source code written in the Japanese shift_jis character set, into their Unicode numbers. For instance, "検" should result in "U+691C" (which is 26908 in decimal). I tried using the Encode module of Perl 5.8 with something like this: use Encode::JP; my $st