Encode-1.42 & PerlIO-encoding-0.01 now available

2002-04-16 Thread Dan Kogai

NI-XS, jhi and porters,

The surgical operation is finished.  The PerlIO layer functions in 
Encode.xs have been successfully detached.  The PerlIO part now lives in 
PerlIO::encoding.  The two are now interdependent rather than one 
depending on the other.  You can get them via the URLs below:

http://www.dan.co.jp/~dankogai/PerlIO-encoding-0.01.tar.gz
http://www.dan.co.jp/~dankogai/Encode-1.42.tar.gz
http://www.dan.co.jp/~dankogai/perl-dan.tar.bz2

The last one is the whole perl with the interdependent versions of Encode 
and PerlIO.  As a matter of fact, you can just replace Encode with 1.42 
above, untar PerlIO-encoding-0.01 under ext/PerlIO/, rename the extracted 
directory to "encoding", and fix the toplevel MANIFEST, and it will work 
perfectly.  Configure needed no modification.

Here is how Encode tests as a module.

> t/Aliases....ok
> t/CN.........ok
> t/Encode.....ok
> t/Encoder....ok
> t/JP.........ok, 6/27 skipped: PerlIO Encoding Needed
> t/KR.........ok, 6/22 skipped: PerlIO Encoding Needed
> t/TW.........ok
> t/Unicode....ok
> t/encoding...ok
> t/grow.......ok
> t/jperl......ok
> All tests successful, 12 subtests skipped.
> Files=11, Tests=4616, 11 wallclock secs ( 7.52 cusr +  0.50 csys =  8.02 CPU)

And with the whole perl and PerlIO:
> ext/Encode/t/CN.........ok
> ext/Encode/t/Encode.....ok
> ext/Encode/t/Encoder....ok
> ext/Encode/t/JP.........ok
> ext/Encode/t/KR.........ok
> ext/Encode/t/TW.........ok
> ext/Encode/t/Unicode....ok
> ext/Encode/t/encoding...ok
> ext/Encode/t/grow.......ok
> ext/Encode/t/jperl......ok
> []
> ext/PerlIO/PerlIO.......ok
> ext/PerlIO/t/encoding...ok
> ext/PerlIO/t/scalar.....ok
> ext/PerlIO/t/via........ok

Note that ext/PerlIO/t/encoding.t was never modified, so it is 100% 
compatible with the prior version.

FYI, these will not be uploaded to CPAN;  I'll wait until perl-current 
catches up.  Also, PerlIO::encoding is not mine but NI-XS's.  So if it is 
to be CPANized, that must be done by NI-XS (though I very much doubt he 
will).

Man, I'm exhausted.  Autrijus, Jungshik, sorry for not responding 
sooner.  Please let me take a nap before I process your new READMEs.

Dan the Encode Maintainer.




[Encode] 1.41 released

2002-04-16 Thread Dan Kogai

Folks,

   I have released Encode ver.1.41 as follows.

Whole:
http://www.dan.co.jp/~dankogai/Encode-1.41.tar.gz
CPAN
Diff:
http://www.dan.co.jp/~dankogai/current-1.41.diff.gz

=head1 CAUTION

This will be the last Encode module that has PerlIO ":encoding()" 
bundled.  From the next version on, that layer will be released 
separately as ext/PerlIO/encoding.  So those who use bleedperl for 
regular business (in spite of -Dusedevel) should probably wait until BOTH 
Encode 1.42 or later AND ext/PerlIO/encoding appear in the perl-current 
repository.

> On Wednesday, April 17, 2002, at 08:25 , Jarkko Hietaniemi wrote:
>> On Wed, Apr 17, 2002 at 07:49:19AM +0900, Dan Kogai wrote:
>> I will go ahead w/ the plan.  I will release the next version with
>> PerlIO part untouched to let us sync.  Then the following version will
>> detach the PerlIO part.  How's that sound ?
>
> Ok.

=head1 Notable Changes

Encode::XS can now handle substitution characters.  The Encode::Encoding 
documentation stated that $enc->encode($str, 0) should do its best to 
replace unmapped characters with substitution characters, but that 
feature was not implemented;  it always acted like $enc->encode($str, 1).  
Now it behaves as documented.
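
As a minimal sketch of the difference between the two CHECK values, the following uses the functional encode() interface of a current Encode release rather than the $enc->encode method of 1.41.  The sample string, the variable names, and the use of the Encode::FB_CROAK (value 1) and Encode::LEAVE_SRC constants are assumptions of this example, not something taken from the announcement.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E9 (e with acute accent) has no mapping in US-ASCII.
my $str = "caf\x{e9}";

# CHECK = 0: unmapped characters are replaced by the substitution character.
my $lossy = encode('ascii', $str, 0);

# CHECK = FB_CROAK: encode() dies on the first unmapped character.
# LEAVE_SRC asks Encode not to modify the source string in place.
my $strict = eval {
    encode('ascii', $str, Encode::FB_CROAK | Encode::LEAVE_SRC);
};
my $croaked = defined $strict ? 0 : 1;

print "CHECK=0 gave '$lossy'; strict mode ",
      ($croaked ? "died" : "succeeded"), "\n";
```

With CHECK = 0 the result is lossy but always defined, which is why it makes a reasonable default for best-effort conversion; the strict mode is the one to reach for when silent data loss is unacceptable.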

I also added a special case for CHECK, -1.  When -1 is fed, well...  
please check for yourself.  You can check it in action via

piconv -p -f foo -t bar

Try

piconv -p -f utf8 -t ascii

to see it clearly.
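
For readers without piconv at hand, here is a small sketch of what the -p switch looks like from Perl code.  It assumes a later Encode release in which this fallback is exposed as Encode::FB_PERLQQ; whether that matches exactly what CHECK = -1 did in 1.41 is an assumption, and the sample string is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+3042 (HIRAGANA LETTER A) cannot be represented in US-ASCII.
my $str = "A\x{3042}B";

# PERLQQ fallback: unmapped characters are written as Perl-style
# \x{HHHH} escapes instead of a substitution character.
# LEAVE_SRC keeps $str itself untouched.
my $escaped = encode('ascii', $str,
                     Encode::FB_PERLQQ | Encode::LEAVE_SRC);

print "$escaped\n";
```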

And Changes right after the sig.

Dan the Encode Maintainer

1.41 $Date: 2002/04/16 23:35:00 $
! encoding.pm
   binmode(STDIN|STDOUT ...) done iff PerlIO is available
! t/*.t
   Cleaned up PerlIO skip conditions to prepare for the upcoming
   Encode - PerlIO forking.
! Encode.pm
   exported functions are now prototyped.
! lib/Encode/CN/HZ.pm
! bin/enc2xs
! Encode.xs
   fallback implemented # was /* FIXME */
   affected programs revised to fit (only HZ was using the try-catch
   approach which needed to be fixed for API-compliance).
! Encode/Config.pm
! Encode/KR/2022_KR.pm
! Encode/KR/KR.pm
   can find =head1 NAME now, jhi
   Message-Id: <[EMAIL PROTECTED]>
! encoding.pm
   s/\{h\}/{$h}/g ;)
! Encode.xs
   now compiles with fewer warnings with the pickiest compilers.
   Suggested by Craig, fixed by Dan.
! Encode/Makefile_PL.e2x
! bin/enc2xs
   A bug that fails to find *.e2x in certain conditions fixed
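
The first two entries above are about doing PerlIO-specific work only when PerlIO is present.  A rough sketch of such a guard follows; the helper name perlio_available is hypothetical, and the sketch only assumes that the check is done with PerlIO::Layer->find, not that encoding.pm or the tests are written exactly this way.

```perl
use strict;
use warnings;

# PerlIO::Layer->find returns a true value when the named layer exists,
# which is only the case on a perl built with PerlIO.
sub perlio_available {
    return PerlIO::Layer->find('perlio') ? 1 : 0;
}

if (perlio_available()) {
    # Only touch the I/O layers when PerlIO is actually there.
    binmode(STDOUT, ':encoding(UTF-8)')
        or die "cannot set encoding layer on STDOUT: $!";
}

print "PerlIO available: ", perlio_available(), "\n";
```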




Re: [Encode] All README.xx available at...

2002-04-16 Thread Autrijus Tang

On Tue, Apr 16, 2002 at 10:35:36AM +0900, Dan Kogai wrote:
> http://www.dan.co.jp/~dankogai/bleedperl/

Wonderful. I learned quite a bit of Japanese just by reading the Kanji
and trying to read out the Katakana characters, and was generally able
to understand the whole article. :)

(I wonder if Kogai-san can similarly read .tw and .cn, and which
 is the easier one. Probably the Simplified version?)

> "00" was prepended to all file names so Apache is happy.  Originals as 
> well as pod2htmlized versions available.  they all look great in 
> Mozilla, okay in IE (on MacOS X).

Attached are the proofread versions from Taipei.pm and ORA Taiwan; they
should be the final version unless some serious bug surfaces.

> Tokyo.pm members, please take a look at them and tell me what you guys 
> think.

*sigh* I missed the YARPC; any plans to do a YAPC::Asia any time soon?
At least we have active Tokyo and Taipei mongers, and it seems (from
their web page) that Seoul.pm is also quite lively. :)

Some nitpicks on .kr and .jp: .kr has a verbatim English paragraph
near the beginning of the article, which seems redundant to me. Also,
both refer to the FOO encoding, which might or might not be intentional.

Thanks,
/Autrijus/


If you read this file _as_is_, just ignore the funny characters you
see. It is written in the POD format (see perlpod manpage) which is
specially designed to be readable as is.

The following documentation is written in Big5 encoding.

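If you are viewing this document in an ordinary text editor, please
ignore the odd markup characters in the text.  This document is written
in POD (a simple documentation format), which is specially designed to
be directly readable as is.  For further information on this format,
please see the perlpod online documentation.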

=head1 NAME

perltw - Perl guide in Traditional Chinese

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with version 5.8.0, Perl has comprehensive Unicode support,
and with it support for many encodings beyond the Latin-based ones;
CJK (Chinese, Japanese, Korean) is among them.  Unicode is an
international standard that attempts to cover all the characters in the
world: the Western world, the Eastern world, and everything in between
(Greek, Syriac, Arabic, Hebrew, Indic, Native American, and so on).  It
also accommodates multiple operating systems and platforms (such as PC
and Macintosh).

Perl itself operates in Unicode.  This means that string data inside
Perl can be represented in Unicode; Perl's functions and operators (for
example, regular expression matching) can also operate on Unicode.  For
input and output, in order to handle data stored in pre-Unicode
encodings, Perl provides the Encode module, which lets you easily read
and write data in legacy encodings.

The Encode extension supports the following Traditional Chinese
encodings:

    big5        The original Big5 encoding (including ETen's Japanese glyphs)
    big5-hkscs  Big5 + Hong Kong Supplementary Character Set
    cp950       Code Page 950 (Big5 + characters added by Microsoft)

For example, to convert a Big5-encoded file to Unicode, just type the
following command:

    perl -Mencoding=big5,STDOUT,utf8 -pe1 < file.big5 > file.utf8

Perl also comes with "piconv", a character conversion utility written
entirely in Perl, used as follows:

    piconv -f big5 -t utf8 < file.big5 > file.utf8
    piconv -f utf8 -t big5 < file.utf8 > file.big5

In addition, with the encoding module you can easily write code that
works in units of characters, as shown below:

    #!/usr/bin/env perl
    # Enable big5 string parsing; standard input/output and standard
    # error are all set to the big5 encoding
    use encoding 'big5', STDIN => 'big5', STDOUT => 'big5';
    print length("駱駝");            #  2 (double quotes denote characters)
    print length('駱駝');            #  4 (single quotes denote bytes)
    print index("諄諄教誨", "彖帢"); # -1 (does not contain this substring)
    print index('諄諄教誨', '彖帢'); #  1 (starts at the second byte)

In the last example, the second byte of "諄" combines with the first
byte of the following "諄" to form the Big5 code for "彖"; the second
byte of "諄" in turn combines with the first byte of "教" to form "帢".
This solves a problem that used to be common when matching Big5 text.
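
The same false match can be reproduced without the encoding pragma (which has since been deprecated) by decoding explicitly.  This is only a sketch: the byte values below are read off the byte-level rendering of the strings in the example above, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Big5 bytes of the four-character haystack and the two-character needle
# from the example above (values inferred from the archived text).
my $hay_bytes    = "\xBD\xCE\xBD\xCE\xB1\xD0\xBB\xA3";
my $needle_bytes = "\xCE\xBD\xCE\xB1";

# Byte semantics: the needle "matches" across character boundaries.
my $byte_pos = index($hay_bytes, $needle_bytes);

# Character semantics: decode both strings first, then search.
my $hay_chars    = decode('big5', $hay_bytes);
my $needle_chars = decode('big5', $needle_bytes);
my $char_pos     = index($hay_chars, $needle_chars);

print "byte search: $byte_pos, character search: $char_pos\n";
```

The byte search reports a hit at offset 1 and the character search reports no hit, mirroring the 1 and -1 in the comments of the original example.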

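=head2 Additional Chinese encodings

If you need more Chinese encodings, you can download the
Encode::HanExtra module from CPAN (L).  It currently provides the
following encodings:

    euc-tw      Unix Extended Character Set, containing CNS11643 planes 1-7
    big5plus    Big5+ from the Chinese Foundation for Digitization Technology

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    big5-simp   Big5 Traditional Chinese <=> Unicode Simplified Chinese
    gbk-trad    GBK Simplified Chinese <=> Unicode Traditional Chinese

If you want to convert between GBK and Big5, see the b2g.pl and g2b.pl
programs bundled with that module, or use the following in your program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5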

=head2 Further information

Please refer to the extensive documentation that comes with Perl
(unfortunately all written in English) to learn more about Perl and
about how to use Unicode.  That said, external resources are quite
plentiful:

=head2 Sites offering Perl resources

=over 4

=item L

The Perl home page (maintained by O'Reilly)

=item L

Comprehensive Perl Archive Network

=item L

List of Perl mailing lists

=back

=head2 Sites for learning Perl

=over 4

=item L

O'Reilly Perl books in Traditional Chinese

=item L

Taiwan Perl networked discussion boards (that is, the Perl boards of
the major BBSes)

=back

=head2 Perl user groups

=over 4

=item L

List of Taiwan Perl Mongers groups

=item L

Elixus online chat room

=back

=head2 Unicode-related sites

=over 4

=item L

Unicode 學

Re: iso-2022-jp problem

2002-04-16 Thread Nick Ing-Simmons

Dan Kogai <[EMAIL PROTECTED]> writes:
>On Tuesday, April 16, 2002, at 01:06 , Nick Ing-Simmons wrote:
>> So we need some way of telling from an encoding object (e.g.
>> an attribute or a method call) that it needs line buffering
>> so that :encoding layer can take the appropriate steps.
>
>Okay, which way do you like, attribute or method ?  

Let us make it a method - then most encodings can inherit as default.

>I think method is
>more elegant but attribute seems easier to fetch.  Since this is more
>for PerlIO than Encode itself, I would appreciate if you gave me the API
>(just name would be enough) 

But hard to think of - don't blame you for asking.

>and I will add them to ISO-2022 stuff (not
>just JP but KR has one, too).

Add 

sub line_aware { '' }  # false 

to Encode::Encoding

and 

sub line_aware { "\n" }  # true, hint at chars that will do 

to those that need it.

Exact meaning of value of string (other than as truth value) to be 
thrashed out later. May either be $/ like, or perhaps a "set" or ...
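
A minimal sketch of how the proposal above could play out, using stand-in packages.  Sketch::Encoding, Sketch::ISO2022 and needs_line_buffering are hypothetical names for illustration only; they are not the real Encode classes or the actual :encoding layer code.

```perl
use strict;
use warnings;

# Stand-in for the base class: by default an encoding is not line aware.
package Sketch::Encoding;
sub new { my ($class, %args) = @_; return bless {%args}, $class }
sub line_aware { '' }    # false

# Stand-in for an escape-sequence based encoding such as ISO-2022-JP.
package Sketch::ISO2022;
our @ISA = ('Sketch::Encoding');
sub line_aware { "\n" }  # true, hint at chars that will do

package main;

# What a layer might do: ask the encoding object, using only the
# truth value of the answer.
sub needs_line_buffering {
    my ($enc) = @_;
    return $enc->line_aware ? 1 : 0;
}

my $plain   = Sketch::Encoding->new(name => 'plain');
my $escaped = Sketch::ISO2022->new(name => 'iso-2022-like');

printf "%s: %d\n", $_->{name}, needs_line_buffering($_)
    for $plain, $escaped;
```

Making it a method means only the encodings that need line buffering have to override anything; every other encoding inherits the false default.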


>
>Dan
-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/