On Tue, Aug 06, 2002 at 10:36:09PM +0900, SADAHIRO Tomoyuki wrote:
>
> On Mon, 5 Aug 2002 22:17:10 +0100
> Nicholas Clark <[EMAIL PROTECTED]> wrote:
>
> > I'm trying to backport ExtUtils::Constant from 5.8.0 to work on perl pre
> > 5.8.0. Currently ExtUtils::Constant is using utf8::encode and utf8::decode
> > to convert Unicode strings to and from their internal byte representation
> > for testing purposes.
> >
> > For 5.005_03 I don't have a problem - I just skip all the Unicode tests! :-)
> > However, for 5.6.1 (and 5.6.0) I do. I can't work out how to (legally!) get
> > perl to give me the utf8 bytes that represent the Unicode strings, or how
> > to translate a sequence of utf8 bytes back into a perl Unicode string.
> >
> > So how should I write utf8::encode and utf8::decode for 5.6.1 and 5.6.0?
> > I can cope if a different solution is needed on both.
>
> How about these codelets?
> (sorry, I haven't tried them on 5.6.0).

Thanks. They seem to work very well on 5.6.1.

After spending a couple of nights fighting all the Unicode bugs and
unhelpfulness in 5.6.1 with various workarounds, I gave up on the idea of
5.6.0 - it's just too much trouble.

> The test.t of my Unicode::Normalize uses many pack() and unpack()
> as tests should be passed both on Perl 5.6.1 and on 5.8.0,
> and via XS and via Non-XS;
> but this technique seems not to be portable to EBCDIC. :-/

I've not got access to EBCDIC, so I've no idea what will go wrong.

However, ExtUtils-Constant-0.13.tar.gz is currently working its way round
CPAN. I couldn't find any sort of tied hash implementation on CPAN that would
let me reliably mix UTF8 and 8 bit scalars as hash keys for 5.6.1, so I
knocked up a quick one based on your unpack/pack code. (Although I'm storing
the hash keys as a string of BER compressed integers rather than UTF8 bytes.)

Did I miss one, or would this be a useful small module to separate out and
upload to CPAN in its own right? Clearly 5.8.0 doesn't need it:

____________________________________________________________________________
[  7980] By: jhi                                   on 2000/12/04  19:36:51
        Log: UTF-8 hash keys, patch from Inaba Hiroto.
     Branch: perl
           ! embed.h embed.pl hv.c hv.h pod/perlapi.pod proto.h
____________________________________________________________________________

but I guess there are people needing to stick on 5.6.1 who might find it
useful.

My experience of trying to manipulate data that is sometimes 8 bit, sometimes
UTF-8 on 5.6.1? "Aaaaaaaaaaaaargh". I'd really strongly recommend upgrading
to 5.8.0, where hashes, s/// and tr/// "just work".

If anyone here tries ExtUtils::Constant and finds bugs, particularly in the
Unicode/UTF8 bits, please don't hesitate to report them.

Nicholas Clark
-- 
Even better than the real thing:        http://nms-cgi.sourceforge.net/
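
For anyone curious what the tied hash mentioned above might look like, here
is a minimal sketch of the idea: each key is turned into its code points with
unpack "U*" and stored as a string of BER compressed integers (pack "w*"), so
an 8 bit key and a UTF8 key with the same characters land in the same slot.
The package name BERKeyHash and the helper names are made up for illustration;
this is not the code shipped in ExtUtils::Constant, and it has only been
reasoned through against current perls, not run on 5.6.1.

```perl
package BERKeyHash;    # hypothetical name, for illustration only
use strict;

# Key -> string of BER compressed integers, one per character code point.
sub _encode { pack 'w*', unpack 'U*', $_[0] }

# Stored key -> character string (pack "U*" yields a Unicode string).
sub _decode { pack 'U*', unpack 'w*', $_[0] }

sub TIEHASH { bless {}, shift }
sub STORE   { $_[0]{ _encode($_[1]) } = $_[2] }
sub FETCH   { $_[0]{ _encode($_[1]) } }
sub EXISTS  { exists $_[0]{ _encode($_[1]) } }
sub DELETE  { delete $_[0]{ _encode($_[1]) } }
sub CLEAR   { %{ $_[0] } = () }

sub FIRSTKEY {
    my $self  = shift;
    my $reset = keys %$self;    # scalar keys() resets the each() iterator
    my $k     = each %$self;
    return defined $k ? _decode($k) : undef;
}

sub NEXTKEY {
    my $k = each %{ $_[0] };
    return defined $k ? _decode($k) : undef;
}

package main;

tie my %h, 'BERKeyHash';
$h{"caf\xE9"}  = 'eight bit key';    # 8 bit scalar as key
$h{"\x{263A}"} = 'wide key';         # key needing UTF8 internally
```

Keys come back from keys() and each() as Unicode strings, whatever form they
went in as, which is the price of normalising them on the way in.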