On Thu, May 23, 2002 at 11:32:12AM +0100, Greg McCarroll wrote:
>
> Does anyone have any strong feelings (or even weak ones) about
> 5.005_03 vs 5.6.1. I'd be especially interested in any
> stories/feelings connected with Solaris or Oracle.

For starters, here is my direct 5.6.0 dislike; it also illustrates the printing problem in the 5.6.x series:

$ cat S560crap.t
#!/usr/local/bin/perl5.6.0 -w
use strict;

my $byte = "é";
my $utf8 = $byte . chr 256;
chop $utf8;

if ($utf8 eq $byte) {
  printf "Yes\n";
} else {
  print "No\n";
  printf "%d byte of %d, %d byte of %d\n",
    length $utf8, ord $utf8, length $byte, ord $byte;
}
print "byte: $byte\n";
print "utf8: $utf8\n";
__END__
$ perl5.00503 S560crap.t
Yes
byte: é
utf8: é
$ perl5.6.0 S560crap.t
No
1 byte of 233, 1 byte of 233
byte: é
utf8: é
$ perl5.6.1 S560crap.t
Yes
byte: é
utf8: é
$ perl5.7.3 S560crap.t
Yes
byte: é
utf8: é

But as for my general 5.6.1 dislike: I don't trust it. How to spot 5.6.1 from quite a long way away:

$ cat spot_56x.pl
#!/usr/local/bin/perl5.6.1 -w
use strict;

my $byte = "é";
my $utf8 = $byte;
$utf8 .= chr 256;
chop $utf8;

my %hash = ($utf8, "value");
my ($key) = keys %hash;
if ($key eq $utf8) {
  print "hash keys ok\n";
} else {
  print "hash keys not ok - put in $utf8 (ie ", $byte, "), got $key\n";
}

my $copy = $utf8;
$copy =~ s/././g;
if (length $copy == length $utf8) {
  print "regexp ok\n";
} else {
  print "regexp not ok - put in $utf8 (ie ", $byte, "), got $copy\n";
}

my $pid = open CHILD, "|-";
die "|- failed: $!" unless defined $pid;
if ($pid) {
  # Parent
  print CHILD $utf8;
  close CHILD or die;
} else {
  # Child
  my $io = <STDIN>;
  if ($io eq $utf8) {
    print "io ok\n";
  } else {
    print "io not ok - put in $utf8 (ie ", $byte, "), got $io\n";
  }
}
__END__
$ perl5.00503 spot_56x.pl
hash keys ok
regexp ok
io ok
$ perl5.6.1 spot_56x.pl
hash keys not ok - put in é (ie é), got é
regexp not ok - put in é (ie é), got ..
io not ok - put in é (ie é), got é
$ perl5.7.3 spot_56x.pl
hash keys ok
regexp ok
io ok

Of course, the inability of 5.6.1 to print Latin 1 stored as Unicode makes the diagnostic output a bit messy. And I needed to pass $byte as a separate list item (", $byte, ") to stop it getting interpolated into a utf8 string and then garbled.

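To make the byte versus utf8 distinction concrete, here is a minimal sketch. It assumes a later perl than the ones tested above (5.8 or newer), since it relies on utf8::upgrade and bytes::length, neither of which appears in my scripts. It should show that the same character can be stored in two different internal forms and still compare equal; 5.6.x trips up in exactly the places where those two forms are not treated as the same string.

```perl
#!/usr/bin/perl -w
use strict;
use bytes ();                # load bytes::length without enabling byte semantics

my $byte = "\xE9";           # e-acute as a single Latin 1 byte
my $utf8 = $byte;
utf8::upgrade($utf8);        # same character, now stored internally as UTF-8

# The two strings are the same character sequence...
printf "equal: %s\n", ($utf8 eq $byte ? "yes" : "no");
printf "chars: %d vs %d\n", length $byte, length $utf8;

# ...but the internal storage differs (0xE9 vs 0xC3 0xA9).
printf "internal bytes: %d vs %d\n",
  bytes::length($byte), bytes::length($utf8);
```

On a perl with working Unicode support the first two lines should report equality and a length of 1 for both, while the last line should report 1 byte against 2.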
Basically, if any of your 8 bit data happens to get converted into utf8 by 5.6.1, it is likely to get mangled.

Nicholas Clark