5.005_03 v 5.6.1

2002-05-23 Thread Greg McCarroll


Does anyone have any strong feelings (or even weak ones) about
5.005_03 vs 5.6.1? I'd be especially interested in any
stories/feelings connected with Solaris or Oracle.
  


-- 
Greg McCarroll http://www.mccarroll.org.uk/~gem/
   jabber:[EMAIL PROTECTED]




Re: 5.005_03 v 5.6.1

2002-05-23 Thread Nicholas Clark

On Thu, May 23, 2002 at 11:32:12AM +0100, Greg McCarroll wrote:
> 
> Does anyone have any strong feelings (or even weak ones) about
> 5.005_03 vs 5.6.1? I'd be especially interested in any
> stories/feelings connected with Solaris or Oracle.

For starters, here is the reason for my direct dislike of 5.6.0. It also
illustrates the printing problem in the 5.6.x series:

$ cat S560crap.t
#!/usr/local/bin/perl5.6.0 -w
use strict;

my $byte = "é";    # a single Latin-1 byte, 0xE9

# Appending chr 256 forces the internal UTF-8 representation;
# chop then removes that extra character again.
my $utf8 = $byte . chr 256;
chop $utf8;


if ($utf8 eq $byte) {
  printf "Yes\n";
} else {
  print "No\n";
  printf "%d byte of %d, %d byte of %d\n", length $utf8, ord $utf8,
    length $byte, ord $byte;
}

print "byte: $byte\n";
print "utf8: $utf8\n";
__END__

$ perl5.00503 S560crap.t 
Yes
byte: é
utf8: é
$ perl5.6.0 S560crap.t 
No
1 byte of 233, 1 byte of 233
byte: é
utf8: é
$ perl5.6.1 S560crap.t 
Yes
byte: é
utf8: é
$ perl5.7.3 S560crap.t 
Yes
byte: é
utf8: é


As for my general dislike of 5.6.1: I don't trust it. Here is how to spot
5.6.1 from quite a long way away:


$ cat spot_56x.pl
#!/usr/local/bin/perl5.6.1 -w
use strict;

my $byte = "é";    # a single Latin-1 byte, 0xE9

# The same character, but stored internally as UTF-8.
my $utf8 = $byte;
$utf8 .= chr 256; chop $utf8;

my %hash = ($utf8, "value");

my ($key) = keys %hash;

if ($key eq $utf8) {
  print "hash keys ok\n";
} else {
  print "hash keys not ok - put in $utf8 (ie ", $byte, "), got $key\n";
}

my $copy = $utf8;
$copy =~ s/././g;    # one dot per character the regexp engine sees

if (length $copy == length $utf8) {
  print "regexp ok\n";
} else {
  print "regexp not ok - put in $utf8 (ie ", $byte, "), got $copy\n";
}

# Fork a child whose STDIN is fed from the CHILD handle.
my $pid = open CHILD, "|-";
die "|- failed: $!" unless defined $pid;

if ($pid) {
  # Parent
  print CHILD $utf8;
  close CHILD or die;
} else {
  my $io = <STDIN>;
  if ($io eq $utf8) {
    print "io ok\n";
  } else {
    print "io not ok - put in $utf8 (ie ", $byte, "), got $io\n";
  }
}
__END__

$ perl5.00503 spot_56x.pl
hash keys ok
regexp ok
io ok
$ perl5.6.1 spot_56x.pl
hash keys not ok - put in é (ie é), got é
regexp not ok - put in é (ie é), got ..
io not ok - put in é (ie é), got é
$ perl5.7.3 spot_56x.pl
hash keys ok
regexp ok
io ok


Of course, the inability of 5.6.1 to print Latin 1 stored as Unicode makes
the diagnostic output a bit messy. And I needed ", $byte, " as a separate
print argument to stop it getting interpolated into utf8 and then garbled.

Basically, if any of your 8-bit data happens to get converted into utf8
by 5.6.1, it is likely to get mangled.
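
For anyone wanting to check whether a given string has been converted
behind their back, the following is a minimal sketch. It assumes perl
5.8.1 or later, where utf8::is_utf8 and utf8::downgrade are available
without loading anything, so it is no help on 5.6.1 itself. The demote
helper is a hypothetical name made up for this illustration, not part of
any module.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: return a copy of the string converted back to
# the 8-bit internal representation where that is possible.
sub demote {
  my ($string) = @_;
  # utf8::downgrade works in place. The true second argument makes it
  # return false, rather than die, if a character is above 255.
  utf8::downgrade($string, 1);
  return $string;
}

my $byte = "\xE9";           # e-acute as a single Latin-1 byte
my $utf8 = $byte . chr 256;  # forces the internal UTF-8 representation
chop $utf8;                  # same character as $byte, still UTF-8 inside

print "converted: ", (utf8::is_utf8($utf8) ? "yes" : "no"), "\n";

my $fixed = demote($utf8);
print "after demote: ", (utf8::is_utf8($fixed) ? "yes" : "no"), "\n";
print "equal to byte: ", ($fixed eq $byte ? "yes" : "no"), "\n";
```

Calling demote on data just before it goes into a hash key, a regexp or
a filehandle is one way to keep 8-bit data 8-bit. It does nothing useful
for strings that really do contain characters above 255.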

Nicholas Clark