[EMAIL PROTECTED] said:
> $ perl -e '$x = "\x{2019}\nk"; $x =~ s/(\S)\n(\S)/$1 $2/sg; print "$x\n";'
> '   <= this denotes a \x{2019} followed by \n
> k
> $ perl -e
>
> $ perl -e '$x = "b\nk"; $x =~ s/(\S)\n(\S)/$1 $2/sg; print "$x\n";'
> b k
>
> [snip]
>
> $ perl -e 'print (("\x{2019}" =~ /\S/) . "\n");'
> 1

This behavior certainly does seem to contradict expectations. I even thought
that the third test might not be exactly equivalent to the first, so I tried
this:

$ perl -e '$x = "\x{2019}"; print "x2019 matches \\S\n" if ( $x =~ /\S/ );'
x2019 matches \S

But since Perl provides many ways of doing the same thing (or at least trying
to), there is an "idiom" that will produce the expected result:

require 5.008;
use Encode;
$x = encode( "utf8", "\x{2019}\nk" );
$x =~ s/(\S)\n(\S)/$1 $2/sg;
print "$x\n";
__END__
__OUTPUT__
' k

Even in this case, I was puzzled as to why I got the expected behavior by
using "encode()" this way, but not when I used "decode()" instead. (I would
have expected it to be the other way around.) Go figure...

Dave Graff
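
A minimal sketch of what the encode() call changes, assuming a Perl with the core Encode module (5.8 or later). It turns the three-character string into five UTF-8 bytes, so the regex then runs on bytes and (\S) matches the last byte of the encoded quote rather than the character \x{2019}. The variable names are illustrative only, and this does not explain why the unencoded string failed to match in the original test.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Character string: U+2019 (right single quotation mark), newline, "k".
my $chars = "\x{2019}\nk";

# encode() returns a byte string holding the UTF-8 encoding of $chars.
my $bytes = encode("utf8", $chars);

printf "characters: %d\n", length $chars;   # 3
printf "bytes:      %d\n", length $bytes;   # 5

# The substitution from the post, applied to the byte string: the (\S)
# before the newline matches the final byte of the encoded quote (0x99).
(my $joined = $bytes) =~ s/(\S)\n(\S)/$1 $2/sg;

# Print the result as hex so the output does not depend on the terminal.
print join(" ", map { sprintf "%02x", ord } split //, $joined), "\n";
# prints: e2 80 99 20 6b
```

Because the result is still a byte string, printing it to a UTF-8 terminal shows the quote correctly, which matches the "' k" output above.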