>>>>> On Fri, 02 Jan 2004 18:17:13 -0500, Martin Duerst <[EMAIL PROTECTED]> said:
>>> Jungshik has also reported that
>>> it fails with Perl 5.8.0 with an UTF-8 locale.
>>
>> Perl 5.8.0 was very broken with UTF-8 locales since it "auto-PERL_UNICODEd".
>> We saw (keep seeing) a lot of that since RedHat 8 and 9 had the unfortunate
>> combination of both Perl 5.8.0 _and_ UTF-8 locales (which the users didn't
>> expect/know about/care about). Lots of code that expected to produce e.g.
>> 0xff started to produce 0xc3 0xbf. Bang!
>> Use rather 5.8.1 or later.

> If it were just me, that would be easy. But stating on an FAQ
> page 'use Perl 5.8.1 or later' for something that worked
> probably even in Perl 4 doesn't look like a good idea.

I seem to remember I heard Matt Sergeant (CC'd; Hi Matt, sorry if I
misremember) say that he has a large codebase that works with perl
5.00503, 5.6.x and 5.8.x. I don't think that the tricks you need to
program around the Unicode cliffs through perl versions are collected
in a document.

I can say for sure that I have managed to have the PAUSE code
(ftp://pause.perl.org/pub/PAUSE/PAUSE-code/) run under both 5.6.1 and
5.8.x. The typical idiom I used was:

    if ($] > 5.007) {
      require Encode;
      # let Encode do some tweaking
    }

The tricks that I used, have found their way into
perlunicode.pod/"Porting code from perl-5.6.X".

I suppose your one-liner would work with (untested)

    #!/usr/bin/perl -pi~ -0777
    # program to remove a leading UTF-8 BOM from a file
    # works both STDIN -> STDOUT and on the spot (with filename as argument)
    if ($] > 5.007) {
      require Encode;
      Encode::_utf8_off($_);
    }
    s/^\xEF\xBB\xBF//s;

>>> What I'm looking for is a very simple way to write perl programs
>>> that work on byte streams. This should be possible without depending
>>> on versions, working both on very old versions as well as future
>>> versions.
>>
>> Off-hand I can say that getting both 5.6 and 5.8 work at the same time
>> may be impossible in spots simply because 5.6 was badly unfinished as
>> regards to Unicode. No, it won't get fixed. Beyond 5.8, I don't.

> Sorry, I think you missed something in the last sentence. Did you
> want to say "I don't know?".

>> Some people may have some tricks they use to get Unicode code working both
>> in 5.6 and 5.8, but _in_principle_ the bytes pragma should tell Perl in
>> both 5.6 and 5.8 that "I want bytes, darn it."

> Yes, that seems to do the job. But is this available in 5.0 or earlier?
> Or is it possible to write some little code at the start that says
> something like:
> if (eval "use bytes;") { use bytes; }

That would be

    use if $] >= 5.006, "bytes";

But you would have to make sure that if.pm is available, no option IMO.

> (without making the actual invocation restricted to the { ... } ?

--
andreas
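
P.S. For illustration only: the version guard can probably be written without if.pm by doing by hand what "use bytes" does at compile time. The following is a minimal sketch under that assumption, not tested on old perls. The name strip_bom is hypothetical and appears neither in the PAUSE code nor in the one-liner; the Encode guard is the same idiom as in the message.

```perl
#!/usr/bin/perl
# Sketch only: enable the bytes pragma where it exists, without if.pm.
BEGIN {
    if ($] >= 5.006) {      # bytes.pm first shipped with perl 5.6
        require bytes;
        bytes->import;      # what "use bytes" does at compile time
    }
}

# strip_bom is a hypothetical helper name used for illustration.
# It returns its argument with a leading UTF-8 BOM removed.
sub strip_bom {
    my ($data) = @_;
    if ($] > 5.007) {
        require Encode;
        Encode::_utf8_off($data);   # treat $data as raw octets
    }
    $data =~ s/^\xEF\xBB\xBF//s;    # the three BOM octets, at the start only
    return $data;
}

print strip_bom("\xEF\xBB\xBFhello\n");
```

Because the BEGIN block is not wrapped in any other block, the pragma should apply to the rest of the file rather than to a { ... } scope. On a perl older than 5.6 the block does nothing and the code runs on plain byte strings anyway.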