>>>>> On Fri, 02 Jan 2004 18:17:13 -0500, Martin Duerst <[EMAIL PROTECTED]> said:

 >>> Jungshik has also reported that
 >>> it fails with Perl 5.8.0 with an UTF-8 locale.
 >> 
 >> Perl 5.8.0 was very broken with UTF-8 locales since it "auto-PERL_UNICODEd".
 >> We saw (keep seeing) a lot of that since RedHat 8 and 9 had the unfortunate
 >> combination of both Perl 5.8.0 _and_ UTF-8 locales (which the users didn't
 >> expect/know about/care about).  Lots of code that expected to produce e.g.
 >> 0xff started to produce 0xc3 0xbf.  Bang!
 >> Use rather 5.8.1 or later.

  > If it were just me, that would be easy. But stating on an FAQ
  > page 'use Perl 5.8.1 or later' for something that worked
  > probably even in Perl 4 doesn't look like a good idea.

I seem to remember hearing Matt Sergeant (CC'd; Hi Matt, sorry if I
misremember) say that he has a large codebase that works with perl
5.00503, 5.6.x and 5.8.x. I don't think the tricks you need to
program around the Unicode cliffs between perl versions are collected
in a single document.

I can say for sure that I have managed to have the PAUSE code
(ftp://pause.perl.org/pub/PAUSE/PAUSE-code/) run under both 5.6.1 and
5.8.x.

The typical idiom I used was:

    if ($] > 5.007) {
      require Encode;
      # let Encode do some tweaking
    }

The tricks that I used have found their way into
perlunicode.pod/"Porting code from perl-5.6.X".
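
A minimal sketch of what "let Encode do some tweaking" could look like.
This is not taken from the PAUSE code: as_bytes is a made-up helper
name, and the choice to convert character strings into UTF-8 octets is
an assumption about what a byte-oriented program would want.

```perl
# Hypothetical helper: return the argument as a plain octet string.
# On perls without Encode (before 5.8) the string is passed through
# untouched, since those perls have nothing to convert.
sub as_bytes {
    my ($str) = @_;
    if ($] > 5.007) {
        require Encode;
        # Only strings that perl has flagged as characters need encoding.
        $str = Encode::encode_utf8($str) if Encode::is_utf8($str);
    }
    return $str;
}

# Usage: on 5.8 or later, U+263A becomes the three octets E2 98 BA.
my $octets = as_bytes(chr(0x263A));
print length($octets), "\n";
```

The version test keeps the require out of the way on older perls, so
the same file still loads where Encode does not exist.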

I suppose your one-liner would work with (untested)

       #!/usr/bin/perl -pi~ -0777
       # program to remove a leading UTF-8 BOM from a file
       # works both STDIN -> STDOUT and on the spot (with filename as argument)
       if ($] > 5.007) {
         require Encode;
         Encode::_utf8_off($_);
       }
       s/^\xEF\xBB\xBF//s;


 >>> What I'm looking for is a very simple way to write perl programs
 >>> that work on byte streams. This should be possible without depending
 >>> on versions, working both on very old versions as well as future
 >>> versions.
 >> 
 >> Off-hand I can say that getting both 5.6 and 5.8 work at the same time
 >> may be impossible in spots simply because 5.6 was badly unfinished as
 >> regards to Unicode.  No, it won't get fixed.  Beyond 5.8, I don't.

  > Sorry, I think you missed something in the last sentence. Did you
  > want to say "I don't know?".

 >> Some people may have some tricks they use to get Unicode code working both
 >> in 5.6 and 5.8, but _in_principle_ the bytes pragma should tell Perl in
 >> both 5.6 and 5.8 that "I want bytes, darn it."

  > Yes, that seems to do the job. But is this available in 5.0 or earlier?
  > Or is it possible to write some little code at the start that says
  > something like:

  > if (eval "use bytes;") { use bytes; }

That would be

  use if $] >= 5.006, "bytes";

But you would have to make sure that if.pm is available, which is no option IMO.
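
A sketch of one way to get the same effect without if.pm, assuming the
usual rule that a pragma's import, when run from a BEGIN block, applies
to the rest of the enclosing file. Untested on perls older than 5.6;
the sample data is made up for illustration.

```perl
# Load the bytes pragma only where it exists (perl 5.6 and later).
# Calling import from BEGIN sets the compile-time hints for the code
# that follows, which is what a plain "use bytes;" would do.
BEGIN {
    if ($] >= 5.006) {
        require bytes;
        bytes->import;
    }
}

# From here on, string operations work on octets where bytes is in effect.
my $data = "\xEF\xBB\xBFhello";
$data =~ s/^\xEF\xBB\xBF//;    # strip a leading UTF-8 BOM
print length($data), "\n";
```

Note that an eval "use bytes;" would not do the job: the pragma would
be in effect only inside the eval'd string, not in the surrounding code.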

  > (without making the actual invocation restricted to the { ... } ?


-- 
andreas
