I have to write a simple function that strips the various newline
conventions out of text files and replaces them with the standard Unix
newline, \n. After reading the perlport doc, I'm even more confused
than before.

          LF  eq  \012  eq  \x0A  eq  \cJ  eq  chr(10)  eq  ASCII 10
          CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  ASCII 13
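
A quick check along these lines should confirm the equivalences above on
an ASCII platform, and that \cJ is a single character rather than a
pair. This is only a sketch; all_equal is a helper name made up for it.

```perl
use strict;
use warnings;

# Hypothetical helper: true if every string in the list equals the first.
sub all_equal {
    my ($first, @rest) = @_;
    return !grep { $_ ne $first } @rest;
}

# The four spellings of LF and of CR from the perlport table.
print "LF: ", (all_equal("\012", "\x0A", "\cJ", chr(10)) ? "same" : "differ"), "\n";
print "CR: ", (all_equal("\015", "\x0D", "\cM", chr(13)) ? "same" : "differ"), "\n";

# \cJ is one character, whatever the I/O layer later does with it.
print "length of \\cJ: ", length("\cJ"), "\n";
```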



                   | Unix | DOS  | Mac  |
              ---------------------------
              \n   |  LF  |  LF  |  CR  |
              \r   |  CR  |  CR  |  LF  |
              \n * |  LF  | CRLF |  CR  |
              \r * |  CR  |  CR  |  LF  |
              ---------------------------
              * text-mode STDIO


I open the file in text mode and do the following:

               while (defined(my $line = <INFILE>))
               {
                       my $outline;
                       if ($line =~ m/\cM\cJ/)
                       {
                               print "dos\n";
                               ($outline = $line) =~ s/\cM\cJ/\cJ/; #win32

                       } elsif ($line =~ m/\cM(?!\cJ)/) {
                               print "mac\n";
                               ($outline = $line) =~ s/\cM/\cJ/g; #mac
                       } else {
                               print "other\n";
                               $outline = $line; #default
                       }

                       print OUTFILE $outline;
               }

It works fine on Unix: I run the unit tests against old Mac, Windows,
and Unix files and check the output with hexdump -C. However, when I
run it on Win32 Perl 5.6.1, it does not do any replacement. The lines
remain unchanged.
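
One variation that may be worth trying, assuming the Win32 text-mode
layer is converting CRLF to LF before the regex ever sees the line: put
both handles in binmode and normalize the raw bytes. This is a minimal
sketch, not a tested fix for 5.6.1; normalize_newlines and convert_file
are names made up for it.

```perl
use strict;
use warnings;

# Hypothetical helper: turn CRLF and bare CR into LF, leave LF alone.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\cM\cJ?/\cJ/g;    # CR optionally followed by LF -> LF
    return $text;
}

# Hypothetical helper: copy $in to $out with binmode on both handles,
# so no text-mode translation happens on read or on write.
sub convert_file {
    my ($in, $out) = @_;
    open(my $ifh, '<', $in)  or die "Cannot read $in: $!";
    open(my $ofh, '>', $out) or die "Cannot write $out: $!";
    binmode($ifh);
    binmode($ofh);
    local $/;                          # slurp: old Mac files have no LF to split on
    my $text = <$ifh>;
    $text = '' unless defined $text;   # guard against an undefined read
    print {$ofh} normalize_newlines($text);
    close($ifh);
    close($ofh) or die "Cannot close $out: $!";
    return;
}
```

Slurping the whole file is a design choice for simplicity; it assumes
the files are small enough to hold in memory.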

My understanding is that, in text-mode STDIO, \n maps to CR (Mac),
CRLF (DOS), or LF (Unix), depending on which OS Perl is running on. So
replacing CR (not followed by LF) with LF should work for Mac files,
replacing CRLF with LF should work for DOS files, and LF should be left
untouched for *nix files (the "other" case). That shouldn't be a
problem, yet it appears that \cJ behaves differently on Win32 than it
does on Unix.
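
One way to probe this on any platform, assuming Perl 5.8 or later where
open accepts I/O layers: read the same raw file once through the :crlf
layer (which imitates Win32 text mode) and once through :raw, and see
whether the CR survives. This is a sketch only; write_raw and
first_line_via are hypothetical helper names.

```perl
use strict;
use warnings;

# Hypothetical helper: write bytes to a file with no newline translation.
sub write_raw {
    my ($path, $bytes) = @_;
    open(my $fh, '>', $path) or die "Cannot write $path: $!";
    binmode($fh);
    print {$fh} $bytes;
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Hypothetical helper: return the first line of $path as read through
# the given layer, e.g. ':crlf' (text-mode style) or ':raw' (binary).
sub first_line_via {
    my ($path, $layer) = @_;
    open(my $fh, "<$layer", $path) or die "Cannot read $path: $!";
    my $line = <$fh>;
    close($fh);
    return $line;
}
```

For a file written with write_raw($path, "one\cM\cJtwo\cM\cJ"),
first_line_via($path, ':crlf') should hand back "one\cJ" with the CR
already stripped, while first_line_via($path, ':raw') should hand back
"one\cM\cJ". If that holds, a pattern like m/\cM\cJ/ has nothing to
match in text mode.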

So is \cJ actually \cM\cJ on Win32?



--
Anthony Ettinger
Signature: http://chovy.dyndns.org/hcard.html
