On Sun, May 30, 2010 at 1:33 AM, Paul Boniol <[email protected]> wrote:
> On Fri, May 28, 2010 at 9:54 AM, Michael Chaney <[email protected]>
> wrote:
>>
>> perl -p -i -e 'chomp; s/\r(?!\n)/\n/g; print;' files-to-fix
>
> For the record:  print is not recommended with Perl in-place editing
> (-p -i).  It will duplicate the lines. For a CR line-terminated file, the
> whole file is read at once, so it is like cat filename filename >> filename2;
> mv filename2 filename.  (Or, if you pass an LF or CR/LF terminated file,
> print with in-place edit duplicates each line as it goes.)  Neither is
> likely desired.  The chomp does nothing for CR line-terminated files (and
> would delete Linux/DOS line terminators if you accidentally process an LF
> or CR/LF file).

Wow!  Yes, bad perl.  I ran it with "-n -e" then copied.  So what you
want is this:

perl -p -i -e 's/\r(?!\n)/\n/g;' files-to-fix
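
If you want to convince yourself before pointing it at real data, here is a
rough sketch that runs the same one-liner on two throwaway files. The file
names and contents are made up for illustration, and it assumes perl, od and
mktemp are on your PATH.

```shell
# Work in a scratch directory so nothing real is modified.
tmpdir=$(mktemp -d)

# Hypothetical sample files: one with bare CRs (old Mac), one with CRLF (DOS).
printf 'a\rb\rc\r' > "$tmpdir/mac.txt"
printf 'a\r\nb\r\n' > "$tmpdir/dos.txt"

# Same one-liner as above, applied in place to both files.
perl -p -i -e 's/\r(?!\n)/\n/g;' "$tmpdir/mac.txt" "$tmpdir/dos.txt"

# Dump each file as a compact hex string (0d = CR, 0a = LF).
mac_hex=$(od -An -tx1 "$tmpdir/mac.txt" | tr -d ' \n')
dos_hex=$(od -An -tx1 "$tmpdir/dos.txt" | tr -d ' \n')
printf 'mac.txt: %s\ndos.txt: %s\n' "$mac_hex" "$dos_hex"
```

The bare CRs in mac.txt should come out as LFs, while the CRLF pairs in
dos.txt should be left alone, since the lookahead only matches a CR that is
not followed by LF. Remove the scratch directory when you're done.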

Now, how to find them.  I'd still use perl.  It's really inefficient,
terrible, etc.  But you're not running this every day, and doing it
some other "more correct" way will take far longer to program than
this will take to run:

#!/usr/bin/env perl

use strict;
use warnings;

for my $fname (@ARGV) {
   # Read raw bytes so no I/O layer translates line endings.
   open my $fh, '<:raw', $fname or die "Cannot open ${fname}: $!";
   my ($hascrs, $barecrs, $hasnls) = (0, 0, 0);
   while (<$fh>) {
      if (/\r/) {
         $hascrs = 1;
         # A CR not followed by LF is a "bare" (old Mac style) CR.
         $barecrs = 1 if /\r(?!\n)/;
      }
      $hasnls = 1 if /\n/;
   }
   close $fh;
   # One tab-separated row per file: name, then a column for each flag.
   printf "%s\t%s\t%s\t%s\n", $fname, ($hascrs ? 'hascrs' : ''),
               ($barecrs ? 'barecrs' : ''), ($hasnls ? 'hasnls' : '');
}

Give this a list of files and it'll print out something like this:

test.csv                        hasnls
test.txt        hascrs  barecrs 

test.csv is a standard Unix file; test.txt is an old-style Mac file.  If a
file shows just "hascrs" and "hasnls" together, it's a DOS file with CRLF
(LF == NL) separators.  If it shows both barecrs and hasnls, the line
endings are mixed and you need to determine manually what to do.

Michael
-- 
Michael Darrin Chaney, Sr.
[email protected]
http://www.michaelchaney.com/
