On Fri, May 28, 2010 at 9:54 AM, Michael Chaney
<mdcha...@michaelchaney.com> wrote:

> On Fri, May 28, 2010 at 3:06 AM, Paul Boniol <paul.bon...@gmail.com>
> wrote:
> > Another programmer uses a Mac, and occasionally some program reverts to
> > using the old line terminator (carriage return only).  So we end up with
> > an occasional CR-only file mixed in with lots of other "normal" files.
> > The DOS and Linux files work fairly well without any issues, but the
> > CR-only files wreak havoc when I grep under Linux (returning the entire
> > file).
> > I know how to find text files under Linux (i.e. non-binary), but is there
> > an easy way to find old-style Mac text files only, so I can convert them?
>
> You're looking for CR that's followed by something other than NL.  I
> don't think you can use grep to find them, but a perl script would
> work.  You can also fix them pretty easily in perl:
>
> perl -n -e 's/\r(?!\n)/\n/g; print;'
>
> To fix them in place, do something like this:
>
> perl -p -i -e 's/\r(?!\n)/\n/g' files-to-fix
>
> Michael
> --
> Michael Darrin Chaney, Sr.
> mdcha...@michaelchaney.com
> http://www.michaelchaney.com/


Yes, this works for changing the line terminators; thank you for that part.
But the bulk of the work lies in identifying the files with CR-only line
terminators.

The file command will let me pick out what it identifies as ASCII text files
with CR line terminators, which is great.  However, it identifies most of
the files as HTML or exported SGML (those are actually HTML fragments to be
included in other files), and for those it doesn't report which line
terminators the files use.

Is there something that would find the files with CR-only line terminators
(HTML or not)?  I could use Perl to read through all of the files, but
I don't want to reinvent the wheel if there's one out there.  (All told,
there are over 10,000 files to look at, but since they are mostly text
files, they don't amount to much in today's terms.)

Paul

-- 
You received this message because you are subscribed to the Google Groups 
"NLUG" group.
To post to this group, send email to nlug-talk@googlegroups.com
To unsubscribe from this group, send email to 
nlug-talk+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/nlug-talk?hl=en
