Re: Detecting file's line endings

2006-03-01 Thread Adam Witney
On 1/3/06 1:55 am, Peter N Lewis [EMAIL PROTECTED] wrote:

 At 17:25 + 28/2/06, Adam Witney wrote:
 Does this work on all platforms? When I try it it works fine on OSX/Linux
 with MAC/DOS/UNIX line endings, but fails (reads the whole file) when
 reading DOS line endings on WinXP... Here is my script
 
 use Fcntl;
 
 my $file = $ARGV[0];
 
 open(INFILE, $file) || die "cannot open $file: $!\n";
 
 {
  local $/ = get_line_ending_for_file($file);
 
 Try reading the line ending before opening the file, ie:
 
 my $temp_line_ending = get_line_ending_for_file($file);
 open(INFILE, $file) || die "cannot open $file: $!\n";
 {
  local $/ = $temp_line_ending;
 
 It may be that WinXP is getting confused by opening the file, and
 then sysopen/closing the file in get_line_ending_for_file, and then
 expecting to be able to read from the file.  Not all platforms allow
 you to open the same file multiple times and have independent access
 to it - not that I know anything about WinXP, but old Classic Mac OS
 would quite probably have had problems with this.

Hi Peter,

Unfortunately this doesn't work either. The only way to get it to read the
DOS file properly on WinXP is not to set $/ at all, but of course this
breaks the other platforms.
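
A possible explanation, offered as an assumption rather than something confirmed in this thread: on Windows, Perl reads handles opened with open() in text mode by default, and that mode turns each \015\012 pair into a single \012 as the data is read. The sysread in get_line_ending_for_file sees the raw bytes and reports \015\012, but the later readline never sees that sequence, so it reads to the end of the file. A minimal sketch that puts the handle into binary mode before reading (the helper name first_line_length is made up for illustration):

```perl
use strict;
use warnings;
use Fcntl;

# Detect the first line ending in the first 33000 bytes of a file.
sub get_line_ending_for_file {
    my ($file) = @_;
    sysopen(my $fh, $file, O_RDONLY) or die "cannot open $file: $!\n";
    binmode($fh);
    my $buf = '';
    sysread($fh, $buf, 33000);
    close($fh);
    return $buf =~ /(\015\012|\015|\012)/ ? $1 : "\n";
}

# Hypothetical helper: length of the first line, without its terminator.
sub first_line_length {
    my ($file) = @_;
    my $ending = get_line_ending_for_file($file);
    open(my $in, '<', $file) or die "cannot open $file: $!\n";
    binmode($in);    # turn off CRLF translation on platforms that do it
    local $/ = $ending;
    my $line = <$in>;
    close($in);
    return 0 unless defined $line;
    chomp $line;     # chomp removes whatever $/ currently holds
    return length $line;
}
```

On platforms that do no translation, binmode is harmless, so the same code should behave the same way everywhere.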

Anyway, I have rewritten it to use sysread to read a chunk at the top of the
file (I only need the header of the file) and process that with a split and
foreach, which seems to be working fine!
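
The approach described above might look something like the following sketch. The 4096-byte chunk size and the name read_header_lines are assumptions for illustration, not details from the actual script.

```perl
use strict;
use warnings;
use Fcntl;

# Hypothetical helper: return the lines found in the first $size bytes
# of $file, whatever line-ending convention the file uses.
sub read_header_lines {
    my ($file, $size) = @_;
    $size = 4096 unless defined $size;    # assumed chunk size

    sysopen(my $fh, $file, O_RDONLY) or die "cannot open $file: $!\n";
    binmode($fh);                         # keep \015\012 pairs intact
    my $chunk = '';
    sysread($fh, $chunk, $size);
    close($fh);

    # Try \015\012 first so a DOS ending is not split into two breaks.
    return split /\015\012|\015|\012/, $chunk;
}

# Print the length of each header line of the file named on the command line.
if (@ARGV) {
    foreach my $line (read_header_lines($ARGV[0])) {
        print length($line), "\n";
    }
}
```

The last element returned may be a partial line if the chunk ends in the middle of one, so the chunk size needs to be comfortably larger than the header.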

Thanks for your help

Adam


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



Re: Detecting file's line endings

2006-02-28 Thread Adam Witney

At 15:15 + 22/12/05, James Harvard wrote:
 
   I'm trying to detect a file's line endings (\r\n for DOS, \r for Mac and \n
 for Unix as I'm sure y'all know).
 
   Is there any easy way to do this?
 
 use Fcntl;
 
 sub get_line_ending_for_file {
 my( $file ) = @_;
 
 my $fh;
 sysopen( $fh, $file, O_RDONLY );
 sysread( $fh, $_, 33000 );
 close( $fh );
 
 return /(\015\012|\015|\012)/ ? $1 : "\n";
 }
 

Does this work on all platforms? When I try it it works fine on OSX/Linux
with MAC/DOS/UNIX line endings, but fails (reads the whole file) when
reading DOS line endings on WinXP... Here is my script

use Fcntl;

my $file = $ARGV[0];

open(INFILE, $file) || die "cannot open $file: $!\n";

{
    local $/ = get_line_ending_for_file($file);

    while (<INFILE>) {
        my $line = $_;
        chomp $line;

        print "\n\n" . length($line) . "\n\n";
        last;
    }
}

sub get_line_ending_for_file {
    my ($file) = @_;

    my $fh;
    sysopen( $fh, $file, O_RDONLY );
    sysread( $fh, $_, 200 );
    close( $fh );

    return /(\015\012|\015|\012)/ ? $1 : "\n";
}


Thanks for any help

Adam





Detecting file's line endings

2005-12-22 Thread James Harvard
I'm trying to detect a file's line endings (\r\n for DOS, \r for Mac and \n for 
Unix as I'm sure y'all know).

Is there any easy way to do this?

I don't want to slurp the whole file, because it could be 14 MB or more, so I 
wanted to read in chunks until I got to a line break. However I can see a 
potential problem ending a chunk half way through a DOS \r\n, so then you just 
get \r which makes it look like a Mac formatted file.
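
One way around the split \r\n problem, sketched here under assumptions (the name sniff_line_ending and the 50-byte default chunk size are illustrative only): when the first line break found is a \r sitting at the very end of the data read so far, read another chunk before deciding.

```perl
use strict;
use warnings;

# Hypothetical helper: read $file in small chunks until the first line
# ending can be identified. Returns "\015\012", "\015", "\012", or undef
# if the file contains no line ending at all.
sub sniff_line_ending {
    my ($file, $chunk_size) = @_;
    $chunk_size = 50 unless defined $chunk_size;

    open(my $fh, '<', $file) or die "Couldn't open $file: $!\n";
    binmode($fh);
    my $taste = '';
    my $ending;
    while (1) {
        # Append the next chunk to what has been read so far.
        my $got = read($fh, $taste, $chunk_size, length $taste);
        die "read failed on $file: $!\n" unless defined $got;
        if ($taste =~ /(\015\012|\015|\012)/) {
            my ($found, $end) = ($1, $+[0]);
            # A lone \015 at the very end of the buffer may be the first
            # half of \015\012, so keep reading unless at end of file.
            unless ($found eq "\015" && $end == length($taste) && $got > 0) {
                $ending = $found;
                last;
            }
        }
        last if $got == 0;    # end of file, nothing more to read
    }
    close($fh);
    return $ending;
}
```

The caller can fall back to "\n" when the result is undef before assigning it to $/.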

Anyway, I started to roll my own code for it, and because I'm new to Perl I 
hoped that one of you kind souls would have a quick look (below) to check that 
I've got the right idea of how to do this sort of thing with Perl. (It seems to 
work with my tests, but that doesn't necessarily mean that it is a robust 
method!)

Also, I assume that one can pass a file handle to a sub-routine?
$/ = sniff_line_endings(INFILE) ;
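
For reference, a file handle can be passed to a subroutine. The sketch below shows two common ways, assuming a sniff_line_endings that takes the handle as its only argument (its body here is illustrative): a lexical handle is passed like any other scalar, while a bareword handle such as INFILE is passed as a glob reference, \*INFILE.

```perl
use strict;
use warnings;

# Illustrative body: look at the first 1000 bytes available on the
# handle, then rewind so the caller can read from the start.
sub sniff_line_endings {
    my ($fh) = @_;
    binmode($fh);
    my $taste = '';
    read($fh, $taste, 1000);
    seek($fh, 0, 0);
    return $taste =~ /(\015\012|\015|\012)/ ? $1 : "\n";
}

# Usage with a lexical handle:
#   open(my $in, '<', $filename) or die "Couldn't open $filename: $!\n";
#   local $/ = sniff_line_endings($in);
#
# Usage with a bareword handle:
#   open(INFILE, $filename) or die "Couldn't open $filename: $!\n";
#   local $/ = sniff_line_endings(\*INFILE);
```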

Many thanks,
James Harvard

open (INFILE,$filename) or die "Couldn't open" ;
$/ = \50 ; # a reference to a number makes <INFILE> read 50-byte records
my $taste = '' ;
my $lb = undef ;
until ($lb) {
    $taste .= <INFILE> ;
    if ($taste =~ /\r\n/) {
        $lb = "\r\n" ;
        # DOS line endings
    } elsif ($taste =~ /\r(?!$)/) {
        $lb = "\r" ;
        # Mac line endings
    } elsif ($taste =~ /\n/) {
        $lb = "\n" ;
        # Unix line endings
    }
}
$/ = $lb ;
seek INFILE, 0, 0 ; # reset the file read pointer
# do while(<INFILE>) stuff


Re: Detecting file's line endings

2005-12-22 Thread John Delacour

At 3:15 pm + 22/12/05, James Harvard wrote:

I'm trying to detect a file's line endings (\r\n for DOS, \r for Mac 
and \n for Unix as I'm sure y'all know).


Is there any easy way to do this?


At 10:45 am +0800 21/11/02, Peter N Lewis wrote:


At 13:22 + 20/11/02, John Delacour wrote:


 if (/\015\012/) {
   $/ = "\015\012" ;
 } elsif (/\015/) {
   $/ = "\015" ;
 } else {
   $/ = "\012" ;
 }


You can do this with one regular expression which will pick up the 
first line ending:


 $/ = /(\015\012|\015|\012)/ ? $1 : "\n";

Note that because Perl picks the leftmost match location, and at that 
location tries the alternatives of an or | set from left to right, it 
will find the first line ending in the text, and will match the 
\015\012 if it is there in preference to the \015 by itself.
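
A small sketch of that behaviour on in-memory strings (the helper name first_line_ending and the sample strings are made up for this example):

```perl
use strict;
use warnings;

# Hypothetical helper: return the first line ending in a string,
# or "\n" if there is none.
sub first_line_ending {
    my ($text) = @_;
    return $text =~ /(\015\012|\015|\012)/ ? $1 : "\n";
}

# Map each ending to a printable name for display.
my %name = ("\015\012" => 'DOS', "\015" => 'Mac', "\012" => 'Unix');

# In each sample the first ending decides the result, even when a
# different kind of ending appears later in the string.
for my $sample ("one\015\012two\012", "one\015two\015\012", "one\012two\015") {
    print $name{ first_line_ending($sample) }, "\n";
}
```

This prints DOS, Mac and Unix, one per line.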


Enjoy,
   Peter.


Re: Detecting file's line endings

2005-12-22 Thread John Delacour

At 3:15 pm + 22/12/05, James Harvard wrote:


Is there any easy way to do this?


PS.  The whole script, from which Peter quoted only the last bit in 
providing his genial one-liner, was as follows:




#!/usr/bin/perl
use Fcntl ;  # for O_RDONLY
$f = "$ENV{HOME}/Documents/Eudora Folder/Mail Folder/Manningham" ;
sysopen F, $f, O_RDONLY ;
sysread F, $_, 1000 ;
if (/\015\012/) {
  $/ = "\015\012" ;
} elsif (/\015/) {
  $/ = "\015" ;
} else {
  $/ = "\012" ;
}
open F, $f ;
for (<F>) {
  /^From: / and chomp and print "$_\n"
}


At 10:45 am +0800 21/11/02, Peter N Lewis wrote:

You can do this with one regular expression which will pick up the 
first line ending:


 $/ = /(\015\012|\015|\012)/ ? $1 : "\n";

   Peter.


Re: Detecting file's line endings

2005-12-22 Thread Doug McNutt

At 15:15 + 12/22/05, James Harvard wrote:
I'm trying to detect a file's line endings (\r\n for DOS, \r for Mac 
and \n for Unix as I'm sure y'all know).


ftp://ftp.macnauchtan.com/Software/LineEnds/FixEndsFolder.sit  52 kB
ftp://ftp.macnauchtan.com/Software/LineEnds/ReadMe_fixends.txt  4 kB

I have trouble with files that contain multiple types of line ends. 
The result was these drag and drop AppleScripts that might help. They 
do look at the whole file but the underlying code (included) is in C 
and pretty fast and not memory intensive. You can change or just test 
for line endings but they don't (yet) handle the two newer 16 bit 
unicode line ends.
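
For the mixed-endings case, a rough Perl sketch (not the AppleScript or C code referred to above; the name count_line_endings and the 64 kB default block size are assumptions) that tallies each kind of ending while reading the file in fixed-size blocks. It only looks at the three byte sequences \015\012, \015 and \012, not at the Unicode line ends.

```perl
use strict;
use warnings;

# Hypothetical helper: count DOS, Mac and Unix line endings in $file
# without holding the whole file in memory.
sub count_line_endings {
    my ($file, $block) = @_;
    $block = 65536 unless defined $block;    # assumed block size
    my %count = (dos => 0, mac => 0, unix => 0);

    open(my $fh, '<', $file) or die "cannot open $file: $!\n";
    binmode($fh);
    my $buf = '';
    while (1) {
        my $got = read($fh, $buf, $block, length $buf);
        die "read failed on $file: $!\n" unless defined $got;
        # Hold back a trailing \015 until the next block shows whether
        # it is followed by \012 (unless this is the end of the file).
        my $held = '';
        $held = "\015" if $got > 0 && $buf =~ s/\015\z//;
        while ($buf =~ /(\015\012|\015|\012)/g) {
            my $kind = $1 eq "\015\012" ? 'dos'
                     : $1 eq "\015"     ? 'mac'
                     :                    'unix';
            $count{$kind}++;
        }
        $buf = $held;
        last if $got == 0;    # end of file
    }
    close($fh);
    return %count;
}
```

A file is consistent when exactly one of the three counts is non-zero.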


--

Applescript syntax is like English spelling:
Roughly, but not thoroughly, thought through.


Re: Detecting file's line endings

2005-12-22 Thread Peter N Lewis

At 15:15 + 22/12/05, James Harvard wrote:
I'm trying to detect a file's line endings (\r\n for DOS, \r for Mac 
and \n for Unix as I'm sure y'all know).


Is there any easy way to do this?


use Fcntl;

sub get_line_ending_for_file {
  my( $file ) = @_;

  my $fh;
  sysopen( $fh, $file, O_RDONLY );
  sysread( $fh, $_, 33000 );
  close( $fh );

  return /(\015\012|\015|\012)/ ? $1 : "\n";
}

Adjust the 33000 number to whatever maximum line size you think might 
be appropriate.


Enjoy,
   Peter.






--
http://www.stairways.com/  http://download.stairways.com/