Re: Mac Perl bug?

2006-07-19 Thread Paul McCann

Hi again,

Ugh: please ignore  my previous example, which split on the empty  
string by mistake (it's even evident in the script you quoted).  
Thwack...


Let me try to pull this together: splitting on a string containing
a single space (' ') is special in a do-what-you-probably-mean way. It's
not the same as splitting on /\s+/, in that the former skips leading
white space (spaces, tabs or newlines) before it starts splitting, so
you never get an empty leading field. Trailing empty fields are
dropped either way.


So this script

#!/usr/bin/perl
use warnings;
use strict;
my $string = "\t1 2\t   3\t\t4\t";
print join("**", split(" ", $string)), "the end";
print "\n";
print join("**", split(/\s+/, $string)), "the end";


produces the following output:
1**2**3**4the end
**1**2**3**4the end

In general: when using any regexp as the first argument the split  
function acts very much by the book.


Cheers,
Paul



Re: Mac Perl bug?

2006-07-19 Thread David Cantrell
On Wed, Jul 19, 2006 at 12:29:04AM +0200, ende wrote:

> Why?
>
> $a = "1  2 3";
>   1  2 3
> split / /, $a;
>   ["1", "", "2", "3"]
> split " ", $a;
>   ["1", "2", "3"]

Splitting on / / is different from splitting on ' ' because ' ' is
magickal.  While this is mentioned in the docs for split(), it could
perhaps be written somewhat better.
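
For what it's worth, here is a minimal sketch of the difference, using the string from the question. The variable names are only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Two spaces between 1 and 2, one space between 2 and 3.
my $str = "1  2 3";

# A pattern of one literal space splits at every single space,
# so the doubled space yields an empty field.
my @by_pattern = split / /, $str;

# The string ' ' is the special case: it skips leading whitespace
# and splits on runs of whitespace, much as awk does.
my @by_string = split ' ', $str;

print scalar(@by_pattern), " fields: ", join("|", @by_pattern), "\n";
print scalar(@by_string),  " fields: ", join("|", @by_string),  "\n";
```

The first line printed should show four fields (one of them empty) and the second three.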

-- 
David Cantrell | Enforcer, South London Linguistic Massive

Are you feeling bored? depressed? slowed down?  Evil Scientists may
be manipulating the speed of light in your vicinity.  Buy our patented
instructional video to find out how, and maybe YOU can stop THEM


Regex and Mac vs UNIX line endings

2006-07-19 Thread Andrew Brosnan
I'm processing a string with embedded newlines. For testing I was
storing the text in __DATA__ and slurping it into a string. This works
fine. However when I read in a file, I'm having trouble with the line
endings. Matching beginning/end of logical lines is not working as I
expect. Regexes like the one below match when using the DATA filehandle,
but don't when opening other text files on my Mac.

$text =~ s/^Text to match.*$//m;

Is this due to UNIX '\n' vs. Mac '\r' line endings? I assumed the 'm'
modifier would recognize any line ending.

Oh what to do?

Andrew



Re: Regex and Mac vs UNIX line endings

2006-07-19 Thread Robert Hicks

Andrew Brosnan wrote:

> I'm processing a string with embedded newlines. For testing I was
> storing the text in __DATA__ and slurping it into a string. This works
> fine. However when I read in a file, I'm having trouble with the line
> endings. Matching beginning/end of logical lines is not working as I
> expect. Regexes like the one below match when using the DATA filehandle,
> but don't when opening other text files on my Mac.
>
> $text =~ s/^Text to match.*$//m;
>
> Is this due to UNIX '\n' vs. Mac '\r' line endings? I assumed the 'm'
> modifier would recognize any line ending.
>
> Oh what to do?
>
> Andrew

What version of the Mac? Anything in the OS X family is Unix and uses the
standard \n line ending. If you brought the files over from elsewhere,
then yes, you are going to have the '\r' line ending.
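
If the files do turn out to use '\r', one option is to normalise the line endings before matching, because ^ and $ under /m only treat \n as a line boundary. A minimal sketch, where the sample text and the $normalized variable are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data with classic Mac (\r) line endings.
my $text = "first line\rText to match and more\rlast line\r";

# Convert \r\n and bare \r to \n so that ^ and $ under /m
# have line boundaries to work with.
(my $normalized = $text) =~ s/\r\n?/\n/g;

# The substitution from the original question now empties
# just the one matching line.
$normalized =~ s/^Text to match.*$//m;

print $normalized;
```

Run against the untouched $text, the same substitution finds nothing, since without any \n the only place ^ can match is the very start of the string.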


:Robert


Re: Regex and Mac vs UNIX line endings

2006-07-19 Thread Andrew Brosnan
On 7/19/06 at 9:51 PM, Robert Hicks wrote:

> Andrew Brosnan wrote:
> > I'm processing a string with embedded newlines. For testing I was
> > storing the text in __DATA__ and slurping it into a string. This
> > works fine. However when I read in a file, I'm having trouble with
> > the line endings. Matching beginning/end of logical lines is not
> > working as I expect. Regexes like the one below match when using
> > the DATA filehandle, but don't when opening other text files on my
> > Mac.
> >
> > $text =~ s/^Text to match.*$//m;
> >
> > Is this due to UNIX '\n' vs. Mac '\r' line endings? I assumed the
> > 'm' modifier would recognize any line ending.
> >
> > Oh what to do?
> >
> > Andrew
>
> What version of the Mac?

10.3.9

> Anything in the OSX family is Unix and uses the
> standard \n line ending

I don't think that is the case. These are text files created on 10.3.9
and they use \r for line endings. The problem is that /^.*$/ won't match
lines ending with \r even with the m modifier.

Andrew




Re: Regex and Mac vs UNIX line endings

2006-07-19 Thread Doug McNutt
If you want to adjust the line ends in the files have a look at:

ftp://ftp.macnauchtan.com/Software/LineEnds/FixEndsFolder.sit  52 kB
ftp://ftp.macnauchtan.com/Software/LineEnds/ReadMe_fixends.txt  4 kB

Yeah. It's pretty easy in perl too.

I have, on occasion, read the first few hundred characters of a file and then
searched for \n, \r and \r\n. From that I make a guess and reopen the file
for line-by-line reading after setting $/ to what I found.
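
Roughly, that guess-and-reopen approach could look like the sketch below. The name guess_line_ending and the 512-byte sample size are assumptions made for the example, not anything standard.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Guess a file's line ending from its first few hundred bytes.
# guess_line_ending() is a made-up helper name, not a library function.
sub guess_line_ending {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);                        # no newline translation
    my $got = read($fh, my $chunk, 512);
    close($fh);
    die "Cannot read $path: $!" unless defined $got;

    return "\r\n" if $chunk =~ /\r\n/;   # DOS/Windows
    return "\r"   if $chunk =~ /\r/;     # classic Mac
    return "\n";                         # Unix, and the fallback
}

# Usage: pass a file name on the command line.
my $path = shift @ARGV;
if (defined $path) {
    local $/ = guess_line_ending($path); # input record separator
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh);
    while (my $line = <$fh>) {
        chomp $line;                     # chomp strips whatever $/ holds
        print "[$line]\n";
    }
    close($fh);
}
```

One weak spot: if the sample happens to end exactly between the \r and the \n of a DOS line ending, the guess comes out as \r, so a more careful version would look at a little more of the file in that case.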

If you slurp in the whole string you can play with

@option1 = split /\n/, $thedata;
@option2 = split /\r/, $thedata;

Which option has the most elements?

split /(\r|\n)/, $thedata; # is an idea I just had. I wonder? 
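
A quick sketch of both ideas, with made-up sample data. One thing to watch: capturing parentheses in the pattern make split return the separators along with the fields, so the combined pattern here has none, and it lists \r\n first so that a DOS line ending counts as one separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical slurped data with classic Mac line endings.
my $thedata = "one\rtwo\rthree\r";

# Count the fields for each candidate separator;
# the larger count is the better guess.
my @by_lf = split /\n/, $thedata;
my @by_cr = split /\r/, $thedata;
my $guess = @by_cr > @by_lf ? '\r' : '\n';
print "Looks like $guess line endings\n";

# One pattern covering all three conventions, without captures.
my @lines = split /\r\n|\r|\n/, $thedata;
print scalar(@lines), " lines: @lines\n";
```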
-- 

-- Science is the business of discovering and codifying the rules and methods 
employed by the Intelligent Designer. Religions provide myths to mollify the 
anxiety experienced by those who choose not to participate. --