Re: YARQ: Yet Another Regex Question

2007-05-16 Thread Mathew


Chas Owens wrote:
> On 5/16/07, Mathew <[EMAIL PROTECTED]> wrote:
> snip
>> What does gr() do?
>>
>> Mathew
>>
> 
> qr not gr.  It is the quote regex operator.
> 
> from perldoc perlop
>   qr/STRING/imosx
>   This operator quotes (and possibly compiles) its STRING as a
>   regular expression.  STRING is interpolated the same way as
>   PATTERN in "m/PATTERN/".  If "'" is used as the delimiter, no
>   interpolation is done.  Returns a Perl value which may be used
>   instead of the corresponding "/STRING/imosx" expression.
> 

Ahh, yes, that would certainly work better.  One of my concerns was using
a bare string containing a space.  While it may not break, it could cause
problems down the road, which is something I'd like to avoid.

Thanks
Mathew

-- 
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
http://learn.perl.org/




Re: YARQ: Yet Another Regex Question

2007-05-16 Thread Jeff Pang

Mathew wrote:

> What does gr() do?

It's "qr" not "gr".
See "perldoc perlop" and look for "qr/STRING/imosx".


--
http://home.arcor.de/jeffpang/





Re: YARQ: Yet Another Regex Question

2007-05-16 Thread Chas Owens

On 5/16/07, Mathew <[EMAIL PROTECTED]> wrote:
snip
> What does gr() do?
>
> Mathew

qr not gr.  It is the quote regex operator.

from perldoc perlop
  qr/STRING/imosx
  This operator quotes (and possibly compiles) its STRING as a
  regular expression.  STRING is interpolated the same way as
  PATTERN in "m/PATTERN/".  If "'" is used as the delimiter, no
  interpolation is done.  Returns a Perl value which may be used
  instead of the corresponding "/STRING/imosx" expression.
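
A minimal sketch of what that means in practice, assuming the pattern from this thread. The helper names has_customer_line and line_is_exactly_customer are made up for illustration, and \w+ is used here so that a whole word is captured:

```perl
use strict;
use warnings;

# qr// builds a compiled pattern object; \s and \w keep their regex meaning.
my $regex = qr/CUSTOMER ENVIRONMENT\s+(\w+)/;

# Hypothetical helper: true if $text matches the precompiled pattern.
sub has_customer_line {
    my ($text) = @_;
    return $text =~ $regex ? 1 : 0;
}

# Hypothetical helper: the qr// object can also be embedded in a larger
# pattern, here anchored so that the whole string must be the customer line.
sub line_is_exactly_customer {
    my ($text) = @_;
    return $text =~ /^$regex\s*$/ ? 1 : 0;
}

print has_customer_line("CUSTOMER ENVIRONMENT acme"), "\n";
print line_is_exactly_customer("see CUSTOMER ENVIRONMENT acme"), "\n";
```

The same text in a double-quoted string would lose its backslashes before the regex engine ever saw them, which is the main reason to prefer qr// here.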





Re: YARQ: Yet Another Regex Question

2007-05-16 Thread Mathew


Chas Owens wrote:
> On 5/16/07, Mathew Snyder <[EMAIL PROTECTED]> wrote:
>> I have a trouble ticket application that uses a regex to find a piece of
>> information in an incoming email and auto populate a field if it is found.  The
>> line it will be looking for is
>> CUSTOMER ENVIRONMENT customer_name
>> where customer_name will never have a space making it one word.  If I just want
>> to pull from the line the customer_name would my regex look like
>> $MatchString = "CUSTOMER ENVIRONMENT\s+(\w)"
>
> Bad idea.  Use qr() instead.
>
>> For what it's worth the line that will handle this is
>> $found = ($Transaction->Attachments->First->Content =~ /$MatchString/m);
>> I'm guessing that when used in an assignment like this, $1 will be used as the
>> value.  The contents of (\w) in this case.  Is that correct?
> snip
> 
> Yes, the $1 match variable will hold the match if $found is true.  A
> common idiom is therefore
>
> my $name;
> # \w+ so that the whole name is captured, not just its first character
> my $regex = qr/CUSTOMER ENVIRONMENT\s+(\w+)/;
> if ($Transaction->Attachments->First->Content =~ /$regex/) {
>     $name = $1;
> } else {
>     die "could not find name";
> }
>
> Another way to write this is
>
> my $regex = qr/CUSTOMER ENVIRONMENT\s+(\w+)/;
> my ($name) = $Transaction->Attachments->First->Content =~ /$regex/
>     or die "could not find name";
> 

What does gr() do?

Mathew





Re: YARQ: Yet Another Regex Question

2007-05-16 Thread Chas Owens

On 5/16/07, Mathew Snyder <[EMAIL PROTECTED]> wrote:
> I have a trouble ticket application that uses a regex to find a piece of
> information in an incoming email and auto populate a field if it is found.  The
> line it will be looking for is
> CUSTOMER ENVIRONMENT customer_name
> where customer_name will never have a space making it one word.  If I just want
> to pull from the line the customer_name would my regex look like
> $MatchString = "CUSTOMER ENVIRONMENT\s+(\w)"

Bad idea.  In a double-quoted string, \s and \w are not regex escapes, so
the backslashes are lost before the regex engine sees them.  Use qr() instead.

> For what it's worth the line that will handle this is
> $found = ($Transaction->Attachments->First->Content =~ /$MatchString/m);
> I'm guessing that when used in an assignment like this, $1 will be used as the
> value.  The contents of (\w) in this case.  Is that correct?

snip

Yes, the $1 match variable will hold the match if $found is true.  A
common idiom is therefore

my $name;
# \w+ so that the whole name is captured, not just its first character
my $regex = qr/CUSTOMER ENVIRONMENT\s+(\w+)/;
if ($Transaction->Attachments->First->Content =~ /$regex/) {
    $name = $1;
} else {
    die "could not find name";
}

Another way to write this is

my $regex = qr/CUSTOMER ENVIRONMENT\s+(\w+)/;
my ($name) = $Transaction->Attachments->First->Content =~ /$regex/
    or die "could not find name";
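
For anyone who wants to try the list-context idiom without the ticket system at hand, here is a self-contained sketch. A plain string stands in for the attachment content, and the helper name customer_name and the sample body are invented for illustration. A lone \w matches exactly one character, so \w+ is what captures a whole one-word name:

```perl
use strict;
use warnings;

# Hypothetical helper: pull the customer name out of a message body.
# In list context a successful match returns its captured groups, so the
# name is assigned to $name directly; on failure the list is empty.
sub customer_name {
    my ($body) = @_;
    # /m lets ^ match at the start of any line in a multi-line body.
    my ($name) = $body =~ /^CUSTOMER ENVIRONMENT\s+(\w+)/m;
    return $name;    # undef when the line is absent
}

# Invented sample body standing in for the attachment content.
my $body = "Hello,\nCUSTOMER ENVIRONMENT acme_corp\nThanks\n";

my $name = customer_name($body);
print defined $name ? "found $name\n" : "could not find name\n";
```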





YARQ: Yet Another Regex Question

2007-05-16 Thread Mathew Snyder
I have a trouble ticket application that uses a regex to find a piece of
information in an incoming email and auto populate a field if it is found.  The
line it will be looking for is
CUSTOMER ENVIRONMENT customer_name
where customer_name will never have a space making it one word.  If I just want
to pull from the line the customer_name would my regex look like
$MatchString = "CUSTOMER ENVIRONMENT\s+(\w)"

For what it's worth the line that will handle this is
$found = ($Transaction->Attachments->First->Content =~ /$MatchString/m);
I'm guessing that when used in an assignment like this, $1 will be used as the
value.  The contents of (\w) in this case.  Is that correct?

Mathew
-- 
Keep up with me and what I'm up to: http://theillien.blogspot.com





another regex question

2002-04-22 Thread James Taylor

Hmm, it looks like something was wrong with my mail server, so I'm 
sending this question again - If you already got this, I apologize:


I have a program that reads over mail logs looking for spammers, 
and depending on certain conditions, senders are marked as spammers.  If 
the reverse lookup on the relay used matches their email address, 
however, we never mark them as a spammer, no matter what. However, 
I've run across a problem where a user's email address can look something 
like

[EMAIL PROTECTED]

and the relay used will be reversed to:

relay.inthemiddleofnowhere.com

How do I match JUST the last part of the address, so that 
inthemiddleofnowhere.com is the only thing picked up, no matter how many 
subdomains are listed, and it also matches if the email address is just 
[EMAIL PROTECTED] ? I can't seem to figure this one out.


Also, there is one spammer in particular whose email addresses look 
something like:

OWNER-NOLIST-3249_Columbia_music*TJUDKINS**EXO*[EMAIL PROTECTED]

When I get one of the emails from this spammer, I get this error with 
the script:

Nested quantifiers before HERE mark in regex 
m/OWNER-NOLIST-3249_Columbia_music*TJUDKINS** << HERE
EXO*[EMAIL PROTECTED]/ at /sbin/removespam.pl line 10.

I'm assuming the asterisk is breaking the regex already in place 
somehow. Anyone know what might be causing this one, and how to fix it?
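
For what it's worth, "Nested quantifiers" is what Perl reports when a string containing "**" is interpolated into a pattern as-is, because each "*" is read as a quantifier. Below is a rough sketch of one way to handle both problems, not a definitive fix. The helper names base_domain, same_domain and contains_literal and the sample addresses are invented for illustration, and the "last two labels" rule is an assumption that will not suit domains such as example.co.uk:

```perl
use strict;
use warnings;

# Hypothetical helper: return the last two dot-separated labels of a host
# name or address, e.g. "relay.example.com" -> "example.com".
# Assumption: two labels are enough (wrong for suffixes like ".co.uk").
sub base_domain {
    my ($host) = @_;
    return $host =~ /([^.\s\@]+\.[^.\s\@]+)$/ ? lc($1) : undef;
}

# Hypothetical helper: true if the address and the relay share a base domain,
# however many subdomains either of them carries.
sub same_domain {
    my ($address, $relay) = @_;
    my $addr_dom  = base_domain($address);
    my $relay_dom = base_domain($relay);
    return defined($addr_dom) && defined($relay_dom) && $addr_dom eq $relay_dom;
}

# Hypothetical helper: \Q...\E (quotemeta) makes every character of $needle
# literal, so "*" no longer acts as a quantifier inside the pattern.
sub contains_literal {
    my ($haystack, $needle) = @_;
    return $haystack =~ /\Q$needle\E/ ? 1 : 0;
}

print same_domain('someone@mail.inthemiddleofnowhere.com',
                  'relay.inthemiddleofnowhere.com') ? "same\n" : "different\n";
print contains_literal('from=<music*X**Y*list@example.com>',
                       'music*X**Y*list@example.com') ? "found\n" : "missing\n";
```

Comparing the two extracted strings with eq also avoids building a regex from the address at all, which is the safest option when the input comes from spammers.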










Re: Another Regex Question

2001-06-22 Thread M.W. Koskamp


- Original Message - 
From: Jack Lauman <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, June 21, 2001 11:48 PM
Subject: Another Regex Question


> I'm trying to create a CSV file from the text data below.  Lines
> containing High and Low Tide data have 9 fields, lines having
> sunrise/sunset and lunar data have 8 fields.
> 
> How you differentiate between the two conditions?
> 
> 2000-12-03 11:30 AM PST   9.39 feet  High Tide
> 2000-12-03  4:15 PM PST   Sunset
> 2000-12-03  7:56 PM PST   First Quarter
> 2000-12-04  3:42 AM PST   2.81 feet  Low Tide
> 2000-12-04  7:48 AM PST   Sunrise
> 
> <->
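I don't think you need a regexp for that.
Try this:

while (<INFILE>) {
    my @fields = split;
    # splitting the sample lines on whitespace gives 8 fields for the
    # tide lines and 5 or 6 for the sunrise/sunset and lunar lines
    if (@fields == 8) {
        # do your tide stuff
    } elsif (@fields == 5 or @fields == 6) {
        # do your lunar stuff
    } else {
        # what is wrong?
    }
}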
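
A small runnable sketch of the field-counting idea, using three of the sample lines from the question. The helper name classify_line is made up, and the counts (8, and 5 or 6) are taken from those sample lines only, not from any format specification:

```perl
use strict;
use warnings;

# Hypothetical helper: classify a line by its number of
# whitespace-separated fields.
sub classify_line {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return 'tide'  if @fields == 8;
    return 'other' if @fields == 5 || @fields == 6;
    return 'unknown';
}

for my $line (
    '2000-12-03 11:30 AM PST   9.39 feet  High Tide',
    '2000-12-03  4:15 PM PST   Sunset',
    '2000-12-03  7:56 PM PST   First Quarter',
) {
    printf "%-7s %s\n", classify_line($line), $line;
}
```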

maarten




Re: Another Regex Question

2001-06-22 Thread Jos Boumans

assuming i'm understanding correctly what you want, you might want to try this:

use strict;
use vars qw(@store);

open I, "input.txt" or die "oops: $!\n";

# each line yields (date, time, event or height, optional tide)
while (<I>) {
    my @foo = /(.+?)\s+(.+?)\s{2,}(.+?)(?:\s{2,}(.+))?$/;
    push @store, \@foo;
}

@store will now hold one array reference per input line, each holding that
line's matches.
you can access the referenced arrays as follows; the format of each array is
shown in the comments:

for (@store) {
    my ($date, $time, $what, $tide) = @$_;
    # $date is the date
    # $time is the time
    # $what is either 'x quarter | sunrise, etc' or 'x feet'
    # $tide will hold the tide indication if there is one
}

you can print them out as a csv file as follows:
for(@store){ print ( (join ',', @$_) . "\n")}

the regex is a bit advanced, but you can check out a tutorial at
www.sharemation.com/~perl/tut to get some insight into how it works

hth,
Jos Boumans

Ken wrote:

> One thing to do would be to test for Tide in the line(Assuming all tide data
> ends with the word "Tide") right away...then do special stuff for each case
> in an if:
>
> if( /Tide$/ ) # If last word in line is Tide
> {
> }
> else # Must be lunar
> {
> }
>
> And just a note, if you're just going to put the date and time fields back
> together, don't separate them in your pattern match.
>
> ($date, $time, ... ) = /^(\d+-\d+-\d+)\s+(\d+:\d+)...$/
>
> - Original Message -
> From: "Jack Lauman" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, June 21, 2001 3:48 PM
> Subject: Another Regex Question
>
> > I'm trying to create a CSV file from the text data below.  Lines
> > containing High and Low Tide data have 9 fields, lines having
> > sunrise/sunset and lunar data have 8 fields.
> >
> > How you differentiate between the two conditions?
> >
> > 2000-12-03 11:30 AM PST   9.39 feet  High Tide
> > 2000-12-03  4:15 PM PST   Sunset
> > 2000-12-03  7:56 PM PST   First Quarter
> > 2000-12-04  3:42 AM PST   2.81 feet  Low Tide
> > 2000-12-04  7:48 AM PST   Sunrise
> >
> > <->
> >
> > while (<INFILE>) {
> >
> > ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $height, $cond)
> > =
> > /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+
> > ([0-9A-Za-z-.\s]{11})\s+(\w+\s+\w+)/;
> >
> > $year and $started++;
> >
> > if ($cond) {
> > ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $cond) =
> > /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+
> > ([A-Za-z\s])/;
> >
> > $date = "$year-$month-$mday";
> > $time = "$hour:$minute";
> > # Strip the leading and trailing spaces from $height
> > StripLTSpace($height);
> >
> > printf OUTFILE "%s\,%s\,%s\,%s\,%s\,%s\n",
> > $date, $time, $am_pm, $tz, $height, $cond;
> > }
> >
> > }
> >
> > $started or print STDERR "Didn't find a tides line";
> >
> > close(INFILE);
> > close(OUTFILE);
> > print STDERR "\n";
> >
> > 1;
> >




Re: Another Regex Question

2001-06-21 Thread Ken

One thing to do would be to test for Tide in the line(Assuming all tide data
ends with the word "Tide") right away...then do special stuff for each case
in an if:

if( /Tide$/ ) # If last word in line is Tide
{
}
else # Must be lunar
{
}

And just a note, if you're just going to put the date and time fields back
together, don't separate them in your pattern match.

($date, $time, ... ) = /^(\d+-\d+-\d+)\s+(\d+:\d+)...$/
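
As a rough sketch of how the "test for Tide first" approach could be turned into CSV rows: the helper name to_csv, the column order and the empty height column for non-tide lines are all assumptions made for illustration, and the patterns are fitted to the sample lines only.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of the sample data into a CSV row.
# Tide lines end in "Tide" and carry a height; every other line gets an
# empty height column. Returns undef for lines that do not match.
sub to_csv {
    my ($line) = @_;
    if ($line =~ /Tide\s*$/) {
        my ($date, $time, $ampm, $tz, $height, $cond) =
            $line =~ /^(\d+-\d+-\d+)\s+(\d+:\d+)\s+([AP]M)\s+([A-Z]{3})\s+([\d.-]+)\s+feet\s+(\w+\s+Tide)\s*$/
            or return;
        return join ',', $date, $time, $ampm, $tz, $height, $cond;
    }
    my ($date, $time, $ampm, $tz, $cond) =
        $line =~ /^(\d+-\d+-\d+)\s+(\d+:\d+)\s+([AP]M)\s+([A-Z]{3})\s+(\S.*?)\s*$/
        or return;
    return join ',', $date, $time, $ampm, $tz, '', $cond;
}

print to_csv('2000-12-04  3:42 AM PST   2.81 feet  Low Tide'), "\n";
print to_csv('2000-12-04  7:48 AM PST   Sunrise'), "\n";
```

Keeping the date and time as single captures follows the note above; they are never split apart, so nothing has to be glued back together.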

- Original Message -
From: "Jack Lauman" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, June 21, 2001 3:48 PM
Subject: Another Regex Question


> I'm trying to create a CSV file from the text data below.  Lines
> containing High and Low Tide data have 9 fields, lines having
> sunrise/sunset and lunar data have 8 fields.
>
> How you differentiate between the two conditions?
>
> 2000-12-03 11:30 AM PST   9.39 feet  High Tide
> 2000-12-03  4:15 PM PST   Sunset
> 2000-12-03  7:56 PM PST   First Quarter
> 2000-12-04  3:42 AM PST   2.81 feet  Low Tide
> 2000-12-04  7:48 AM PST   Sunrise
>
> <->
>
> while (<INFILE>) {
>
> ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $height, $cond)
> =
> /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+
> ([0-9A-Za-z-.\s]{11})\s+(\w+\s+\w+)/;
>
> $year and $started++;
>
> if ($cond) {
> ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $cond) =
> /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+
> ([A-Za-z\s])/;
>
> $date = "$year-$month-$mday";
> $time = "$hour:$minute";
> # Strip the leading and trailing spaces from $height
> StripLTSpace($height);
>
> printf OUTFILE "%s\,%s\,%s\,%s\,%s\,%s\n",
> $date, $time, $am_pm, $tz, $height, $cond;
> }
>
> }
>
> $started or print STDERR "Didn't find a tides line";
>
> close(INFILE);
> close(OUTFILE);
> print STDERR "\n";
>
> 1;
>




Another Regex Question

2001-06-21 Thread Jack Lauman

I'm trying to create a CSV file from the text data below.  Lines
containing High and Low Tide data have 9 fields, lines having
sunrise/sunset and lunar data have 8 fields.

How do you differentiate between the two conditions?

2000-12-03 11:30 AM PST   9.39 feet  High Tide
2000-12-03  4:15 PM PST   Sunset
2000-12-03  7:56 PM PST   First Quarter
2000-12-04  3:42 AM PST   2.81 feet  Low Tide
2000-12-04  7:48 AM PST   Sunrise

<->

while (<INFILE>) {

    ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $height, $cond) =
        /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+([0-9A-Za-z-.\s]{11})\s+(\w+\s+\w+)/;

    $year and $started++;

    if ($cond) {
        ($year, $month, $mday, $hour, $minute, $am_pm, $tz, $cond) =
            /^(\d+)-(\d+)-(\d+)\s+(\d+):(\d+)\s+([A-Z]{2})\s+([A-Z]{3})\s+([A-Za-z\s])/;

        $date = "$year-$month-$mday";
        $time = "$hour:$minute";
        # Strip the leading and trailing spaces from $height
        StripLTSpace($height);

        printf OUTFILE "%s\,%s\,%s\,%s\,%s\,%s\n",
            $date, $time, $am_pm, $tz, $height, $cond;
    }

}

$started or print STDERR "Didn't find a tides line";

close(INFILE);
close(OUTFILE);
print STDERR "\n";

1;