------------------------------------------------
On Wed, 23 Jul 2003 02:41:28 +0800, "LI NGOK LAM" <[EMAIL PROTECTED]> wrote:

> ----- Original Message ----- 
> From: <[EMAIL PROTECTED]>
> To: "James Kelty" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> Sent: Wednesday, July 23, 2003 1:21 AM
> Subject: RE: Reg Exp Help...
> 
> 
> >
> > ------------------------------------------------
> > On 22 Jul 2003 09:15:29 -0700, James Kelty <[EMAIL PROTECTED]> wrote:
> >
> > > I know that this is a common request, but I have a question about
> > > parsing an email box. I have a UW IMAP box, and I am trying to extract
> > > all the emails where the line starts with From: blah blah. Now, getting
> > > those lines isn't the issue, but since each email is a little different,
> > > I am having a problem. Given this list, how would I extract JUST the
> > > email address?
> > >
> > > From: "James Kelty" <[EMAIL PROTECTED]>
> > > From: [EMAIL PROTECTED]
> > > From: <[EMAIL PROTECTED]"
> > >
> 
> < snipped >
> 
> > >
> >
> > This is a relatively complex task since e-mail addresses can come in so
> > many different forms and contain so many different types of values. Your
> > best bet may be to either use a module for parsing the whole message which
> > is always advised, or look at the source for one of the better message
> > header parsing modules to determine how they are doing it.  Sorry this is
> > such a non-specific answer, but rather than suggesting a way to poorly
> > re-invent the wheel, I prefer suggesting that this should be avoided....
> >
> 
> Hmm.... The OP does not seem to be trying to do something like Email::Valid,
> but only to fetch the mail address from a line, i.e., to cut out
> whatever is not expected to be left...  I hope I got that right.
> 
> sub filter
> {    my $line = shift; chomp ($line);
>         chop ($line) if ($line =~ /[^\w]$/; # mail must end with tld or country code
>         my ($waste, $mailAd) = split / /, $line ; # So 'From: ' is kicked out
>         $mailAd =~ s/^[^\w]//; # so '<' or '"' will be kicked out from head too...
>         # Perhaps the regex above can be [^\w|\\] if [EMAIL PROTECTED] is valid
>         # I am not sure....
>         return $mailAd
> }
> 
> Code not tested, but HTH
> 

Thank you for illustrating my point. I understood that the OP was not trying to verify 
the validity of an address but to retrieve it. Your code snippet, however, fails on the 
OP's first line of data, which was my point: parsing e-mail addresses out of a line of 
data is a very difficult task that is easily botched.  (There is also a missing right 
paren, for those who get caught by the syntax check.)
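
To make the failure concrete, here is a rough sketch of the quoted filter with only the 
missing paren added and the logic otherwise left alone. The address is a made-up 
stand-in (the real ones are redacted in this thread), so treat it as an assumption.

```perl
use strict;
use warnings;

# The quoted filter with only the missing parenthesis added; logic unchanged.
sub filter {
    my $line = shift;
    chomp $line;
    chop $line if $line =~ /[^\w]$/;           # drops the trailing '>'
    my ($waste, $mailAd) = split / /, $line;   # keeps only the 2nd space-separated field
    $mailAd =~ s/^[^\w]//;                     # strips one leading non-word character
    return $mailAd;
}

# Hypothetical address; the real one is redacted in the original post.
print filter('From: "James Kelty" <jkelty@example.com>'), "\n";   # prints "James"
```

The split on a single space cuts the quoted display name in half, so the second field is 
'"James' rather than the address. The bare "From: address" form happens to survive.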

A true e-mail address is an incredibly complex and nasty little beast, so matching 
one, while it may seem simple at first, quickly becomes a nightmare.  

Possibly the OP would be satisfied with just stripping off the 'From:' if that is 
guaranteed...

my $address;
if ($line =~ /^From:\s*(.*)/) {
    # Everything after 'From:', display name and brackets included
    $address = $1;
}
else {
    die "Not a 'From' line";
}

I still hold that this is best handled by a module that is designed to parse a mail 
message, or at the very least a module designed to parse either a message header, or a 
single header line that contains e-mail addresses.
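
As one possible illustration of the module route, a minimal sketch using Mail::Address 
follows. That module ships with the MailTools distribution on CPAN, not with core Perl, 
so it must be installed first. The sub name addresses_from_line and the sample addresses 
are my own assumptions, not anything from the OP's mailbox.

```perl
use strict;
use warnings;
use Mail::Address;   # from the MailTools distribution on CPAN, not core Perl

# Return the bare addresses found in one "From:" header line.
sub addresses_from_line {
    my ($line) = @_;
    chomp $line;
    # Strip the header name first; parse() expects only the header value.
    my ($value) = $line =~ /^From:\s*(.*)/
        or die "Not a 'From' line";
    return map { $_->address } Mail::Address->parse($value);
}

# Hypothetical sample data; the addresses in the original post are redacted.
for my $line (
    'From: "James Kelty" <jkelty@example.com>',
    'From: jkelty@example.com',
) {
    print join(', ', addresses_from_line($line)), "\n";
}
```

How a malformed line such as the OP's third example comes out is up to the parser, so 
check the result before trusting it.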

http://danconia.org

