Dave,

{0,1} = ?
{0,} = *
{1,} = +
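
A quick way to convince yourself of these equivalences is to run both spellings against the same strings. The sketch below uses Python's re module as a stand-in for PCRE (an assumption; the two agree on these quantifiers), and the sample strings are made up for illustration.

```python
import re

# Each pair: (counted form, shorthand form)
PAIRS = [
    (r"206.{0,1}888", r"206.?888"),
    (r"206-{0,}888", r"206-*888"),
    (r"206-{1,}888", r"206-+888"),
]

# Hypothetical sample strings, not taken from real mail
SAMPLES = ["206888", "206-888", "206--888", "206 888"]

def same_results(counted, shorthand, samples):
    """Return True if both patterns agree on every sample string."""
    return all(
        bool(re.fullmatch(counted, s)) == bool(re.fullmatch(shorthand, s))
        for s in samples
    )

results = [same_results(c, s, SAMPLES) for c, s in PAIRS]
```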

Also note that beginning a sub-match with "(?:" rather than a bare parenthesis improves PCRE's performance, because it tells the engine not to track (capture) that sub-match. The engine likely also has a hard limit on tracked sub-matches, to prevent an expression from becoming overly complicated with sub-matches that don't need to be tracked (which can result in missing matches). So never start a sub-match with just a parenthesis unless you need the captured text; use "(?:", or another more specific modifier such as "(?i:".
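
To illustrate the difference between a tracked and an untracked sub-match, here is a small sketch in Python's re module, which uses the same "(?:" syntax as PCRE for a non-capturing group. The phone number is the one from this thread; the surrounding text is invented.

```python
import re

# Bare parentheses: every group is tracked (captured)
capturing = re.compile(r"(206)[-. ]?(888)[-. ]?(2083)")

# "(?:" groups: same matching behavior, nothing is tracked
non_capturing = re.compile(r"(?:206)[-. ]?(?:888)[-. ]?(?:2083)")

text = "Call 206-888-2083 now"
m1 = capturing.search(text)
m2 = non_capturing.search(text)

# Both find the same span of text; only the first stores sub-matches
tracked = m1.groups()
untracked = m2.groups()
```

Both patterns match the same text, so nothing is lost by switching unless you actually read the sub-matches afterwards.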

A good thing to remember when dealing with regex and E-mail is that the text you are matching can be interrupted by code breaks (HTML tags, as in <CODE>888</CODE>), by line breaks, and by quoted-printable encoding. For instance, between every two characters that display immediately together and that you are attempting to match without normalizing, you would need to test for:

   (?:=\r\n|(?:<[^>]+>)+)
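
As a rough sketch of what that means in practice, the Python snippet below builds a pattern that allows a quoted-printable soft line break ("=" followed by CRLF) or any run of HTML tags between every pair of characters. The names FILLER and tolerant_pattern are made up for this example, and making the filler optional with "*" is my own choice.

```python
import re

# Filler allowed between two adjacent visible characters: a quoted-printable
# soft line break ("=" then CRLF) or an HTML tag, repeated zero or more times.
FILLER = r"(?:=\r\n|<[^>]+>)*"

def tolerant_pattern(literal):
    """Join the escaped characters of `literal` with the optional filler."""
    return FILLER.join(re.escape(ch) for ch in literal)

pattern = re.compile(tolerant_pattern("2068882083"))

# Hypothetical message fragments
with_tags_and_qp = "206<b></b>888=\r\n2083"
plain = "2068882083"
with_spaces = "206 888 2083"
```

Note that spaces are still not handled here, which is exactly where the number of permutations starts to grow.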

It gets a lot worse when you start trying to allow for spaces, because of all the ways that they can appear. If Declude wants to get serious about applying regular expressions to the bodies of E-mail, it would need to normalize the data; otherwise you end up with too many permutations. When I do this programmatically, I produce a range of variables: for instance, one that is the full original source, one that strips out all line breaks, one that removes quoted-printable encoding, one that removes HTML, and combinations thereof. If you are going to use regular expressions to find phrases, this is the only way to do it without leaving a gaping hole, because even standard E-mail clients produce source that would otherwise be missed. If you are going after E-mail format and not the content, then what you have is perfect.
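
A minimal sketch of that kind of normalization, in Python, might look like the following. It is not how Declude works and not my actual code; the function name variants, the dictionary keys, and the cumulative order of the steps are assumptions for illustration. It uses the standard quopri module for quoted-printable decoding and a deliberately crude regex for stripping tags.

```python
import quopri
import re

def variants(source):
    """Return several normalized views of a raw message body.

    Assumes `source` is a str that can be encoded as latin-1.
    Each step builds on the previous one.
    """
    # Remove quoted-printable encoding (soft line breaks, =XX escapes)
    decoded = quopri.decodestring(source.encode("latin-1")).decode("latin-1")
    # Remove HTML tags (crude: anything between "<" and ">")
    no_html = re.sub(r"<[^>]+>", "", decoded)
    # Remove all remaining line breaks
    no_breaks = re.sub(r"[\r\n]+", "", no_html)
    return {
        "original": source,
        "decoded": decoded,
        "no_html": no_html,
        "no_breaks": no_breaks,
    }

# Hypothetical body with tags and a soft line break splitting the number
body = "Call 206<b></b>888=\r\n2083 today"
views = variants(body)
```

A simple pattern such as 2068882083 can then be tested against each view instead of trying to cover every permutation in one expression.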

Matt




David Barker wrote:
This would match all of the examples you have provided. The . means any character, including a space, and {0,1} means a minimum of 0 and a maximum of 1:

(206.{0,1}888.{0,1}2083)

If you also wanted to detect the letter O as well as the digit 0, use [o0]; you could also use ?i: to make the match case-insensitive:

(?i:2[o0]6.{0,1}888.{0,1}2[o0]83)
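
For what it is worth, the expression above can be checked against the five BODY lines from the original question. This sketch uses Python's re module, which accepts the same (?i:...) syntax from Python 3.6 onward; whether Declude's PCRE build behaves identically is an assumption.

```python
import re

# The case-insensitive expression from this message
DAVID = re.compile(r"(?i:2[o0]6.{0,1}888.{0,1}2[o0]83)")

# The five CONTAINS strings from the original question
SAMPLES = [
    "206 888-2083",
    "206.8882083",
    "2068882083",
    "206-8882083",
    "206 8882083",
]

all_match = all(DAVID.search(s) is not None for s in SAMPLES)

# Letter O in place of zero is caught; two separators in a row are not
letter_o = DAVID.search("2O6 888 2o83") is not None
double_space = DAVID.search("206  888-2083") is not None
```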

David B

------------------------------------------------------------------------
*From*: Matt <[EMAIL PROTECTED]>
*Sent*: Tuesday, July 03, 2007 4:08 PM
*To*: declude.junkmail@declude.com
*Subject*: Re: [Declude.JunkMail] phone regex/pcre help

Scott,

The following should do the same. Note that I do not know if Declude requires the whole match to be placed in parentheses.

    2[0Oo]6[\s\r\n\-\.]*888[\s\r\n\-\.]*2[0Oo]83
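
Here is a short check of that expression in Python's re module (used as a stand-in for PCRE, which is an assumption). Because the separator class is followed by "*", it tolerates any number of spaces, dashes, dots, and line breaks between the digit groups. The last three test strings are invented.

```python
import re

# The expression from this message
MATT = re.compile(r"2[0Oo]6[\s\r\n\-\.]*888[\s\r\n\-\.]*2[0Oo]83")

# The five CONTAINS strings from the original question
SAMPLES = [
    "206 888-2083",
    "206.8882083",
    "2068882083",
    "206-8882083",
    "206 8882083",
]

all_match = all(MATT.search(s) is not None for s in SAMPLES)

# Several separators in a row, including a line break, still match
spread_out = MATT.search("206 - 888\r\n2083") is not None

# Characters outside the class (underscore, parenthesis) block the match
underscores = MATT.search("206_888_2083") is not None
parens = MATT.search("(206) 888-2083") is not None
```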

Matt



Scott Fisher wrote:

I'm looking to replace these lines with a pcre but it doesn't seem to be working. Any suggestions?

BODY 175 CONTAINS 206 888-2083

BODY 175 CONTAINS 206.8882083

BODY 175 CONTAINS 2068882083

BODY 175 CONTAINS 206-8882083

BODY 175 CONTAINS 206 8882083

BODY 175 PCRE (?i:[\(\{]?2[0o]6[\)\}]?{\-\_\.\s}?888{\-\_\.\s}?2[0o]83)

Scott Fisher

Dir of IT

Farm Progress Companies

191 S Gary Ave

Carol Stream, IL 60188

Tel: 630-462-2323

/This email message, including any attachments, is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Although Farm Progress Companies has taken reasonable precautions to ensure no viruses are present in this email, the company cannot accept responsibility for any loss or damage arising from the use of this email or attachments./


---
This E-mail came from the Declude.JunkMail mailing list. To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail". The archives can be found
at http://www.mail-archive.com.
