Re: [Declude.JunkMail] PCRE FILTERING
Would anyone be willing to share their regular expressions files (lines) with the group? I know this will be a valuable addition to Declude but most of us don't want to (or know how to) re-invent the wheel. Thanks. -- John Olden - Technology Manager Champaign Park District --- This E-mail came from the Declude.JunkMail mailing list. To unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type unsubscribe Declude.JunkMail. The archives can be found at http://www.mail-archive.com.
Re: [Declude.JunkMail] PCRE FILTERING
Here are some web pages you might check out:

http://www.cecilw.com/eudora/regexp.htm
http://www.adamlyon.com/spam/spam_filter_regex.html
http://www.adamlyon.com/spam/afo.txt
http://trac.edgewall.org/wiki/BadContent
http://www.regexlib.com/

Hopefully at some point Declude will post a list of good examples on their web site.

Gary
Re[2]: [Declude.JunkMail] PCRE FILTERING
> Hopefully at some point Declude will post a list of good examples on their web site.

I hope people aren't ignoring the ridiculously profuse SpamAssassin Rules Emporium, SA built-in rules, etc.

--Sandy

Sanford Whiteman, Chief Technologist
Broadleaf Systems, a division of Cypress Integrated Systems, Inc.
e-mail: [EMAIL PROTECTED]

SpamAssassin plugs into Declude!
http://www.imprimia.com/products/software/freeutils/SPAMC32/download/release/

Defuse Dictionary Attacks: Turn Exchange or IMail mailboxes into IMail Aliases!
http://www.imprimia.com/products/software/freeutils/exchange2aliases/download/release/
http://www.imprimia.com/products/software/freeutils/ldap2aliases/download/release/
RE: [Declude.JunkMail] PCRE FILTERING
Yes, I noticed; that is why I used 3 rather than 5 as for the others. I guess one way to deal with this would be:

#FP ADJUSTMENTS
ANYWHERE -3 CONTAINS classifieds

Or

ANYWHERE END CONTAINS classifieds

David Barker
Director of Product Management
Your Email security is our business
978.499.2933 office
978.988.1311 fax
[EMAIL PROTECTED]

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nick Hayer
Sent: Wednesday, March 14, 2007 9:14 AM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] PCRE FILTERING

fyi -

#CIALIS
ANYWHERE 3 PCRE (?i:\bc.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)

This one will false positive on classifieds

-Nick
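The false positive Nick describes is easy to reproduce offline. The sketch below uses Python's `re` module rather than Declude itself, and the pattern is only a stand-in: the archive masked part of the posted expression with [EMAIL PROTECTED], so the `[a@]` class is an assumption made for illustration, not the original filter.

```python
import re

# Stand-in for the posted CIALIS filter. The archive masked part of the
# original expression, so the [a@] class below is an assumption.
LOOSE = re.compile(
    r"(?i:\bc.{0,2}[\|li1í\!].{0,2}[a@].{0,2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)"
)

def hits(pattern, samples):
    """Return the samples in which the pattern is found."""
    return [s for s in samples if pattern.search(s)]

spam = ["C1alis", "Cia1is", "CIALIS"]
ham = ["classifieds", "Capital Firms", "cycle analysis"]

print(hits(LOOSE, spam))  # obfuscated spellings the filter is meant to catch
print(hits(LOOSE, ham))   # ordinary phrases that the loose gaps also let through
```

With this stand-in, every entry in both lists matches, because each `.{0,2}` gap can absorb the extra letters in an ordinary word. That is consistent with scoring the test low or pairing it with a counter-weight line while testing.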
Re: [Declude.JunkMail] PCRE FILTERING
I'm seeing hits in the attachments too.

Triggered ANYWHERE PCRE filter REGEX-KEYWORDS : vHXAH51eG1ujzM (valium)

It would be real nice to be able to search the body without the attachments like this.

BODYONLY 25 PCRE (?i:v.{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}[vu].{0,2}m)

Being able to search the body without the attachments would also be a time saver on those BODY filters.

- Original Message -
From: David Barker [EMAIL PROTECTED]
To: declude.junkmail@declude.com
Sent: Tuesday, March 13, 2007 11:24 AM
Subject: [Declude.JunkMail] PCRE FILTERING

Wanted to give a sample of how the new Regular Expressions are identifying patterns, here is a log snip on a few patterns for Drugs:

ANYWHERE PCRE filter FILTER-DRUGS : C1al.is [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : C1alis is [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED] [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cia1is s [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cial1s S [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cialiis [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : CIALIS [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cialis S [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : H,G,H [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : HGH [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Human Growth Hormone [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : HxGxH [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED] [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Leviitra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levitra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levitra a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levltra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : v!Agr@ a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V_I_A_G_R_A [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : v|aGR@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V1agr@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V1agra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Val1um [weight - 1]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED]@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Vi[agra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Via gra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagr@ a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagra a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagraa [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : VlAGR@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : VlAGRA [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Xanax [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Xanaxx [weight - 5]

These are the expressions I am using - as I am still on a learning curve these expressions may be improved and become more accurate. While testing I score relatively low just in case of FP's. I use a tool called baregrep http://www.baremetalsoft.com/baregrep/ which speeds through huge DEBUG logs pulling out entries I am looking for. Hope this helps get you started with PCRE, I think the Declude community can receive great value from sharing this type of info.

#CIALIS
ANYWHERE 3 PCRE (?i:\bc.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)

#HGH
ANYWHERE 5 PCRE (?i:\b(?:human growth hormone|(?-i:HGH)|H.G.H)\b)

#LEVITRA
ANYWHERE 5 PCRE (?i:\bl.{0,2}e.{0,2}v.{0,2}[\|li1í\!].{0,2}t.{0,2}r.{0,[EMAIL PROTECTED])

#VIAGRA
ANYWHERE 5 PCRE (?i:v.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}g.{0,2}r.{0,[EMAIL PROTECTED])

#XANAX
ANYWHERE 5 PCRE (?i:x.{0,[EMAIL PROTECTED],2}n.{0,[EMAIL PROTECTED],2}x)

David Barker
Director of Product Management
Your Email security is our business
978.499.2933 office
978.988.1311 fax
[EMAIL PROTECTED]
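The attachment hit Scott reports can be checked the same way. This is a rough sketch in Python's `re`, not Declude's engine, and the `[a@]` class again stands in for the part of the valium expression that the archive masked, so treat the pattern as hypothetical.

```python
import re

# Stand-in for the quoted valium filter; [a@] replaces the masked portion
# of the original expression and is an assumption.
VALIUM = re.compile(
    r"(?i:v.{0,2}[a@].{0,2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}[vu].{0,2}m)"
)

# The string from the log line in the report, most likely a run of
# base64-encoded attachment data rather than readable text.
blob = "vHXAH51eG1ujzM"

match = VALIUM.search(blob)
print(match.group(0) if match else None)
```

Long runs of base64 contain nearly every short letter and digit sequence sooner or later, so any pattern built from small character classes with flexible gaps will eventually find a match in encoded attachment data.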
Re: [Declude.JunkMail] PCRE FILTERING
> This one will false positive on classifieds

also:

Capital Firms
cycle analysis
RE: [Declude.JunkMail] PCRE FILTERING
> It would be real nice to be able to search the body without the attachments

We can certainly look at doing something like that, currently I am using this line:

BODY END CONTAINS Content-Transfer-Encoding: base64

David
RE: [Declude.JunkMail] PCRE FILTERING
I find the CIALIS on its own does tend to match on some weird combos more than the other drugs. Give this one a try:

BODY 5 PCRE (?im:c.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|li1í\!].{0,2}s+.{0,30}?(\$\d{1,4}(\.|,)\d{1,4}))

Basically looking for Cialis with some sort of $ amount

David
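A quick way to see what the added price requirement buys is to run a pattern of the same shape against a few strings. The sketch uses Python's `re`, and the `[a@]` class is a stand-in for the masked part of the posted expression; the sample strings are invented for the test.

```python
import re

# Stand-in drug-name fragment ([a@] is assumed; the archive masked that part),
# followed by the price requirement from the message: one or more "s", up to
# 30 further characters, then a dollar amount such as $1.99 or $1,99.
PRICED = re.compile(
    r"(?im:c.{0,2}[\|li1í\!].{0,2}[a@].{0,2}[\|li1í\!].{0,2}[\|li1í\!].{0,2}s+"
    r".{0,30}?(\$\d{1,4}(\.|,)\d{1,4}))"
)

def price_hit(text):
    """Return the captured dollar amount, or None when nothing matches."""
    m = PRICED.search(text)
    return m.group(1) if m else None

print(price_hit("C1alis from $1.99 per dose"))
print(price_hit("see the classifieds page"))
print(price_hit("classifieds: ads from $12.50"))
```

The first string matches and the second does not, which is the intended narrowing. The third still matches, so under these assumptions the dollar-amount requirement reduces false positives on words like "classifieds" without removing them entirely.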
Re: [Declude.JunkMail] PCRE FILTERING
Dave,

This was an old, old feature request/bug fix from back in the Scott days, where it was desired not to include encoded base64 content on BODY searches (decoded content was desired). The workaround for this is to add a separator to the end of the filter, such as a period, comma, space, tab, or left HTML bracket.

It would also help to specify what format the BODY data would come in; for instance, is a line break in the original processed by the regular expression as a line break? It would be hugely beneficial to regular expressions to take the BODY content and strip out all line breaks, replacing them with spaces for the purpose of filtering with regex. Maybe it is time to create another variable for body content that is more regex friendly? That should be easy enough to do.

Matt
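The line-break point can be illustrated outside Declude. The sketch below is plain Python with a made-up body string; it says nothing about how Declude actually hands BODY data to the regex engine, which is the open question in the message.

```python
import re

def flatten(body: str) -> str:
    """Replace every run of CR/LF characters with a single space."""
    return re.sub(r"[\r\n]+", " ", body)

# Hypothetical body text in which a phrase is split across a line break.
body = "Order human growth\r\nhormone today"

PHRASE = re.compile(r"(?i:\bhuman growth hormone\b)")

print(bool(PHRASE.search(body)))           # the literal space cannot match CRLF
print(bool(PHRASE.search(flatten(body))))  # matches once the break is a space
```

An alternative would be to write `\s+` wherever a phrase contains a space, but that has to be repeated in every filter line, whereas flattening the text once keeps the expressions short.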
RE: [Declude.JunkMail] PCRE FILTERING
> This was an old, old feature request/bug fix from back in the Scott days, where it was desired not include encoded base64

I requested this as a change long ago for two reasons:

1) To avoid false positives where search text matches the MIME or UUENCODE formatting
2) To provide an instant speed up in BODY and ANYWHERE processing because Declude has less text to match, in particular when MIME encoding text is being searched for, say, an encoded PDF, DOC or JPG.

It may also have the additional benefit of being more accurate:

3) To provide for fewer false negatives, because the string size is more complete with the body text.

I don't know how it was truly programmed, but per the operational explanation from Scott years ago, Declude decodes the message and strips various formattings, concatenates it all into a very large string, and that is what the BODY and ANYWHERE filters search against. This lets Declude do a BODY match where the text is obfuscated inside of HTML, because the HTML tags are stripped, and likewise, should catch a phrase which is split by a linefeed.

I recognized that this was a major coding change, but I thought it would be beneficial for power users to specify the layer at which the text searching is done, e.g.

Message (Original message format with all the warts)
MessageFixed (Illegal characters stripped and line formats fixed)
MessageDecoded (MIME and UUENCODE converted back to 8 bit ASCII)
Text (Only the text attachments specified, not graphics and not documents or other binary attachments)
TextStripped (HTML stripped out, white space collapsed)

I've removed HTML deobfuscation as a layer to this onion, as that is too specific of a spammer technique, and is adequately covered by creative PCRE if the last two text layers are available.

The MessageDecoded layer is probably sufficiently represented by just the bones of the message, the text that makes up the framework of the message such as the header lines and the MIME Content-Type and boundary lines, without the actual text contents and without the attachments. In the many years that I've used Declude (and been preceded by power users such as Sandy, Matt, and John [and superseded by Scott]) nobody has ever wanted to match text against the representation of an attachment, e.g. to match text against the representation of an executable, a specific virus, or the header of a TIFF file.

Andrew.
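As a rough illustration of what the proposed TextStripped layer would hand to a filter, here is a small Python sketch using only the standard library. The function name `text_stripped` and the sample HTML are invented for the example, and nothing here reflects how Declude is actually implemented.

```python
import re
from html.parser import HTMLParser

class _TextOnly(HTMLParser):
    """Collect character data and drop tags and comments."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def text_stripped(html: str) -> str:
    """Hypothetical TextStripped layer: HTML removed, white space collapsed."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()

# A word broken up by empty tags and a comment, plus a stray line break.
sample = "<p>Vi<b></b>ag<!-- x -->ra\n   now</p>"
print(text_stripped(sample))
```

Once the tags are gone and the white space is collapsed, a plain word match is enough for this sample, which fits the argument that the last two layers plus ordinary PCRE would cover most HTML obfuscation.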
RE: [Declude.JunkMail] PCRE FILTERING
Giving a third to what Andrew and Matt have said, I have a client that deals in electronic parts. Electronic part numbers take on all forms of sequences, and not being able to limit body searches to non-base64 content (base64 being primarily attachments) has caused a lot of extra work on my part, constantly having to make adjustments to counter this problem.

Being able to have BODY not include attachments is coming to the point where it is no longer a feature but a requirement.

John T
Re: [Declude.JunkMail] PCRE FILTERING
Just to clarify a bit on this, there is the conundrum regarding text or HTML base64 encoded attachments and other types of attachments, where you want to search the text and HTML stuff in decoded format, but not the image, application and other MIME types. It is however less common to obfuscate with base64 encoding these days, so even without supporting encoded text or HTML, this would still be of benefit. It certainly could be done to support them, though, with a little extra work to look at the MIME types.

Matt
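The "look at the MIME types" idea can be sketched with Python's standard `email` package: decode parts whose main type is text, whatever their transfer encoding, and skip everything else. The helper names and the sample message are invented for illustration, and this is not a description of Declude's internals.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

def searchable_text(raw: bytes) -> str:
    """Return the decoded text/* parts only; other MIME types are skipped."""
    msg = message_from_bytes(raw, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        if part.get_content_maintype() == "text":
            parts.append(part.get_content())  # undoes base64 / quoted-printable
    return "\n".join(parts)

def build_sample() -> bytes:
    """Build a test message: base64-encoded text plus a fake image attachment."""
    msg = EmailMessage()
    msg["Subject"] = "sample"
    msg.set_content("cheap meds here", cte="base64")
    msg.add_attachment(b"\x89PNG fake image bytes", maintype="image", subtype="png")
    return msg.as_bytes()

raw = build_sample()
print(b"cheap meds" in raw)          # the phrase is hidden by base64 on the wire
print(searchable_text(raw).strip())  # but visible after decoding the text part
```

Filters run against the output of `searchable_text` would see the base64-encoded text part in decoded form and would never see the image data, which addresses both the false-positive and the speed concerns raised earlier in the thread.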