Re: [Declude.JunkMail] PCRE FILTERING

2007-03-16 Thread John Olden
Would anyone be willing to share their regular expression files (or lines) 
with the group?
I know this will be a valuable addition to Declude, but most of us don't 
want to (or know how to) reinvent the wheel.

Thanks.
--
John Olden - Technology Manager
Champaign Park District


---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type unsubscribe Declude.JunkMail.  The archives can be found
at http://www.mail-archive.com.



Re: [Declude.JunkMail] PCRE FILTERING

2007-03-16 Thread Gary Steiner
Here are some web pages you might check out:

http://www.cecilw.com/eudora/regexp.htm

http://www.adamlyon.com/spam/spam_filter_regex.html

http://www.adamlyon.com/spam/afo.txt

http://trac.edgewall.org/wiki/BadContent

http://www.regexlib.com/

Hopefully at some point Declude will post a list of good examples on their web 
site.

Gary






Re[2]: [Declude.JunkMail] PCRE FILTERING

2007-03-16 Thread Sanford Whiteman
> Hopefully at some point Declude will post a list of good examples on
> their web site.

I hope people aren't ignoring the ridiculously profuse SpamAssassin
Rules Emporium, SA built-in rules, etc.

--Sandy



Sanford Whiteman, Chief Technologist
Broadleaf Systems, a division of
Cypress Integrated Systems, Inc.
e-mail: [EMAIL PROTECTED]

SpamAssassin plugs into Declude!
  http://www.imprimia.com/products/software/freeutils/SPAMC32/download/release/

Defuse Dictionary Attacks: Turn Exchange or IMail mailboxes into IMail Aliases!
  http://www.imprimia.com/products/software/freeutils/exchange2aliases/download/release/
  http://www.imprimia.com/products/software/freeutils/ldap2aliases/download/release/






RE: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread David Barker
Yes, I noticed that, which is why I used a weight of 3 rather than 5 as for the
others. I guess one way to deal with this would be:

#FP ADJUSTMENTS
ANYWHERE    -3    CONTAINS    classifieds

Or

ANYWHERE    END    CONTAINS    classifieds
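
For anyone who wants to try the counter-weight idea outside Declude, here is a minimal Python sketch. It is not Declude's scoring code. The archive masked part of the CIALIS expression as "[EMAIL PROTECTED]", so the `[a@]` class is only a guess at the missing piece, and I am assuming CONTAINS behaves like a case-insensitive substring test.

```python
import re

# Assumption: "[a@]" is a guess at the part of the pattern the archive
# replaced with "[EMAIL PROTECTED]"; it is not the confirmed original.
CIALIS = re.compile(r"(?i:\bc.{0,2}[|li1í!].{0,2}[a@].{0,2}[|li1í!].{0,2}[|i1í!].{0,2}s)")

def score(text):
    """Sum the weights of every rule that fires, as a weighted filter file would."""
    total = 0
    if CIALIS.search(text):
        total += 3                      # ANYWHERE  3  PCRE ...
    if "classifieds" in text.lower():   # assumed: case-insensitive substring test
        total -= 3                      # ANYWHERE -3  CONTAINS classifieds
    return total
```

With these two rules a message that only mentions classifieds nets out at zero, while a real obfuscated hit still scores 3.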

David Barker
Director of Product Management
Your Email security is our business
978.499.2933 office
978.988.1311 fax
[EMAIL PROTECTED]
 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nick
Hayer
Sent: Wednesday, March 14, 2007 9:14 AM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] PCRE FILTERING

fyi -
> #CIALIS
> ANYWHERE 3 PCRE (?i:\bc.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)

This one will false positive on "classifieds"

-Nick









Re: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread Scott Fisher

I'm seeing hits in the attachments too.
Triggered ANYWHERE PCRE filter REGEX-KEYWORDS : vHXAH51eG1ujzM   (valium)

It would be really nice to be able to search the body without the attachments,
like this:

BODYONLY 25 PCRE (?i:v.{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}[vu].{0,2}m)

Being able to search the body without the attachments would also be a time 
saver on those BODY filters.
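
To show why a loose pattern fires inside encoded attachments, here is a small Python sketch run against the string from the log line above. The archive masked part of the expression, so `[a@]` is my guess at the missing character class; treat the pattern as a stand-in rather than the exact filter.

```python
import re

# Assumption: "[a@]" stands in for the part of the pattern the archive
# masked as "[EMAIL PROTECTED]"; the real class may differ.
VALIUM = re.compile(r"(?i:v.{0,2}[a@].{0,2}[|li1í!].{0,2}[|i1í!].{0,2}[vu].{0,2}m)")

def hits(text):
    """Return every substring the loose pattern matches."""
    return [m.group(0) for m in VALIUM.finditer(text)]
```

Each `.{0,2}` gap accepts any two characters, so a run of base64 text only needs the six anchor characters in roughly the right spacing to count as a hit.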




- Original Message - 
From: David Barker [EMAIL PROTECTED]

To: declude.junkmail@declude.com
Sent: Tuesday, March 13, 2007 11:24 AM
Subject: [Declude.JunkMail] PCRE FILTERING


I wanted to give a sample of how the new regular expressions are identifying
patterns. Here is a log snippet for a few drug patterns:

ANYWHERE PCRE filter FILTER-DRUGS : C1al.is [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : C1alis is [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED] [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cia1is s [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cial1s S [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cialiis [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : CIALIS [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Cialis S [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : H,G,H [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : HGH [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Human Growth Hormone [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : HxGxH [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED] [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Leviitra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levitra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levitra a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Levltra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : v!Agr@ a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V_I_A_G_R_A [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : v|aGR@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V1agr@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : V1agra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Val1um [weight - 1]
ANYWHERE PCRE filter FILTER-DRUGS : [EMAIL PROTECTED]@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Vi[agra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Via gra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagr@ a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagra [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagra a [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Viagraa [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : VlAGR@ [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : VlAGRA [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Xanax [weight - 5]
ANYWHERE PCRE filter FILTER-DRUGS : Xanaxx [weight - 5]

These are the expressions I am using. As I am still on a learning curve,
these expressions may be improved and become more accurate. While testing, I
score relatively low just in case of FPs. I use a tool called baregrep
(http://www.baremetalsoft.com/baregrep/), which speeds through huge DEBUG logs
pulling out the entries I am looking for. Hope this helps get you started with
PCRE; I think the Declude community can receive great value from sharing
this type of info.

#CIALIS
ANYWHERE 3 PCRE (?i:\bc.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)

#HGH
ANYWHERE 5 PCRE (?i:\b(?:human growth hormone|(?-i:HGH)|H.G.H)\b)

#LEVITRA
ANYWHERE 5 PCRE (?i:\bl.{0,2}e.{0,2}v.{0,2}[\|li1í\!].{0,2}t.{0,2}r.{0,[EMAIL PROTECTED])

#VIAGRA
ANYWHERE 5 PCRE (?i:v.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}g.{0,2}r.{0,[EMAIL PROTECTED])

#XANAX
ANYWHERE 5 PCRE (?i:x.{0,[EMAIL PROTECTED],2}n.{0,[EMAIL PROTECTED],2}x)
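
The HGH expression is the only one in this list that the archive did not mask, so it can be tried as written. A quick Python sketch follows; Python's `re` accepts the same scoped `(?i:...)` and `(?-i:...)` groups, though I have not compared every PCRE detail, and the sample strings are taken from the log above plus two negatives of my own.

```python
import re

# The HGH expression from the list above, on one line.
HGH = re.compile(r"(?i:\b(?:human growth hormone|(?-i:HGH)|H.G.H)\b)")

samples = ["HGH", "H,G,H", "HxGxH", "Human Growth Hormone", "hgh", "highway"]

# Map each sample to True/False depending on whether the filter would fire.
results = {s: bool(HGH.search(s)) for s in samples}
```

The `(?-i:HGH)` group switches case sensitivity back on for the bare acronym, which is why lower-case "hgh" is not flagged while "H,G,H" and "HxGxH" are.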

David Barker
Director of Product Management
Your Email security is our business
978.499.2933 office
978.988.1311 fax
[EMAIL PROTECTED]






Re: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread Scott Fisher

also:
Capital Firms
cycle analysis
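
A rough Python check of these false positives, for anyone who wants to reproduce them. The archive masked part of the CIALIS expression as "[EMAIL PROTECTED]"; the `[a@]` class is my assumption for the missing piece, chosen because it is consistent with all three reported phrases.

```python
import re

# Assumption: "[a@]" fills the spot the archive masked as "[EMAIL PROTECTED]".
CIALIS = re.compile(r"(?i:\bc.{0,2}[|li1í!].{0,2}[a@].{0,2}[|li1í!].{0,2}[|i1í!].{0,2}s)")

phrases = ["classifieds", "Capital Firms", "cycle analysis"]

# Keep only the phrases the filter would (wrongly) flag.
false_positives = [p for p in phrases if CIALIS.search(p)]
```

All three phrases contain c, l or i, a, l, i, s in order with at most two characters between each, which is all the expression asks for.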

- Original Message - 
From: Nick Hayer [EMAIL PROTECTED]

To: declude.junkmail@declude.com
Sent: Wednesday, March 14, 2007 8:14 AM
Subject: Re: [Declude.JunkMail] PCRE FILTERING



fyi -

> #CIALIS
> ANYWHERE 3 PCRE (?i:\bc.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|i1í\!].{0,2}s)

This one will false positive on "classifieds"

-Nick









RE: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread David Barker
We can certainly look at doing something like that. Currently I am using
this line:

BODY    END    CONTAINS    Content-Transfer-Encoding: base64
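
As a sketch of what that line does, here is the same short-circuit in Python. This is only an illustration of the idea, not Declude's implementation: I am assuming END stops the rest of the filter file as soon as its test matches, and the xanax rule is a made-up stand-in for the real drug patterns.

```python
import re

def run_filter(body, rules):
    """Apply rules in order; a rule with weight "END" stops the filter when it matches."""
    total = 0
    for weight, pattern in rules:
        if re.search(pattern, body, re.IGNORECASE):
            if weight == "END":
                break              # skip the remaining lines of this filter
            total += weight
    return total

rules = [
    ("END", re.escape("Content-Transfer-Encoding: base64")),
    (5, r"\bxanaxx?\b"),           # hypothetical stand-in for a drug pattern
]
```

The trade-off is that any message carrying a base64 part skips the whole filter, including its plain-text body.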

David 




RE: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread David Barker
I find that the CIALIS pattern on its own does tend to match on some weird combos
more than the other drugs. Give this one a try:

BODY    5    PCRE (?im:c.{0,2}[\|li1í\!].{0,[EMAIL PROTECTED],2}[\|li1í\!].{0,2}[\|li1í\!].{0,2}s+.{0,30}?(\$\d{1,4}(\.|,)\d{1,4}))

Basically it looks for Cialis followed by some sort of $ amount.
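
Here is a rough Python rendering of that idea for experimenting. The archive masked part of the expression, so `[a@]` is an assumed replacement for the missing class, and `price_hit` is just a helper name for this sketch.

```python
import re

# Assumption: "[a@]" replaces the portion masked as "[EMAIL PROTECTED]".
CIALIS_PRICE = re.compile(
    r"(?im:c.{0,2}[|li1í!].{0,2}[a@].{0,2}[|li1í!].{0,2}[|li1í!].{0,2}s+"
    r".{0,30}?(\$\d{1,4}(\.|,)\d{1,4}))"
)

def price_hit(text):
    """Return the dollar amount found within 30 characters of a Cialis-like token, or None."""
    m = CIALIS_PRICE.search(text)
    return m.group(1) if m else None
```

Requiring a price nearby removes the bare "classifieds" hit, but a classified ad that quotes a price within 30 characters would still match, so a low weight during testing seems prudent.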

David




Re: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread Matt

Dave,

This was an old, old feature request/bug fix from back in the Scott 
days, where it was desired not to include encoded base64 content in BODY 
searches (decoded content was desired).  The workaround for this is to 
add a separator to the end of the filter such as a period, comma, space, 
tab, or left HTML bracket.
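
A small Python sketch of the separator workaround, applied to a regex rather than a plain CONTAINS line. None of the listed separators occurs in the base64 alphabet, so demanding one after the match keeps the pattern from firing in the middle of an encoded attachment. The `[a@]` class is my guess at the part of the valium expression the archive masked.

```python
import re

# Assumption: "[a@]" stands in for the piece the archive masked in the original pattern.
LOOSE = r"v.{0,2}[a@].{0,2}[|li1í!].{0,2}[|i1í!].{0,2}[vu].{0,2}m"

loose = re.compile("(?i:" + LOOSE + ")")
# Same pattern, but the match must be followed by a period, comma, space, tab or "<".
with_separator = re.compile("(?i:" + LOOSE + ")" + r"[., \t<]")

def fires(pattern, text):
    """True when the compiled pattern matches anywhere in text."""
    return pattern.search(text) is not None
```

The cost is that a word at the very end of a line, with nothing but a line break after it, no longer matches.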


It would also help to specify what format the BODY data comes in. 
For instance, is a line break in the original processed by the regular 
expression as a line break?  It would be hugely beneficial to regular 
expressions to take the BODY content and strip out all line breaks, 
replacing them with spaces for the purpose of filtering with regex.  
Maybe it is time to create another variable for body content that is 
more regex friendly?  That should be easy enough to do.
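
To make the line-break point concrete, here is a minimal Python sketch. `flatten_body` is a hypothetical helper, not a Declude variable, and the HGH expression is the one posted earlier in the thread.

```python
import re

def flatten_body(body):
    """Replace each line break (plus adjacent spaces/tabs) with a single space."""
    return re.sub(r"[ \t]*\r?\n[ \t]*", " ", body)

HGH = re.compile(r"(?i:\b(?:human growth hormone|(?-i:HGH)|H.G.H)\b)")

# A phrase that wraps across a CRLF, as it might in a plain-text message.
wrapped = "Ask about human growth\r\nhormone today"
```

Searching `wrapped` directly misses the phrase because the expression expects a literal space, while searching `flatten_body(wrapped)` finds it.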


Matt




RE: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread Colbeck, Andrew
> This was an old, old feature request/bug fix from back in the 
> Scott days, where it was desired not include encoded base64 

I requested this as a change long ago for two reasons:

1) To avoid false positives where search text matches the MIME or UUENCODE 
formatting

2) To provide an instant speed up in BODY and ANYWHERE processing because 
Declude has less text to match, in particular when the MIME-encoded text being 
searched is, say, an encoded PDF, DOC or JPG.

It may also have the additional benefit of being more accurate:

3) To provide for fewer false negatives, because the string size is more 
complete with the body text.

I don't know how it was truly programmed, but the operational explanation from 
Scott years ago was that Declude decodes the message and strips various formattings, 
concatenates it all into a very large string, and that is what the BODY and 
ANYWHERE filters search against.

This lets Declude do a BODY match where the text is obfuscated inside of HTML, 
because the HTML tags are stripped, and likewise, should catch a phrase which 
is split by a linefeed.

I recognized that this was a major coding change, but I thought it would be 
beneficial for power users to specify the layer at which the text searching 
is done, e.g.

Message         (Original message format with all the warts)
MessageFixed    (Illegal characters stripped and line formats fixed)
MessageDecoded  (MIME and UUENCODE converted back to 8 bit ASCII)
Text            (Only the text attachments specified, not graphics
                 and not documents or other binary attachments)
TextStripped    (HTML stripped out, white space collapsed)
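
As a rough illustration of the last layer only, here is a Python sketch of what a TextStripped view might look like. The function name and the naive tag regex are my own assumptions; this is not how Declude builds its search string.

```python
import re

def text_stripped(fragment):
    """Rough take on the TextStripped layer: drop HTML tags, then collapse whitespace."""
    no_tags = re.sub(r"<[^>]*>", "", fragment)   # naive tag removal, not a real HTML parser
    return re.sub(r"\s+", " ", no_tags).strip()
```

For example, `text_stripped("<p>Vi<b></b>ag<!-- x -->ra\n  now</p>")` yields `Viagra now`, so a plain pattern can catch a word that was split up with empty tags or a line break.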

I've removed HTML deobfuscation as a layer to this onion, as that is too 
specific a spammer technique, and is adequately covered by creative PCRE if 
the last two text layers are available.

The MessageDecoded layer is probably sufficiently represented by just the 
bones of the message, the text that makes up the framework of the message such 
as the header lines and the MIME Content-Type and boundary lines, without the 
actual text contents and without the attachments.

In the many years that I've used Declude (and been preceded by power users 
such as Sandy, Matt, and John [and superseded by Scott]) nobody has ever wanted 
to match text against the representation of an attachment, e.g. to match text 
against the representation of an executable, a specific virus, or the header of 
a TIFF file.

Andrew.




RE: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread John T (lists)
>> This was an old, old feature request/bug fix from back in the
>> Scott days, where it was desired not include encoded base64
>
> I requested this as a change long ago for two reasons:
>
> 1) To avoid false positives where search text matches the MIME or UUENCODE
> formatting
>
> 2) To provide an instant speed up in BODY and ANYWHERE processing because
> Declude has less text to match, in particular when MIME encoding text is
> being searched for, say, an encoded PDF, DOC or JPG.
>
> It may also have the additional benefit of being more accurate:
>
> 3) To provide for fewer false negatives, because the string size is more
> complete with the body text.

Giving a third to what Andrew and Matt have said, I have a client that deals
in electronic parts. Electronic part numbers take on all forms of sequences,
and not being able to limit body searches to non-base64 content (base64 being
primarily attachments) has caused a lot of extra work on my part, constantly
having to make adjustments to counter this problem.

Being able to have BODY not include attachments is coming to the point where
it is no longer a feature but a requirement.

John T







Re: [Declude.JunkMail] PCRE FILTERING

2007-03-14 Thread Matt
Just to clarify a bit on this, there is the conundrum regarding text or 
HTML base64 encoded attachments and other types of attachments, where you 
want to search the text and HTML stuff in decoded format, but not the 
image, application and other MIME types.  It is however less common to 
obfuscate with base64 encoding these days, so even a version without support 
for encoded text or HTML would still be of benefit.  It certainly could be 
done to support them though, with a little extra work to look at the MIME 
types.
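
For what it's worth, the "look at the MIME types" step can be sketched in a few lines with Python's standard `email` package. This is only an illustration of the idea, not a proposal for how Declude should do it; `searchable_text` is a hypothetical helper name.

```python
from email import message_from_string

def searchable_text(raw_message):
    """Collect decoded text/* parts; skip images, applications and other binary types."""
    msg = message_from_string(raw_message)
    chunks = []
    for part in msg.walk():
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)   # undoes base64 / quoted-printable
        charset = part.get_content_charset() or "utf-8"
        chunks.append(payload.decode(charset, errors="replace"))
    return "\n".join(chunks)
```

A base64-encoded text or HTML part is decoded before it is searched, while an image part never reaches the filters at all. An unknown charset name would need extra handling that this sketch leaves out.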


Matt


