Coffey, Neal wrote:
I'm trying to create a rule to catch some of the prescription drug
references that come into our system.  We're not in pharmaceuticals, so
I'm not too concerned about false positives :)

Some examples of what I'm looking for (using an innocent drug so I don't
trip someone else's filters):

        ADVwIL
        ADxDVIL
        ADxV1L
        Advjjl

Have a look at the ReplaceTags plugin:
http://wiki.apache.org/spamassassin/ReplaceTags

Also, I have a script that generates a rule to catch a lot of this type of spam, in a similar manner to the ReplaceTags plugin:

http://sandgnat.com/cmos/cmos.jsp?words=advil&matchobfuonly=true&multigapenabled=true&multigap=2&duplicatecharsenabled=true&duplicatechars=2
I've come up with a rule that'll match every one of those instances, but
also has the unfortunate consequence of matching plain old "ADVIL":

        /A[a-z]?A?D[a-z]?D?V[a-z]?V?[Il1j][a-z]?[Il1j]?L[a-z]?L?/
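You probably want to add a negative lookahead, like so:
/(?!\badvil\b)A[a-z]?A?D[a-z]?D?V[a-z]?V?[Il1j][a-z]?[Il1j]?L[a-z]?L?/i
Before the rest of the pattern is tried, this looks ahead for \badvil\b; if the plain word is found there, the match fails at that position. Note the /i modifier: without it, the lowercase "advil" in the lookahead would not exclude "ADVIL", and your "Advjjl" example needs a case-insensitive match anyway.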
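
If you want to sanity-check the lookahead outside of SpamAssassin, here is a rough sketch in Python rather than Perl. It assumes the rule is meant to be case-insensitive (re.IGNORECASE standing in for /i), and the variable names are just mine for illustration:

```python
import re

# Body of the rule from the original post, unchanged.
BODY = r"A[a-z]?A?D[a-z]?D?V[a-z]?V?[Il1j][a-z]?[Il1j]?L[a-z]?L?"

# Assumption: case-insensitive matching, equivalent to Perl's /i.
original = re.compile(BODY, re.IGNORECASE)
guarded = re.compile(r"(?!\badvil\b)" + BODY, re.IGNORECASE)

# Same pattern with the lookahead but WITHOUT the case-insensitive flag.
guarded_case_sensitive = re.compile(r"(?!\badvil\b)" + BODY)

samples = ["ADVwIL", "ADxDVIL", "ADxV1L", "Advjjl", "ADVIL"]

# For each sample: (matched by original rule, matched by guarded rule)
results = {
    s: (bool(original.search(s)), bool(guarded.search(s)))
    for s in samples
}

for text, (hit_original, hit_guarded) in results.items():
    print(f"{text:8} original={hit_original} guarded={hit_guarded}")
```

The four obfuscated samples should still hit with the lookahead in place, while plain "ADVIL" should only hit the original rule. The case-sensitive variant is there to show why the flag matters: its lowercase lookahead never sees "ADVIL", so that word still gets through.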

