Re: List of "banned" words/bounce to sender

2010-08-11 Thread Martin Gregorie
On Tue, 2010-08-10 at 19:24 -0700, jdow wrote: > From: "Martin Gregorie" > Sent: Monday, 2010/August/09 18:08 > > > > On Mon, 2010-08-09 at 17:42 -0700, jdow wrote: > >> From: "Martin Gregorie" > >> > Something like this will match a sequence of two capitalised name > >> > words, > >> > includ

Re: List of "banned" words/bounce to sender

2010-08-10 Thread jdow
From: "Martin Gregorie" Sent: Monday, 2010/August/09 18:08 On Mon, 2010-08-09 at 17:42 -0700, jdow wrote: From: "Martin Gregorie" > Something like this will match a sequence of two capitalised name > words, > including hyphenated ones, and extract the name words: > > /([A-Z][-a-zA-Z]{1,20}

Re: List of "banned" words/bounce to sender

2010-08-10 Thread John Hardin
On Tue, 10 Aug 2010, Henrik K wrote: On Tue, Aug 10, 2010 at 07:37:32AM -0700, John Hardin wrote: On Tue, 10 Aug 2010, Henrik K wrote: Ok I did some more testing since this is an interesting experiment.. I dumped 15000 mail bodies into a file like SA sees them and fed it to a simple Perl sc

Re: List of "banned" words/bounce to sender

2010-08-10 Thread Henrik K
On Tue, Aug 10, 2010 at 07:37:32AM -0700, John Hardin wrote: > On Tue, 10 Aug 2010, Henrik K wrote: > > >Ok I did some more testing since this is an interesting experiment.. > > > >I dumped 15000 mail bodies into a file like SA sees them and > >fed it to a simple Perl script. > > > >Runtime for d

Re: List of "banned" words/bounce to sender

2010-08-10 Thread John Hardin
On Tue, 10 Aug 2010, Henrik K wrote: Ok I did some more testing since this is an interesting experiment.. I dumped 15000 mail bodies into a file like SA sees them and fed it to a simple Perl script. Runtime for different methods (memory used including Perl itself): - Single 7 name rege

Re: List of "banned" words/bounce to sender

2010-08-10 Thread Henrik K
On Tue, Aug 10, 2010 at 01:35:55PM +0300, Henrik K wrote: > > Big help was page 20+: > > > Basically you need to do something like: > > $pat = qr/\b(([a-z][-a-z]{2,15}[a-z]),? ([a-z][-a-z]{2,15}[

Re: List of "banned" words/bounce to sender

2010-08-10 Thread Henrik K
On Tue, Aug 10, 2010 at 10:47:15AM +0100, Martin Gregorie wrote: > On Tue, 2010-08-10 at 11:19 +0300, Henrik K wrote: > > Runtime for different methods (memory used including Perl itself): > > > > - Single 7 name regex, 20s (8MB) > > - 7 regexes of 1 names each, 141s (9MB) > > - "Martin st

Re: List of "banned" words/bounce to sender

2010-08-10 Thread Martin Gregorie
On Tue, 2010-08-10 at 11:19 +0300, Henrik K wrote: > Runtime for different methods (memory used including Perl itself): > > - Single 7 name regex, 20s (8MB) > - 7 regexes of 1 names each, 141s (9MB) > - "Martin style", lookups from Perl hash, 8s (12MB) > Very interesting indeed. Thanks fo
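A rough illustration of the hash-lookup idea from the timings above, written in Python rather than the Perl used in the thread. The snippets here do not show the actual script, so this is only an assumed reading of the approach: a generic pattern pulls out every pair of capitalised words, and each pair is checked against an in-memory set. The contents of BANNED and the helper banned_hits are hypothetical.

```python
import re

# Generic "two capitalised words" pattern, wrapped in a lookahead so that
# overlapping pairs ("Dear John" and "John Doe") are both examined.
NAME_PAIR = re.compile(
    r"(?=\b([A-Z][-a-zA-Z]{1,20})\s([A-Z][-a-zA-Z]{1,20})\b)"
)

# Hypothetical name list; in the thread this would come from a database dump.
BANNED = {("mary", "smith"), ("john", "doe")}


def banned_hits(text, banned=BANNED):
    """Return the lower-cased (first, last) pairs in text that are in banned."""
    hits = []
    for first, last in NAME_PAIR.findall(text):
        key = (first.lower(), last.lower())
        if key in banned:  # set membership: one hash lookup per candidate
            hits.append(key)
    return hits


hits = banned_hits("Dear John Doe, your results are ready.")
```

The cost per message depends on how many capitalised word pairs it contains, not on how long the name list is, which is one plausible reason a lookup of this kind could outrun a very large alternation.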

Re: List of "banned" words/bounce to sender

2010-08-10 Thread Henrik K
On Tue, Aug 10, 2010 at 02:08:28AM +0100, Martin Gregorie wrote: > On Mon, 2010-08-09 at 17:42 -0700, jdow wrote: > > From: "Martin Gregorie" > > > Something like this will match a sequence of two capitalised name words, > > > including hyphenated ones, and extract the name words: > > > > > > /([A

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Martin Gregorie
On Mon, 2010-08-09 at 17:42 -0700, jdow wrote: > From: "Martin Gregorie" > > Something like this will match a sequence of two capitalised name words, > > including hyphenated ones, and extract the name words: > > > > /([A-Z][-a-zA-Z]{1,20})\s([A-Z][-a-zA-Z]{1,20})/ > > > > and should be fairly eas
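As an illustration only, the pattern quoted above can be tried out in Python (the thread itself is about Perl and SpamAssassin rules). The function name candidate_names and the sample sentence are made up for this sketch.

```python
import re

# Python transcription of the quoted pattern: two capitalised words, each
# 2 to 21 characters long, hyphens allowed, separated by one whitespace char.
NAME_PAIR = re.compile(r"([A-Z][-a-zA-Z]{1,20})\s([A-Z][-a-zA-Z]{1,20})")


def candidate_names(text):
    """Return (first, second) tuples for each non-overlapping match in text."""
    return NAME_PAIR.findall(text)


sample = "Please send the chart for Mary Smith-Jones to the ward."
pairs = candidate_names(sample)
```

Because the matches do not overlap, a capitalised word directly in front of a name (for example "Dear John Doe") would pair up with the first name and hide the real pair, so a real rule would probably need to account for that.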

Re: List of "banned" words/bounce to sender

2010-08-09 Thread jdow
From: "Martin Gregorie" Sent: Monday, 2010/August/09 15:45 On Mon, 2010-08-09 at 07:28 -0500, Daniel McDonald wrote: So, you are recommending that he use a plugin to query 70,000 records from a database, and perform 140,000 body matches, for every e-mail message he receives? It should be p

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Martin Gregorie
On Mon, 2010-08-09 at 07:28 -0500, Daniel McDonald wrote: > So, you are recommending that he use a plugin to query 70,000 records from a > database, and perform 140,000 body matches, for every e-mail message he > receives? > It should be possible to write a rule that recognises names (initials + c

Re: List of "banned" words/bounce to sender

2010-08-09 Thread jdow
From: "Daniel McDonald" Sent: Monday, 2010/August/09 05:28 On 8/9/10 6:58 AM, "Martin Gregorie" wrote: On Mon, 2010-08-09 at 14:17 +0300, Henrik K wrote: On Mon, Aug 09, 2010 at 11:38:50AM +0100, Martin Gregorie wrote: On Thu, 2010-08-05 at 14:00 -0500, Matthew Kitchin (public/usenet) wro

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Matthew Kitchin (public/usenet)
On 8/9/2010 8:27 AM, Henrik K wrote: Nope, people constantly underestimate the power of regexes.. of course you can easily make bad ones, but Perl can run huge lists of simple alternations FAST. I downloaded a 1 random name pack, and made a quick hack to regexify it with my favourite Regexp

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Henrik K
On Mon, Aug 09, 2010 at 07:28:42AM -0500, Daniel McDonald wrote: > > This technique might cut down the number of rules by 93.5%, but then you > have to do database lookups and some fancy parsing to verify the hit. > Don't know if that would be worth it. Nope, people constantly underestimate the p
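A minimal sketch of what "regexifying" a name list into one alternation might look like, in Python rather than Perl. The module mentioned above is cut off in this snippet, so nothing here reproduces it; this simply joins escaped names with "|". The function build_name_regex and the three sample names are hypothetical.

```python
import re


def build_name_regex(names):
    """Compile one case-insensitive alternation that matches any listed name."""
    # Longest names first so a longer name wins over a shorter one that is
    # a prefix of it.
    ordered = sorted(set(names), key=len, reverse=True)
    body = "|".join(re.escape(name) for name in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


names = ["Mary Smith", "John Doe", "Ann Lee-Wong"]
pattern = build_name_regex(names)
match = pattern.search("The chart for john doe is attached.")
```

A plain join like this does no prefix merging or other optimisation, so how it scales to a list of tens of thousands of names would have to be measured.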

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Daniel McDonald
On 8/9/10 6:58 AM, "Martin Gregorie" wrote: > On Mon, 2010-08-09 at 14:17 +0300, Henrik K wrote: >> On Mon, Aug 09, 2010 at 11:38:50AM +0100, Martin Gregorie wrote: >>> On Thu, 2010-08-05 at 14:00 -0500, Matthew Kitchin (public/usenet) >>> wrote: Thanks. We are looking at roughly 70,000 name

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Martin Gregorie
On Mon, 2010-08-09 at 14:17 +0300, Henrik K wrote: > On Mon, Aug 09, 2010 at 11:38:50AM +0100, Martin Gregorie wrote: > > On Thu, 2010-08-05 at 14:00 -0500, Matthew Kitchin (public/usenet) > > wrote: > > > Thanks. We are looking at roughly 70,000 names and always growing. If I > > > gave it suffic

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Henrik K
On Mon, Aug 09, 2010 at 11:38:50AM +0100, Martin Gregorie wrote: > On Thu, 2010-08-05 at 14:00 -0500, Matthew Kitchin (public/usenet) > wrote: > > Thanks. We are looking at roughly 70,000 names and always growing. If I > > gave it sufficient hardware, would you expect that to be practical, or > >

Re: List of "banned" words/bounce to sender

2010-08-09 Thread Martin Gregorie
On Thu, 2010-08-05 at 14:00 -0500, Matthew Kitchin (public/usenet) wrote: > Thanks. We are looking at roughly 70,000 names and always growing. If I > gave it sufficient hardware, would you expect that to be practical, or > is that totally ridiculous? Any options for a database look up here? > I'd

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Dominic Benson
On 5 Aug 2010, at 20:13, Matthew Kitchin (public/usenet) wrote: > On 8/5/2010 2:10 PM, Noel Jones wrote: >> >> Use your database to generate rules for clamav. You could even remove >> the stock clamav rules if you want. Matching the body for 70,000 >> names would probably take less than 0.1 se

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
On 8/5/2010 2:10 PM, Noel Jones wrote: Use your database to generate rules for clamav. You could even remove the stock clamav rules if you want. Matching the body for 70,000 names would probably take less than 0.1 seconds. That sounds like a really good idea. I do use ClamAV but have never w

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
On 8/5/2010 2:05 PM, Bowie Bailey wrote: I would tend to say that something that large would not be practical. On the other hand, there's no way to really know until you try it. A database lookup is possible, but the problem is determining what to look up. You would have to somehow identify po

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Noel Jones
On Thu, Aug 5, 2010 at 2:00 PM, Matthew Kitchin (public/usenet) wrote: >  On 8/5/2010 1:52 PM, Bowie Bailey wrote: >> >> My approach to doing something like this would be to have a rule that >> matches the names (however you implement it), and then have the MTA >> check for that particular rule hi

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Bowie Bailey
On 8/5/2010 3:00 PM, Matthew Kitchin (public/usenet) wrote: > On 8/5/2010 1:52 PM, Bowie Bailey wrote: >> My approach to doing something like this would be to have a rule that >> matches the names (however you implement it), and then have the MTA >> check for that particular rule hit and bounce t

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
On 8/5/2010 1:52 PM, Bowie Bailey wrote: My approach to doing something like this would be to have a rule that matches the names (however you implement it), and then have the MTA check for that particular rule hit and bounce the message if it exists. This is the same way you generally use the VB

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Bowie Bailey
On 8/5/2010 2:11 PM, Matthew Kitchin (public/usenet) wrote: > > Amavisd could reject the mail. I was planning on using Spamassassin > (with a custom built rule) to examine the email for the names. We > would only use the names of our patients. The names would be dumped > out of our patient DB ev

RE: List of "banned" words/bounce to sender

2010-08-05 Thread Kelly, James
4) 997-6600, or Contact helpd...@chapman.edu. -Original Message- From: Matthew Kitchin (public/usenet) [mailto:mkitchin.pub...@gmail.com] Sent: Thursday, August 05, 2010 11:11 AM To: Spamassassin Subject: Re: List of "banned" words/bounce to sender On 8/5/2010 1:03 PM, Evan Platt wrote:

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
On 8/5/2010 1:19 PM, Benny Pedersen wrote: On tor 05 aug 2010 19:47:37 CEST, "Matthew Kitchin (public/usenet)" wrote Is this a realistic setup? postfix will love it if done right with local smtp auth senders, eg no sender sends unauthed then its just add smtpd_sender_bcc_naps from a list o

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Benny Pedersen
On tor 05 aug 2010 19:47:37 CEST, "Matthew Kitchin (public/usenet)" wrote Is this a realistic setup? postfix will love it if done right with local smtp auth senders, eg no sender sends unauthed then its just add smtpd_sender_bcc_naps from a list of all local recipients just dont make it

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
On 8/5/2010 1:03 PM, Evan Platt wrote: Spamassassin can't handle this - it has no capability to reject mail, however you need to think - are you going to have a database of patients' names, or is your intention to block anything with a "Name"? Are you really going to want to manage a database

Re: List of "banned" words/bounce to sender

2010-08-05 Thread Evan Platt
On 08/05/2010 10:47 AM, Matthew Kitchin (public/usenet) wrote: Hello all. I have been a loyal user for years, but have never had to do much more than make a few custom rules. I work for a healthcare company, and I have been asked to implement a mechanism to search for patient names in outgoin

List of "banned" words/bounce to sender

2010-08-05 Thread Matthew Kitchin (public/usenet)
Hello all. I have been a loyal user for years, but have never had to do much more than make a few custom rules. I work for a healthcare company, and I have been asked to implement a mechanism to search for patient names in outgoing emails and bounce them back to the sender if one is identified