Programming Email Filters

Dave Cross Mon, 01 Sep 2003 21:13:20 +0000

Like (I guess) many people round here I'm getting Too Much Email
that I don't want to read. So I'm doing something about it. Using
Perl. And no, of course, no-one else's solution will work just
how I want it to work so I'm in full wheel-reinvention mode.


The biggest problem I'm currently having is bounces from spam
that has fake addresses in my domains as the 'From:' header,
so that's the problem I'm addressing first. I have a finite number
of email addresses that I want to read email for and any email
addressed to a different address will be filtered off to a different
folder for later investigation.

I hacked up something that identified the emails I want to filter
using Email::Simple, but I'd appreciate some input on what I've
done. There are three areas I need help on.

Problem the first.

The first problem is identifying why the email ended up in my
inbox. I need to work out which of the many email addresses in
the many headers is aimed at me. Here's the algorithm I'm using.

1/ If there's an 'Envelope-to' header then use that and stop
looking.

2/ Otherwise look for email in 'To' and 'Cc'. If something likely
is found then stop looking.

3/ Otherwise start poking around in the 'Received' headers.

Can anyone see a problem with that?
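
For anyone who wants something concrete to pick holes in, the three
steps could be sketched roughly like this. It is Python rather than
Perl, purely so the sketch needs nothing beyond a standard library.
MY_DOMAINS, is_mine and find_recipient are made-up names, the domains
are placeholders, and the "for <address>" pattern is only an assumption
about what the useful part of a 'Received' header looks like.

```python
import re
from email import message_from_string
from email.utils import getaddresses

# Hypothetical placeholder domains standing in for "my domains".
MY_DOMAINS = {"example.org", "example.net"}


def is_mine(addr):
    """True if addr's domain is one of the (assumed) local domains."""
    return addr.rpartition("@")[2].lower() in MY_DOMAINS


def find_recipient(raw):
    """Return the address this message was probably delivered for, or None."""
    msg = message_from_string(raw)

    # 1/ If there is an Envelope-to header, use it and stop looking.
    for _name, addr in getaddresses(msg.get_all("Envelope-to", [])):
        if addr:
            return addr.lower()

    # 2/ Otherwise take the first likely-looking address in To and Cc.
    listed = msg.get_all("To", []) + msg.get_all("Cc", [])
    for _name, addr in getaddresses(listed):
        if is_mine(addr):
            return addr.lower()

    # 3/ Otherwise poke around in Received headers for "for <address>".
    #    This pattern is an assumption; not every MTA writes that clause.
    for received in msg.get_all("Received", []):
        match = re.search(r"\bfor\s+<?([^\s<>;]+@[^\s<>;]+)>?", received)
        if match and is_mine(match.group(1)):
            return match.group(1).lower()

    return None
```

One weakness this makes visible: step 3 depends entirely on the
receiving server having recorded a "for" clause, so a None result
has to be treated as "unknown" rather than "not for me".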


Problem the second.

I'm using the Email::* modules but there doesn't seem to be a
way to extract the actual deliverable email address from the
headers. For example, from

"Dave Cross" <[EMAIL PROTECTED]>

I need [EMAIL PROTECTED]. Currently I'm using a nasty regex hack.
Bear in mind that some nasty email clients do stupid things like

"[EMAIL PROTECTED]" <[EMAIL PROTECTED]>

or even

"'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>

Email parsing is listed as "for a future release" in Regexp::Common.
I could use Mail::Address, but I don't really want to install
Mail-Tools given that Email::* replaces most of it.

Any other suggestions? Should I just write Email::Address and
submit it to the Email::* project?
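
To show the behaviour I'm after, here is a small sketch in Python
(not Perl), leaning on the standard library's email.utils.getaddresses
to do the parsing. The function name deliverable_addresses and the
example.org addresses are invented for illustration, since the real
addresses above are masked. A Perl module would need to do the
equivalent of what getaddresses does here.

```python
from email.utils import getaddresses


def deliverable_addresses(header_value):
    """Return only the deliverable addresses from one address header.

    Display names are discarded, however they are quoted.
    """
    pairs = getaddresses([header_value])
    return [addr for _name, addr in pairs if addr]


# Placeholder samples mirroring the three shapes described above.
samples = [
    '"Dave Cross" <dave@example.org>',
    '"dave@example.org" <dave@example.org>',
    "\"'dave@example.org'\" <dave@example.org>",
]

for sample in samples:
    print(sample, "->", deliverable_addresses(sample))
```

The point of returning a list is that 'To' and 'Cc' can each carry
several comma-separated addresses, which is where a simple regex
tends to fall over.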


Problem the third.

I use procmail for my mail filtering, so I need to plug this
new filter in with that. The processing I need is:

if email is to a dodgy address then
   save it in 'notme' mailbox
else
   continue with normal procmail processing
end

But my procmail powers are weak and I can't work out how to do
it. I see that spamassassin adds a new header to the email and
my next procmail rule does the filtering. That would be an acceptable
solution if I can't do it all in one rule.
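
One shape this might take is a filter that reports its verdict through
its exit status, which procmail can test with a "? command" condition.
The sketch below is Python rather than Perl and is only a guess at the
plumbing: WANTED, is_notme and exit_status are hypothetical names, the
addresses and script path are placeholders, and the recipe in the
comment is untested.

```python
import sys
from email import message_from_string
from email.utils import getaddresses

# Hypothetical list of the addresses I actually read mail for.
WANTED = {"dave@example.org", "webmaster@example.org"}


def is_notme(raw, wanted=WANTED):
    """True when none of the recipient headers names a wanted address.

    A message with no recipient headers at all also counts as "not me".
    """
    msg = message_from_string(raw)
    values = []
    for name in ("Envelope-to", "To", "Cc"):
        values += msg.get_all(name, [])
    found = {addr.lower() for _name, addr in getaddresses(values) if addr}
    return not (found & wanted)


def exit_status(raw):
    """Map the verdict to a shell-style status: 0 means "dodgy address"."""
    return 0 if is_notme(raw) else 1


# As a script, the last line would be something like:
#
#   sys.exit(exit_status(sys.stdin.read()))
#
# and the (assumed, untested) recipe, with a made-up script path:
#
#   :0
#   * ? /home/dave/bin/notme-check
#   notme
#
# Mail that does not match falls through to the rest of the rc file.
```

The alternative is the spamassassin approach: have the filter add a
header of its own and let a second, ordinary recipe match on it.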


Any help and advice on all of these issues much appreciated.

Dave...

-- 
<http://www.dave.org.uk>

"Let me see you make decisions, without your television"
   - Depeche Mode (Stripped)
