Re: Scanning outgoing email

2005-09-15 Thread Justin Mason

Rune Kristian Viken writes:
> On Wednesday 14 September 2005 18:34, Bret Miller wrote:
> >> We're in the need of checking parts of our outgoing email for
> >> spam (read: we've got unknown webmail users.. huge lots of them,
> >> actually.. and some of them have this annoying habit of sending
> >> nigeria spam) 
> >> 
> >> [considering network tests useless, Bayes excellent, but feels the 
> >> default weighting may be useless] 
> >>
> >> How do we re-weight the rules, and does anyone have any good
> >> suggestions on which checks to use?  Also, checking for certain
> >> blacklisted URLs in the messages will probably help (Someone recommended
> >> SURBL for  this) .. but I think a re-weighting will still be in order.
> >
> > I'd be inclined to try the SARE fraud rules (see www.rulesemporium.com)
> > in addition to the SA internal and bayes tests. 
> 
> Excellent suggestion!  I think we'll try those.  
> 
> > If you find that doesn't give you a high enough score, pushing the
> > BAYES_99 score a little higher might be in order.
> 
> That was what I was thinking about.  Others have mentioned local.cf, which 
> of course is a good thing (and we've already looked at that, it's covered 
> quite well in the docs).  What I was thinking was using the 
> 'masses/corpus'-things to generate our own weightings, trying to tune 
> SpamAssassin for our particular use-case.  Not sure if they're meant for 
> that, though - and very unsure on how to do that. I've not been able to dig 
> that up through the docs. If it's a bad idea - please do not hesitate to 
> point it out. 
> 
> Also, David B Funk suggested using -L , indicating "No network tests".  As 
> mentioned, I'm considering using SURBL.  Is it possible to still use SURBL 
> with -L ?  The docs say this is "Use local tests only (no DNS)" and that 
> seems to be off the mark.

I think you *do* want to use SURBL, in which case -L would not be
recommended.

One possible thing to do is collect some data, namely:

  - a selection of "good" nonspam outgoing mail
  - a selection of "bad" outgoing spam attempts

If you can do this, you can then build a corpus of mails to test against
and manually tweak scores.  I don't think you need to go to the bother
of generating an entirely new score-set, it should be possible to do this
with just a little manual tweaking.

Bayes will definitely be helpful, too, and that corpus will provide
training data.
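
As a rough illustration of the "test against a corpus and manually tweak scores" idea, here is a minimal Python sketch. It is not a SpamAssassin tool: the rule hits per message, the rule name HYPOTHETICAL_FRAUD_RULE, all score values, and the 5.0 threshold are made-up assumptions. It only shows how counting misclassifications before and after a score change could guide the tweaking.

```python
# Sketch only: hits, scores, and threshold below are illustrative assumptions,
# not real SpamAssassin defaults or output.
THRESHOLD = 5.0  # assumed "required score" for calling a message spam


def total(hits, scores):
    """Sum the scores of the rules that hit one message."""
    return sum(scores.get(rule, 0.0) for rule in hits)


def error_counts(ham, spam, scores, threshold=THRESHOLD):
    """Return (false_positives, false_negatives) for a corpus.

    `ham` and `spam` are lists of messages, each message being the list
    of rule names that hit it.
    """
    false_pos = sum(1 for hits in ham if total(hits, scores) >= threshold)
    false_neg = sum(1 for hits in spam if total(hits, scores) < threshold)
    return false_pos, false_neg


# Hypothetical corpus: two good outgoing mails, two spam attempts.
ham = [[], ["HYPOTHETICAL_FRAUD_RULE"]]
spam = [["BAYES_99"], ["BAYES_99", "HYPOTHETICAL_FRAUD_RULE"]]

scores = {"BAYES_99": 3.5, "HYPOTHETICAL_FRAUD_RULE": 1.0}
before = error_counts(ham, spam, scores)

# Manual tweak: push BAYES_99 a little higher and re-check the corpus.
tweaked = dict(scores, BAYES_99=4.5)
after = error_counts(ham, spam, tweaked)
```

Comparing `before` and `after` shows whether a tweak catches more of the spam without flagging any of the good mail. In a real setup the rule hits would come from running SpamAssassin over the collected mail rather than from a hand-written list.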

--j.



Re: Scanning outgoing email

2005-09-15 Thread Rune Kristian Viken
On Wednesday 14 September 2005 18:34, Bret Miller wrote:
>> We're in the need of checking parts of our outgoing email for
>> spam (read: we've got unknown webmail users.. huge lots of them,
>> actually.. and some of them have this annoying habit of sending
>> nigeria spam) 
>> 
>> [considering network tests useless, Bayes excellent, but feels the 
>> default weighting may be useless] 
>>
>> How do we re-weight the rules, and does anyone have any good
>> suggestions on which checks to use?  Also, checking for certain
>> blacklisted URLs in the messages will probably help (Someone recommended
>> SURBL for  this) .. but I think a re-weighting will still be in order.
>
> I'd be inclined to try the SARE fraud rules (see www.rulesemporium.com)
> in addition to the SA internal and bayes tests. 

Excellent suggestion!  I think we'll try those.  

> If you find that doesn't give you a high enough score, pushing the
> BAYES_99 score a little higher might be in order.

That was what I was thinking about.  Others have mentioned local.cf, which 
of course is a good thing (and we've already looked at that, it's covered 
quite well in the docs).  What I was thinking was using the 
'masses/corpus'-things to generate our own weightings, trying to tune 
SpamAssassin for our particular use-case.  Not sure if they're meant for 
that, though - and very unsure on how to do that. I've not been able to dig 
that up through the docs. If it's a bad idea - please do not hesitate to 
point it out. 


Also, David B Funk suggested using -L , indicating "No network tests".  As 
mentioned, I'm considering using SURBL.  Is it possible to still use SURBL 
with -L ?  The docs say this is "Use local tests only (no DNS)" and that 
seems to be off the mark.



-- 
Rune Kristian Viken
Basefarm AS
Tlf: (+47) 98 28 28 41


Re: Scanning outgoing email

2005-09-14 Thread David B Funk
On Wed, 14 Sep 2005, Rune Kristian Viken wrote:

> We're in the need of checking parts of our outgoing email for spam (read:
> we've got unknown webmail users.. huge lots of them, actually.. and some of
> them have this annoying habit of sending nigeria spam)
>
> My question is how to get SpamAssassin to identify the spam, as the network
> tests will be quite useless (all the email will be originating in a
> standard format, from our own servers).  Bayes will probably be quite
> efficient, and so will various other local checks - but I have this nagging
> feeling that the standard weighting of the rules will be too lax in this
> use-case (due to nothing but content-checks triggering).
>
> How do we re-weight the rules, and does anyone have any good suggestions on
> which checks to use?  Also, checking for certain blacklisted URLs in the
> messages will probably help (Someone recommended SURBL for this) .. but I
> think a re-weighting will still be in order.
>
> Suggestions?

Set up a separate instance of spamd that will be used just for
scanning your outgoing mail (obviously this will have to be done with
your local system configuration). Run that spamd with the '-L' option to
disable network checks. One effect of doing that is to cause SA to
choose an alternative scoring set that has been weighted for use in
a no-networks-test environment. See the discussion of the 4-part
'score' values in Mail::SpamAssassin::Conf.
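
To make the 4-part 'score' idea concrete, here is a small Python sketch of how one of four values could be selected. It follows the ordering described in the Mail::SpamAssassin::Conf documentation as I understand it (no Bayes/no network, network only, Bayes only, both); please verify against the docs for your version. The rule name EXAMPLE_RULE and its values are made up, and the helper functions are illustrative, not SpamAssassin code.

```python
# Sketch only: mimics the score-set selection described in the Conf docs.
# Helper names and the example rule are hypothetical.

def parse_score_line(line):
    """Split 'score NAME n.nn [n.nn n.nn n.nn]' into (name, [floats])."""
    parts = line.split()
    if len(parts) < 3 or parts[0] != "score":
        raise ValueError("not a score line: %r" % line)
    return parts[1], [float(p) for p in parts[2:]]


def pick_score(values, use_bayes, use_net):
    """Pick the active score from a one-value or four-value score list."""
    if len(values) == 1:
        return values[0]  # a single score applies in every mode
    if len(values) != 4:
        raise ValueError("this sketch handles only one or four scores")
    # Score set index: 0 = no Bayes, no network; 1 = network only;
    # 2 = Bayes only; 3 = Bayes and network.
    index = (2 if use_bayes else 0) + (1 if use_net else 0)
    return values[index]


# Made-up rule and values, for illustration only.
name, values = parse_score_line("score EXAMPLE_RULE 1.0 2.0 3.0 4.0")

# With '-L' (no network tests) and Bayes enabled, the third value applies.
local_with_bayes = pick_score(values, use_bayes=True, use_net=False)
```

So a spamd running with '-L' and a trained Bayes database would, under these assumptions, use the third value of each four-part score line.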

Dave

-- 
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_admin    Iowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{


Re: Scanning outgoing email

2005-09-14 Thread jdow

From: "Bret Miller" <[EMAIL PROTECTED]>

>> We're in the need of checking parts of our outgoing email for spam (read:
>> we've got unknown webmail users.. huge lots of them, actually.. and some of
>> them have this annoying habit of sending nigeria spam)
>>
>> My question is how to get SpamAssassin to identify the spam, as the network
>> tests will be quite useless (all the email will be originating in a
>> standard format, from our own servers).  Bayes will probably be quite
>> efficient, and so will various other local checks - but I have this nagging
>> feeling that the standard weighting of the rules will be too lax in this
>> use-case (due to nothing but content-checks triggering).
>>
>> How do we re-weight the rules, and does anyone have any good suggestions on
>> which checks to use?  Also, checking for certain blacklisted URLs in the
>> messages will probably help (Someone recommended SURBL for this) .. but I
>> think a re-weighting will still be in order.
>>
>> Suggestions?
>
> I'd be inclined to try the SARE fraud rules (see www.rulesemporium.com)
> in addition to the SA internal and bayes tests. If you find that doesn't
> give you a high enough score, pushing the BAYES_99 score a little higher
> might be in order.
>
> Bret

+
Another good technique is to count the number of addresses for message
receipt or the number of messages the user has sent and throttle based
on "too many". For "Way Too Many" throttle back to one message every
five minutes.
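
A minimal Python sketch of that kind of throttle follows, under stated assumptions: the one-hour counting window, the two recipient thresholds, and the 30-second gap for the "too many" tier are invented for illustration. Only the five-minute gap for "Way Too Many" comes from the suggestion above. The class name OutgoingThrottle is hypothetical, and a real deployment would hook this into the webmail or MTA layer.

```python
from collections import defaultdict, deque

# Assumed tuning values; only the five-minute gap comes from the text above.
WINDOW = 3600        # seconds of history counted per user
TOO_MANY = 50        # recipients in the window before slowing down
WAY_TOO_MANY = 200   # recipients in the window before heavy throttling
MILD_GAP = 30        # assumed gap (seconds) between messages when "too many"
SLOW_GAP = 300       # one message every five minutes when "way too many"


class OutgoingThrottle:
    """Per-user throttle based on recent recipient counts."""

    def __init__(self):
        self._events = defaultdict(deque)  # user -> deque of (time, recipients)
        self._last_sent = {}               # user -> time of last accepted message

    def _recent_recipients(self, user, now):
        events = self._events[user]
        # Drop events that have fallen out of the counting window.
        while events and events[0][0] <= now - WINDOW:
            events.popleft()
        return sum(n for _, n in events)

    def allow(self, user, n_recipients, now):
        """Return True if the message may be sent at time `now` (seconds)."""
        count = self._recent_recipients(user, now)
        if count >= WAY_TOO_MANY:
            gap = SLOW_GAP
        elif count >= TOO_MANY:
            gap = MILD_GAP
        else:
            gap = 0
        last = self._last_sent.get(user)
        if last is not None and now - last < gap:
            return False  # too soon after the previous accepted message
        self._events[user].append((now, n_recipients))
        self._last_sent[user] = now
        return True


throttle = OutgoingThrottle()
first = throttle.allow("user@example.com", 250, now=0)      # big mailing goes out
too_soon = throttle.allow("user@example.com", 1, now=10)    # now heavily throttled
later = throttle.allow("user@example.com", 1, now=300)      # five minutes later
```

Passing `now` explicitly keeps the sketch easy to test; in practice it would come from a clock such as `time.time()`.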

{^_^}


RE: Scanning outgoing email

2005-09-14 Thread Bret Miller

> We're in the need of checking parts of our outgoing email for spam (read:
> we've got unknown webmail users.. huge lots of them, actually.. and some of
> them have this annoying habit of sending nigeria spam)
>
> My question is how to get SpamAssassin to identify the spam, as the network
> tests will be quite useless (all the email will be originating in a
> standard format, from our own servers).  Bayes will probably be quite
> efficient, and so will various other local checks - but I have this nagging
> feeling that the standard weighting of the rules will be too lax in this
> use-case (due to nothing but content-checks triggering).
>
> How do we re-weight the rules, and does anyone have any good suggestions on
> which checks to use?  Also, checking for certain blacklisted URLs in the
> messages will probably help (Someone recommended SURBL for this) .. but I
> think a re-weighting will still be in order.
>
> Suggestions?

I'd be inclined to try the SARE fraud rules (see www.rulesemporium.com)
in addition to the SA internal and bayes tests. If you find that doesn't
give you a high enough score, pushing the BAYES_99 score a little higher
might be in order.

Bret





Scanning outgoing email

2005-09-14 Thread Rune Kristian Viken

We're in the need of checking parts of our outgoing email for spam (read: 
we've got unknown webmail users.. huge lots of them, actually.. and some of 
them have this annoying habit of sending nigeria spam)

My question is how to get SpamAssassin to identify the spam, as the network 
tests will be quite useless (all the email will be originating in a 
standard format, from our own servers).  Bayes will probably be quite 
efficient, and so will various other local checks - but I have this nagging 
feeling that the standard weighting of the rules will be too lax in this 
use-case (due to nothing but content-checks triggering).

How do we re-weight the rules, and does anyone have any good suggestions on 
which checks to use?  Also, checking for certain blacklisted URLs in the 
messages will probably help (Someone recommended SURBL for this) .. but I 
think a re-weighting will still be in order.

Suggestions?

-- 
Rune Kristian Viken
Tlf: (+47) 98 28 28 41