RE: Naming conventions for tests

2006-05-24 Thread Ben Kreunen

  The  main  problem with this approach is that it requires monitoring
  of  the  SPAM  assassin  tests  being  applied  as  the  software is
  updated...
 
 Well,  I'd  say  this  is  a  problem  chiefly  because  whoever  _is_
 administering  the server -- not spamassassin.apache.org -- is clearly
 not encouraging the use of granular client-side filtering.

While true, it is not necessarily the cause of the problem. I've been
using small text strings for my filters to cover a number of SpamAssassin
tests, mainly for the convenience of not having to include a separate filter
for each individual test. The problem I've had is that while tests that have
the same text strings are often similar in nature/function, there are often
additional similar tests that have slight variations of these text strings.
Including these would require complicating the filter(s) and also require
revising the filter with each upgrade. Finding the initial text string to use
is also a very unscientific process of collecting test results and examining
them manually... rather more of a hack than a procedure (albeit a very
effective one).

 If  filtering  on  more  than  the Spam Score were an expectation from
 end-to-end, you would have a consistently updated list provided to you
 by  your  mail  admin,  through an intranet portal or whatever. It's a
 virtual  certainty  that your mail admin is using rules and metas that
 don't ship with SA. What would you do about those?

In theory, also true, but that's a whole different can of worms. My aim has
been to stay out of the can of worms and simply make better use of what we
have. In our case, responsibility for determining what is and isn't spam
rests entirely with the user (apart from a conservative server-side
rejection of score 15). I'm not going to argue whether that's right or
wrong... but it's the situation that we're in, and there is some scope for
making things more efficient for the end user.

A consistent naming convention would make it easier/more efficient for end
users to filter out certain groups of messages regardless of what the server
admins were or weren't doing. Server admins could also use these conventions
for any custom filters they created to provide additional improvements.

Cheers

Ben Kreunen

Imaging and IT Coordinator
Department of Pathology
The University of Melbourne

RE: Naming conventions for tests

2006-05-23 Thread Chris Santerre

-Original Message-
From: Ben Kreunen [mailto:[EMAIL PROTECTED]]
Sent: Monday, May 22, 2006 8:07 PM
To: SPAMAssassin email list
Subject: Naming conventions for tests

Hi All

I've been approaching the problem of filtering spam at the email client end
using the SpamAssassin (3.x) header. Our email server (over which I have no
control) has a couple of server-side filters that reject emails with
infected attachments and messages with a spam score 15. This leaves me
with about 100 spam messages per day.

Rather than rely on the numerical value of the X-Spam-Score header I've been
looking at client side filters using text strings to pick out groups of
SpamAssassin tests. Many tests that are similar in nature have common text
strings, allowing you to create a filter for a single term that includes a
wide number of tests. The effectiveness of this approach could be improved
with a better naming scheme for the tests.

The first filter I trialled picks up many tests for blacklisted domains/urls
using two text strings:
X-Spam-Score contains RCVD_IN OR contains BL_

Unfortunately RCVD_IN also includes some good tests so I had to split
this into two filters:
X-Spam-Score contains RCVD_IN AND does not contain _IADB_ AND does not contain _BSP_
X-Spam-Score contains BL_

While these two filters do not cover all blacklist tests (and include other
types of tests) they do pick up 90% of spam (for me), with numerical scores
down to 0.35. The main problem with this approach is that it requires
monitoring of the SPAM assassin tests being applied as the software is
updated to ensure that it doesn't pick up additional tests for good email.
On the positive side, the learning aspect of this filter is done by the
various blacklists.
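
As a rough sketch only, the two filters above could be written as plain substring checks. It assumes the test names arrive as one string taken from the X-Spam-Score header, as described in the message. The function name and the sample header values are made up for illustration, and a real mail client would express this in its own filter dialog rather than in Python.

```python
def is_blacklist_hit(header_value: str) -> bool:
    """Return True if either of the two client-side filters would match.

    header_value is assumed to be the text of the header that lists the
    SpamAssassin test names (X-Spam-Score in the message above).
    """
    # Filter 1: contains RCVD_IN AND does not contain _IADB_
    #           AND does not contain _BSP_
    rcvd_filter = (
        "RCVD_IN" in header_value
        and "_IADB_" not in header_value
        and "_BSP_" not in header_value
    )
    # Filter 2: contains BL_
    bl_filter = "BL_" in header_value
    return rcvd_filter or bl_filter


# Illustrative header values; the exact layout depends on the server.
samples = [
    "tests=RCVD_IN_XBL,HTML_MESSAGE",
    "tests=RCVD_IN_IADB_VOUCHED",
    "tests=URIBL_BLACK",
    "tests=HTML_MESSAGE,BAYES_50",
]
for value in samples:
    print(value, "->", is_blacklist_hit(value))
```

Note that, exactly as in the filter text, a message hitting both a blacklist RCVD_IN test and an _IADB_ or _BSP_ test is skipped by the first filter, because the exclusion applies to the whole header value and not to individual test names.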

If the SpamAssassin tests could be named with more consistent text strings it
would be simpler to set up client side filters.
E.g.
All tests for blacklists contain _BL_
All possible porn to start with PORN_
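
To illustrate what such a convention would buy, here is a minimal sketch that sorts test names into groups using only the two example rules above. The test names in it (RCVD_BL_EXAMPLE, PORN_EXAMPLE) are hypothetical names that follow the proposed convention; they are not existing SpamAssassin tests.

```python
def categorize(test_name: str) -> str:
    """Group a test name under the proposed naming convention."""
    # Proposed rule: all possible porn tests start with PORN_
    if test_name.startswith("PORN_"):
        return "porn"
    # Proposed rule: all blacklist tests contain _BL_
    if "_BL_" in test_name:
        return "blacklist"
    return "other"


# Hypothetical names that follow the convention, plus one unrelated test.
for name in ["RCVD_BL_EXAMPLE", "PORN_EXAMPLE", "HTML_MESSAGE"]:
    print(name, "->", categorize(name))
```

With names like these, one substring per group would be enough, and no exclusion list would need revising after an upgrade.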

Cheers

Ben Kreunen

Imaging and IT Coordinator
Department of Pathology
The University of Melbourne

Would it not be easier to create meta rules for the rules you are looking for, then simply add more points for those? That's what most of us do. Otherwise you are probably fighting a losing battle trying to get a standard naming scheme. It's a great idea that simply won't get followed.

And it might FP less. I can get lots of Ham that hits PORN_ rules. I have lots of friends with potty mouths :)

Chris Santerre
SysAdmin and SARE/URIBL ninja
http://www.uribl.com
http://www.rulesemporium.com

RE: Naming conventions for tests

2006-05-23 Thread Ben Kreunen

 
 Would it not be easier to create meta rules for the rules you
 are looking for, then simply add more points for those? That's
 what most of us do. Otherwise you are probably fighting a losing
 battle trying to get a standard naming scheme. It's a great
 idea that simply won't get followed.

It would, except that I am working solely at the client end, i.e. I have no
direct (or indirect) influence on what happens on the server. From where I
stand it's a toss-up as to which organisational change is easier to effect
;-)
 
 And it might FP less. I can get lots of Ham that hits PORN_ 
 rules. I have lots of friends with potty mouths :) 

And that's where working at the client end has its benefits. When
incorporating spam filters into standard email filters, users have greater
flexibility as to when a filter is applied. They can filter out ham first
and then apply a filter to treat the remainder as spam.
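
The ordering described here can be sketched as a first-match-wins rule list, with the ham rule placed ahead of the spam rule. Everything named below (the folder names, the KNOWN_SENDERS set, the PORN_EXAMPLE test name and the addresses) is a made-up assumption for illustration; a real mail client would configure this through its own filter settings.

```python
from typing import Callable, Dict, List, Tuple

Headers = Dict[str, str]
Rule = Tuple[str, Callable[[Headers], bool]]

# Hypothetical list of senders the user trusts.
KNOWN_SENDERS = {"friend@example.org"}


def from_known_sender(headers: Headers) -> bool:
    return headers.get("From", "") in KNOWN_SENDERS


def hits_porn_test(headers: Headers) -> bool:
    # Assumes the test names are listed in the X-Spam-Score header.
    return "PORN_" in headers.get("X-Spam-Score", "")


# Order matters: the ham rule runs before the spam rule.
RULES: List[Rule] = [
    ("Friends", from_known_sender),
    ("Junk", hits_porn_test),
]


def route(headers: Headers, rules: List[Rule] = RULES) -> str:
    """Return the folder of the first matching rule, else the inbox."""
    for folder, matches in rules:
        if matches(headers):
            return folder
    return "Inbox"


print(route({"From": "friend@example.org", "X-Spam-Score": "PORN_EXAMPLE"}))
print(route({"From": "stranger@example.com", "X-Spam-Score": "PORN_EXAMPLE"}))
```

A message from a known sender is filed before the PORN_ check is ever reached, which is how the potty-mouthed friends stay out of the junk folder.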

Having looked through the emails on this list it seems that most of the
focus is on removing spam at the server, but SpamAssassin also provides
users with a useful tool to exercise their own control over what they decide
is spam.

Cheers

Ben Kreunen

Imaging and IT Coordinator
Department of Pathology
The University of Melbourne

Re: Naming conventions for tests

2006-05-23 Thread Sanford Whiteman

 The  main  problem with this approach is that it requires monitoring
 of  the  SPAM  assassin  tests  being  applied  as  the  software is
 updated...

Well,  I'd  say  this  is  a  problem  chiefly  because  whoever  _is_
administering  the server -- not spamassassin.apache.org -- is clearly
not encouraging the use of granular client-side filtering.

If  filtering  on  more  than  the Spam Score were an expectation from
end-to-end, you would have a consistently updated list provided to you
by  your  mail  admin,  through an intranet portal or whatever. It's a
virtual  certainty  that your mail admin is using rules and metas that
don't ship with SA. What would you do about those?

--Sandy
