RE: Naming conventions for tests
>> The main problem with this approach is that it requires monitoring of the SpamAssassin tests being applied as the software is updated...
>
> Well, I'd say this is a problem chiefly because whoever _is_ administering the server -- not spamassassin.apache.org -- is clearly not encouraging the use of granular client-side filtering.

While true, it is not necessarily the cause of the problem. I've been using small text strings for my filters to cover a number of SpamAssassin tests, mainly for the convenience of not having to include a separate filter for each individual test. The problem I've had is that while tests that have the same text strings are often similar in nature/function, there are often additional similar tests that have slight variations of these text strings. Including these would require complicating the filter(s) and also require revising the filter with each upgrade. Finding the initial text string to use is also a very unscientific process of collecting test results and examining them manually... rather more of a hack than a procedure (albeit a very effective one).

> If filtering on more than the Spam Score were an expectation from end-to-end, you would have a consistently updated list provided to you by your mail admin, through an intranet portal or whatever. It's a virtual certainty that your mail admin is using rules and metas that don't ship with SA. What would you do about those?

In theory, also true, but that's a whole different can of worms. My aim has been to stay out of the can of worms and simply make better use of what we have. In our case, responsibility for determining what is and isn't spam rests entirely with the user (apart from a conservative server-side rejection of score 15). I'm not going to argue whether that's right or wrong... but it's the situation that we're in, and there is some scope for making things more efficient for the end user.

A consistent naming convention would make it easier and more efficient for end users to filter out certain groups of messages regardless of what the server admins were or weren't doing. Server admins could also use these conventions for any custom filters they created to provide additional improvements.

Cheers
Ben Kreunen
Imaging and IT Coordinator
Department of Pathology
The University of Melbourne
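
The manual step described above (collecting test results and looking for a usable text string) could be made a little less ad hoc with a small script. The following is only a sketch, assuming you have saved the relevant header values from a pile of known spam and a pile of known ham. The function names, the regular expression, and the sample header values are all hypothetical and not part of SpamAssassin.

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumption: test names are upper-case words joined by underscores.
TEST_NAME = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)+")

def token_counts(headers: Iterable[str]) -> Counter:
    """Count, per token, how many header values contain it."""
    counts: Counter = Counter()
    for value in headers:
        tokens = set()
        for name in TEST_NAME.findall(value):
            tokens.update(name.split("_"))
        counts.update(tokens)  # each token counted once per message
    return counts

def candidate_tokens(spam: List[str], ham: List[str], min_spam: int = 2) -> List[str]:
    """Tokens seen in at least min_spam spam headers and in no ham header."""
    spam_counts = token_counts(spam)
    ham_counts = token_counts(ham)
    return sorted(
        token
        for token, n in spam_counts.items()
        if n >= min_spam and ham_counts[token] == 0
    )

# Illustrative header values only, not real mail.
spam = ["URIBL_BLACK,HTML_MESSAGE", "URIBL_SBL,RCVD_IN_XBL"]
ham = ["HTML_MESSAGE,RCVD_IN_BSP_TRUSTED"]
print(candidate_tokens(spam, ham))
```

The output is only a list of candidates to examine by hand. It does not remove the need to re-check the filters after each upgrade.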
RE: Naming conventions for tests
-----Original Message-----
From: Ben Kreunen [mailto:[EMAIL PROTECTED]]
Sent: Monday, May 22, 2006 8:07 PM
To: SPAMAssassin email list
Subject: Naming conventions for tests

Hi All

I've been approaching the problem of filtering spam at the email client end using the SpamAssassin (3.x) header. Our email server (over which I have no control) has a couple of server-side filters that reject emails with infected attachments and messages with a spam score 15. This leaves me with about 100 spam messages per day.

Rather than rely on the numerical value of the X-Spam-Score header, I've been looking at client-side filters using text strings to pick out groups of SpamAssassin tests. Many tests that are similar in nature have common text strings, allowing you to create a filter for a single term that includes a wide range of tests. The effectiveness of this approach could be improved with a better naming scheme for the tests.

The first filter I trialled picks up many tests for blacklisted domains/URLs using two text strings:

    X-Spam-Score contains RCVD_IN OR contains BL_

Unfortunately RCVD_IN also includes some good tests, so I had to split this into two filters:

    X-Spam-Score contains RCVD_IN AND does not contain _IADB_ AND does not contain _BSP_
    X-Spam-Score contains BL_

While these two filters do not cover all blacklist tests (and include other types of tests), they do pick up 90% of spam (for me), with numerical scores down to 0.35. The main problem with this approach is that it requires monitoring of the SpamAssassin tests being applied as the software is updated, to ensure that it doesn't pick up additional tests for good email. On the positive side, the learning aspect of this filter is done by the various blacklists.

If the SpamAssassin tests could be named with more consistent text strings, it would be simpler to set up client-side filters. E.g.

    All tests for blacklists contain _BL_
    All possible porn to start with PORN_

Cheers
Ben Kreunen
Imaging and IT Coordinator
Department of Pathology
The University of Melbourne

Would it not be easier to create meta rules for the rules you are looking for, then simply add more points for those? That's what most of us do. Otherwise you are probably fighting a losing battle trying to get a standard naming scheme. It's a great idea that simply won't get followed.

And it might FP less. I can get lots of Ham that hits PORN_ rules. I have lots of friends with potty mouths :)

Chris Santerre
SysAdmin and SARE/URIBL ninja
http://www.uribl.com
http://www.rulesemporium.com
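
For readers who want to try the two client-side filters from the quoted message outside a mail client, here is a minimal sketch of the same substring logic. It assumes the header value carries the names of the tests that fired, as described in the message. The function name and the sample values are hypothetical.

```python
def hits_blacklist_filters(header_value: str) -> bool:
    """Return True if either of the two substring filters matches."""
    # Filter 1: RCVD_IN tests, skipping messages that hit the "good"
    # _IADB_ or _BSP_ tests.
    rcvd_in = (
        "RCVD_IN" in header_value
        and "_IADB_" not in header_value
        and "_BSP_" not in header_value
    )
    # Filter 2: any test name containing BL_.
    bl = "BL_" in header_value
    return rcvd_in or bl

# Illustrative header values only.
for value in (
    "5.2 RCVD_IN_SORBS_DUL,HTML_MESSAGE",
    "1.1 URIBL_BLACK",
    "-4.3 RCVD_IN_BSP_TRUSTED",
):
    print(value, "->", hits_blacklist_filters(value))
```

As in the mail-client version, filter 1 works on the whole header value, so a message that hits both a blacklist RCVD_IN test and an _IADB_ or _BSP_ test is not caught by it.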
RE: Naming conventions for tests
> Would it not be easier to create meta rules for the rules you are looking for, then simply add more points for those? That's what most of us do. Otherwise you are probably fighting a losing battle trying to get a standard naming scheme. It's a great idea that simply won't get followed.

It would, except that I am working solely at the client end, i.e. I have no direct (or indirect) influence on what happens on the server. From where I stand it's a toss-up as to which organisational change is easier to effect ;-)

> And it might FP less. I can get lots of Ham that hits PORN_ rules. I have lots of friends with potty mouths :)

And that's where working at the client end has its benefits. When incorporating spam filters into standard email filters, users have greater flexibility as to when a filter is applied. They can filter out ham first and then apply a filter to treat the remainder as spam.

Having looked through the emails on this list, it seems that most of the focus is on removing spam at the server, but SpamAssassin also provides users with a useful tool to exercise their own control over what they decide is spam.

Cheers
Ben Kreunen
Imaging and IT Coordinator
Department of Pathology
The University of Melbourne
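
The "filter out ham first, then treat the remainder as spam" ordering mentioned above can be sketched as an ordered rule list where the first match wins. This is an illustration under stated assumptions, not how any particular mail client implements its filters. The address book, the dictionary keys, and the folder names are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]
Rule = Tuple[Callable[[Message], bool], str]

# Hypothetical address book, for illustration only.
KNOWN_SENDERS = {"friend@example.org"}

def from_known_sender(msg: Message) -> bool:
    return msg.get("from", "") in KNOWN_SENDERS

def hits_porn_test(msg: Message) -> bool:
    return "PORN_" in msg.get("x-spam-score", "")

# Order matters: the ham rule runs before the spam rule.
RULES: List[Rule] = [
    (from_known_sender, "inbox"),
    (hits_porn_test, "spam"),
]

def file_message(msg: Message, rules: List[Rule] = RULES) -> str:
    """Return the folder chosen by the first matching rule."""
    for predicate, folder in rules:
        if predicate(msg):
            return folder
    return "inbox"  # nothing matched: leave the message where it is

# A friend with a potty mouth stays in the inbox.
print(file_message({"from": "friend@example.org", "x-spam-score": "3.1 PORN_16"}))
```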
Re: Naming conventions for tests
> The main problem with this approach is that it requires monitoring of the SpamAssassin tests being applied as the software is updated...

Well, I'd say this is a problem chiefly because whoever _is_ administering the server -- not spamassassin.apache.org -- is clearly not encouraging the use of granular client-side filtering.

If filtering on more than the Spam Score were an expectation from end-to-end, you would have a consistently updated list provided to you by your mail admin, through an intranet portal or whatever.

It's a virtual certainty that your mail admin is using rules and metas that don't ship with SA. What would you do about those?

--Sandy