Body-only checks?

2006-12-24 Thread Garry Glendown
Hi,

for a project, I was wondering whether I could use SA's Bayes methods and
rules to recognize spam ... the catch is that it would be body-only checks,
so no email headers etc., and I would only want a return code that
represents the calculated spam score ... is there any way to do that?
Also, I would need a personal database that does not interfere with the
system-wide Bayes database ...

Couldn't find anything appropriate in the SA docs ... !?
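
One way the request above might be approached, as a rough sketch only: prefix the body with an empty header block and pass it to `spamc -c`, which prints `score/threshold` and exits with 1 when the message is classed as spam. The helper names (`wrap_body`, `parse_check_output`, `score_body`) are made up for illustration. Two assumptions apply: spamd is running, and it is set up so that `-u` selects a per-user configuration with its own Bayes database. Header-based rules may still fire on the missing headers and shift the score.

```python
import subprocess


def wrap_body(body: str) -> bytes:
    """Prefix the body with an empty header block so it parses as a mail message."""
    # A blank first line ends the (empty) header section.
    return b"\n" + body.encode("utf-8")


def parse_check_output(output: str) -> tuple[float, float]:
    """Split the 'score/threshold' line that `spamc -c` prints."""
    score, threshold = output.strip().split("/", 1)
    return float(score), float(threshold)


def score_body(body: str, user: str) -> tuple[float, float]:
    """Run one body through spamd as `user` (hypothetical helper; needs a running spamd)."""
    result = subprocess.run(
        ["spamc", "-c", "-u", user],
        input=wrap_body(body),
        capture_output=True,
        check=False,  # spamc -c exits with 1 for spam, so non-zero is not an error
    )
    return parse_check_output(result.stdout.decode("ascii"))


# Parsing alone can be tried without spamd installed:
print(parse_check_output("7.3/5.0\n"))
```
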

Tnx & merry Christmas,

-garry




SA filter load: massive increase

2006-11-07 Thread Garry Glendown
Hi,

after fixing some lint errors that had gone unnoticed for some time, our
MailScanner/SA filter server has started bogging down under the daily flood
of mail (~100k mails per day) - a load that had not troubled the
box before ... As the only change had been fixing the lint error,
followed by an RDJ update, I suspect one or more of the rules have
caused the load increase ... here's the list of rulesets I use:

TRUSTED_RULESETS=SARE_REDIRECT_POST300 SARE_EVILNUMBERS2
SARE_BAYES_POISON_NXM SARE_HTML0 SARE_HTML1 SARE_HTML2 SARE_HTML3
SARE_HTML0 SARE_HTML1 SARE_HTML2 SARE_HTML3 SARE_SPECIFIC SARE_ADULT
SARE_BML SARE_FRAUD SARE_SPOOF SARE_RANDOM SARE_SPAMCOP_TOP200 SARE_OEM
SARE_GENLSUBJ0 SARE_GENLSUBJ1 SARE_GENLSUBJ2 SARE_GENLSUBJ3  SARE_UNSUB
SARE_URI0 SARE_URI1 SARE_URI3 SARE_WHITELIST_SPF SARE_WHITELIST_RCVD
SARE_OBFU SARE_STOCKS EVILNUMBERS SARE_ADULT SARE_BAYES_POISON_NXM
SARE_BML SARE_CODING SARE_FRAUD SARE_HEADER SARE_OEM SARE_RANDOM
SARE_REDIRECT_POST300 SARE_SPECIFIC SARE_SPOOF TRIPWIRE ZMI_GERMAN;
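
The list as posted names several rulesets more than once (SARE_HTML0 to SARE_HTML3, for instance). Whether RDJ cares about repeats is not established here, so the following is only a small sketch for spotting them; the function name `duplicate_rulesets` and the shortened sample line are illustrative.

```python
from collections import Counter


def duplicate_rulesets(setting: str) -> list[str]:
    """Return ruleset names that occur more than once, in first-seen order."""
    # Drop the variable name, the trailing semicolon, and any quotes.
    value = setting.split("=", 1)[-1].replace(";", " ").replace('"', " ")
    names = value.split()
    counts = Counter(names)
    repeated = []
    for name in names:
        if counts[name] > 1 and name not in repeated:
            repeated.append(name)
    return repeated


# Shortened sample, not the full list from the post.
sample = "TRUSTED_RULESETS=SARE_HTML0 SARE_ADULT SARE_HTML0 SARE_OBFU SARE_ADULT;"
print(duplicate_rulesets(sample))  # ['SARE_HTML0', 'SARE_ADULT']
```
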

Anything that could cause massive backlog and should be dropped?

Thanks!

-garry


Re: SA filter load: massive increase

2006-11-07 Thread Garry Glendown
Matt Kettler wrote:
> In general I'd take a look at the sizes of the rule files themselves.
> Look for ones that are significantly larger than 128k or so.

Of those, there are only a few:

-rw-r--r--  1 root root 384645 Oct 30  2005 70_sare_header.cf
-rw-r--r--  1 root root 158513 Oct  1  2005 70_sare_obfu.cf

Given that both are significantly older than the onset of the
performance drop, neither should be the cause ... in fact, the only
sare rules with dates newer than Oct 1st are sare_stocks and
sc_top200 ...
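
The size-and-date check described above can be sketched as a short script. The 128k threshold comes from the quoted advice; the directory path and the name `large_rule_files` are assumptions for illustration, so adjust them to wherever the rule files actually live.

```python
import os
import time


def large_rule_files(directory, threshold=128 * 1024):
    """Return (name, size, mtime) for .cf files larger than `threshold` bytes."""
    hits = []
    for entry in os.scandir(directory):
        if entry.is_file() and entry.name.endswith(".cf"):
            info = entry.stat()
            if info.st_size > threshold:
                hits.append((entry.name, info.st_size, info.st_mtime))
    # Largest files first.
    return sorted(hits, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    rules_dir = "/etc/mail/spamassassin"  # assumed location, not taken from the thread
    if os.path.isdir(rules_dir):
        for name, size, mtime in large_rule_files(rules_dir):
            stamp = time.strftime("%Y-%m-%d", time.localtime(mtime))
            print(f"{size:>9}  {stamp}  {name}")
```
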

-gg



weight loss spam

2006-10-12 Thread Garry Glendown

Hi all,

the last couple of days have brought a lot of new, untagged weight loss
spam (Anatrim), much of it scoring around 3.x, though a few are tagged
either due to the sender's stupidity (date in the future) or some DNS
blacklisting. Has anybody updated any rules yet to catch this?


Tnx, -gg


Re: Re: Hi spam

2006-10-08 Thread Garry Glendown
Daryl C. W. O'Shea wrote:
> Kenneth Porter wrote:
>> I noticed today an unusually high incidence of spam subject lines of
>> Re: Hi, and I don't see a rule for this in the distribution. Do
>> others see this much in legitimate mail? Or could it make a good rule?
>
> I see enough legit mail with such a subject go through my systems that
> would make the rule useless, at least for my users.

True, the subject is a bad idea, but the content is pretty consistent:
most have something like Vragra in large type, a link, and some random
text fragment at the end ...

Some of the Hi-mails are already tagged by our filter (I'd say
somewhere beyond 80%), though some still get through untagged:

HTML_MESSAGE 0.00, URIBL_SBL 1.09, URIBL_WS_SURBL 1.53)

HTML_MESSAGE 0.00, RCVD_IN_BL_SPAMCOP_NET 1.33, URIBL_SBL 1.09,
URIBL_WS_SURBL 1.53)

HTML_MESSAGE 0.00, RCVD_IN_BL_SPAMCOP_NET 1.33, URIBL_SBL 1.09,
URIBL_WS_SURBL 1.53)
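
To make the totals behind the report fragments above explicit, here is a small sketch that sums the per-rule scores. The regular expression and the name `total_score` are illustrative, and the fragments are copied as posted (their beginning is cut off). The tagging threshold in use is not stated in this post.

```python
import re

# A rule name followed by its score, e.g. "URIBL_SBL 1.09".
RULE_SCORE = re.compile(r"([A-Z][A-Z0-9_]*) (-?\d+\.\d+)")


def total_score(report: str) -> float:
    """Sum all rule scores found in a report fragment, rounded to 2 places."""
    return round(sum(float(score) for _, score in RULE_SCORE.findall(report)), 2)


first = "HTML_MESSAGE 0.00, URIBL_SBL 1.09, URIBL_WS_SURBL 1.53)"
second = (
    "HTML_MESSAGE 0.00, RCVD_IN_BL_SPAMCOP_NET 1.33, URIBL_SBL 1.09, "
    "URIBL_WS_SURBL 1.53)"
)
print(total_score(first), total_score(second))  # 2.62 3.95
```
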

I wonder whether an update to the rules-emporium config is coming some
time soon !?

-gg


Special rules ...

2005-10-08 Thread Garry Glendown
I've run into a bit of a problem at a customer installation; someone
suggested part of my problem could be solved w/ SpamAssassin, though at
the moment it might still lack some of the required features ...

Here we go ... This customer previously had (and is still in the process of
changing over from) Novell w/ Tobit David. While the whole system might
be a POS compared to a decent Unix system :) it had some features that
come in handy - specifically, the customer had been able to define what
happened with certain mails. Before, he was able to:

- quarantine large files for admin approval
- quarantine certain file types for admin approval
- limit number of recipients, mails exceeding the number would be
quarantined again

plus a couple of other minor things that I could implement easily
w/ MailScanner or similar tools. Now, I could limit the recipients, but
it's an all-or-nothing situation at the moment (running sendmail, which I
would rather not change if possible). From browsing the docs, I found
config options for the .cf files that might allow me to change the
recipient header to somebody else if certain rules are met.

What I did not find, whether because I overlooked it, did not know what
to look for, or because it's simply not there, are the points listed
above, at least in that combination (I can block file types w/ MailScanner,
but again, they would not be brought to the admin's attention).
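
To pin down the three quarantine conditions listed above, here is a minimal sketch of the decision logic using only Python's standard `email` package. It is not SpamAssassin or MailScanner code. The limits, the extension list, and the name `quarantine_reasons` are hypothetical, and it counts recipients from the To/Cc headers, which can differ from the envelope recipients sendmail actually sees.

```python
from email import message_from_bytes
from email.utils import getaddresses

# Hypothetical policy values, not taken from the customer setup.
MAX_BYTES = 5 * 1024 * 1024
BLOCKED_EXTENSIONS = (".exe", ".scr", ".zip")
MAX_RECIPIENTS = 20


def quarantine_reasons(raw: bytes) -> list[str]:
    """Return the reasons a message should be held for admin approval."""
    msg = message_from_bytes(raw)
    reasons = []
    # 1. Large messages.
    if len(raw) > MAX_BYTES:
        reasons.append("message too large")
    # 2. Certain attachment file types, judged by file name.
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(BLOCKED_EXTENSIONS):
            reasons.append(f"blocked file type: {filename}")
    # 3. Too many recipients (header-based count).
    recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    if len(recipients) > MAX_RECIPIENTS:
        reasons.append(f"too many recipients: {len(recipients)}")
    return reasons
```

An empty result would mean normal delivery; anything else would go to the admin queue together with the listed reasons.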

So, is there any chance of implementing the above features with
SpamAssassin, or does anybody happen to know a tool that might be able
to? I'd be willing to go through the sources to tweak them a bit for
added features, too, if someone could point me in the general
direction ... (not really much of a Perl hacker, though, I'd rather do C ...)

Tnx, -garry