RE: Advice for a weekend spam assassin?

2005-06-13 Thread Ugo Bellavance
Stuart Johnston  wrote:
> James Bucanek wrote:
>> When I installed SA, I also installed Pyzor (there was some
>> reason I couldn't get Razor or DCC to compile, but I can't
>> remember what that is now).
>>
>> I was all set to configure it, when I just became totally
>> confused.  The only documentation I could find was the man
>> pages, in that typically dense Unix man page style:
>> "server= sets the server"  Of course, this doesn't
>> tell you what a "server" is, does, or what address you should
>> put there.  I certainly wasn't going to just start putting in
>> random addresses, possibly screwing up the entire Pyzor
>> network, when I had no idea what I was doing.
>>
>> Do you have a link to step-by-step instructions that
>> explain how to set up Pyzor?  Maybe I'll make Pyzor my project
>> for next weekend.
> 
> I've found this page pretty helpful:
> 
> http://wiki.apache.org/spamassassin/SingleUserUnixInstall

SA usually detects pyzor's presence automatically once it is installed.
No config required.

Do a --lint test and look for pyzor entries.
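
For anyone who would rather script that check, here is a minimal Python sketch. It assumes the spamassassin binary is on the PATH and simply collects whatever `spamassassin -D --lint` writes to stdout and stderr. The helper names (pyzor_lines, run_lint_debug) are made up for illustration, and since the wording of the debug lines differs between versions, the filter only looks for the word "pyzor".

```python
import subprocess


def pyzor_lines(debug_output: str) -> list[str]:
    """Return the lines of debug output that mention pyzor (case-insensitive)."""
    return [line for line in debug_output.splitlines() if "pyzor" in line.lower()]


def run_lint_debug() -> str:
    """Run `spamassassin -D --lint` and return stdout and stderr combined.

    Assumption: spamassassin is installed and on the PATH of the calling user.
    """
    result = subprocess.run(
        ["spamassassin", "-D", "--lint"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout + result.stderr
```

Running pyzor_lines(run_lint_debug()) on the mail server and getting an empty list back would suggest SA is not finding Pyzor at all.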


Re: Advice for a weekend spam assassin?

2005-06-13 Thread Stuart Johnston

James Bucanek wrote:

> When I installed SA, I also installed Pyzor (there was some reason I couldn't
> get Razor or DCC to compile, but I can't remember what that is now).
>
> I was all set to configure it, when I just became totally confused.  The only
> documentation I could find was the man pages, in that typically dense Unix man
> page style: "server= sets the server"  Of course, this doesn't tell you what a
> "server" is, does, or what address you should put there.  I certainly wasn't
> going to just start putting in random addresses, possibly screwing up the
> entire Pyzor network, when I had no idea what I was doing.
>
> Do you have a link to step-by-step instructions that explain how to set up
> Pyzor?  Maybe I'll make Pyzor my project for next weekend.


I've found this page pretty helpful:

http://wiki.apache.org/spamassassin/SingleUserUnixInstall


Re: Advice for a weekend spam assassin?

2005-06-11 Thread jdow
From: "James Bucanek" <[EMAIL PROTECTED]>

jdow wrote on Friday, June 10, 2005:
>1) You need to visit http://www.rulesemporium.com/ and select at least
>   a few of the SARE rules sets. They do really help SA performance.

I'm checking these out now.

>2) I found best results here if I bucked up the BAYES_99 rule to 5
>   points. So far I have not seen that trigger a ham message with per
>   user Bayes. That per user Bayes is important. Shared Bayes is not
>   nearly as effective and should be banned in Boston - and the rest
>   of the world, too. It's a copout. Users MUST be prepared to help
>   by training their personal filters. Otherwise they must accept
>   increased spam escapes.

I'm bumping up my Bayes scores in just a few minutes.  We'll see what
happens.

As for per-user Bayes, I'm afraid that's simply out of the question.  I have
one user who still won't use subject lines, and another who hasn't figured
out how to address e-mail yet (she just uses Reply).  Seriously.  Trying to
explain Bayes filtering would be an exercise in futility.  I have to provide
a server-side solution and manage it myself, or do nothing at all.

[JDOW>>] If that is the case then beef up the SARE rules. You WILL leak
since per site Bayes cannot handle the job. One person's ham is another
person's spam. And if the people will not train on spam then either you
must read every email yourself and decide whether it is ham or spam for
training or you should just take Bayes out of the equation. People who
are too lazy to train Bayes are just going to have to suffer from spam.
Be happy with low catch rates unless all your interests and spam words
are the same.


>3) 3.0.4 is out. It installs nicely. (But give it a lot of time for
>   some of its tests. My first shot at a CPAN install I thought it
>   had died or locked up on a couple tests.)

Does it make that much of a difference over 3.0.2?  If so, I might take a
shot at upgrading later this month or next, when I get the time.

[JDOW>>] Yes.

{^_^}




Re: Advice for a weekend spam assassin?

2005-06-11 Thread James Bucanek
jdow wrote on Friday, June 10, 2005:
>1) You need to visit http://www.rulesemporium.com/ and select at least
>   a few of the SARE rules sets. They do really help SA performance.

I'm checking these out now.

>2) I found best results here if I bucked up the BAYES_99 rule to 5
>   points. So far I have not seen that trigger a ham message with per
>   user Bayes. That per user Bayes is important. Shared Bayes is not
>   nearly as effective and should be banned in Boston - and the rest
>   of the world, too. It's a copout. Users MUST be prepared to help
>   by training their personal filters. Otherwise they must accept
>   increased spam escapes.

I'm bumping up my Bayes scores in just a few minutes.  We'll see what happens.

As for per-user Bayes, I'm afraid that's simply out of the question.  I have 
one user who still won't use subject lines, and another who hasn't figured out 
how to address e-mail yet (she just uses Reply).  Seriously.  Trying to explain 
Bayes filtering would be an exercise in futility.  I have to provide a 
server-side solution and manage it myself, or do nothing at all.

>3) 3.0.4 is out. It installs nicely. (But give it a lot of time for
>   some of its tests. My first shot at a CPAN install I thought it
>   had died or locked up on a couple tests.)

Does it make that much of a difference over 3.0.2?  If so, I might take a shot 
at upgrading later this month or next, when I get the time.

-- 
James Bucanek 


Re: Advice for a weekend spam assassin?

2005-06-11 Thread James Bucanek
Thomas Cameron wrote on Friday, June 10, 2005:

>I have SA (plus spamass-milter to reject, but that's not important for
>this discussion) on a bunch of servers at various client sites.  All of
>them except one just flat stop spam.  Period.  Those clients are just
>tickled pink with the results.
>
>The one client who does not allow me to use Razor, Pyzor and DCC (they
>won't open their firewall) is very dissatisfied with the solution.  It
>is incredibly frustrating.
>
>So my answer to you would be to install those three helpers and make
>sure that you have a recent Net::DNS installation.  You will see
>accuracy go *way* up.

When I installed SA, I also installed Pyzor (there was some reason I couldn't 
get Razor or DCC to compile, but I can't remember what that is now).

I was all set to configure it, when I just became totally confused.  The only 
documentation I could find was the man pages, in that typically dense Unix man 
page style: "server= sets the server"  Of course, this doesn't tell 
you what a "server" is, does, or what address you should put there.  I 
certainly wasn't going to just start putting in random addresses, possibly 
screwing up the entire Pyzor network, when I had no idea what I was doing.

Do you have a link to step-by-step instructions that explain how to set up 
Pyzor?  Maybe I'll make Pyzor my project for next weekend.

James
-- 
James Bucanek 


Re: Advice for a weekend spam assassin?

2005-06-11 Thread James Bucanek
Steven Dickenson wrote on Friday, June 10, 2005:

>James Bucanek wrote:
>> Greetings, As you can see, the Bayes filter has nailed it as spam,
>> but it still only gets a score of 3.6.
>
>Bayes scores are really quite low in SA 3.0.0 through 3.0.2.  You may want to 
>upgrade to 3.0.3 to get the newer Bayes scores, or revert to the v2.6x 
>scores in your local.cf.  We've done the latter here with no ill effect, 
>by putting the following block in our local.cf.
>
>score BAYES_00 0 0 -1.665 -2.599
>score BAYES_05 0 0 -0.925 -0.413
>score BAYES_20 0 0 -0.730 -1.951
>score BAYES_40 0 0 -0.276 -1.096
>score BAYES_50 0 0 1.567 0.001
>score BAYES_60 0 0 3.515 1.372
>score BAYES_80 0 0 3.608 2.087
>score BAYES_95 0 0 3.514 3.063
>score BAYES_99 0 0 4.070 4.886

Thanks Steven.  It's the weekend, so it's time for me to get on the server and 
start wrecking things.

I'm going to start by upping the Bayes scores as you have suggested.  This has 
always been a consistent suggestion from others, and it's an easy first step.

>> I currently have my threshold set to 7.0.  I've been considering
>> lowering it again (maybe to 5.0), but am paranoid about false
>> positives.  I can go through my mailbox and see ham that has scores
>> of 3 or even 4.
>
>I only tag my personal/family accounts, so FP's, while annoying, are 
>only a folder away (I tag at 4, everyone else at 5).  However, I've only 
>had 2 FP in the last year, and both were from mortgage companies when I 
>was going through a refi.  Would you mind posting some of your 
>higher-scoring ham, with headers?  It's possible you have a 
>misconfiguration in some of your settings.

It's possible.  Here's an example.  Note that I don't have too many ham 
messages that get a score of more than 1 or even 2, but I'd still hate to lose 
them.  ;)

From [EMAIL PROTECTED] Sun Jun  5 17:24:23 2005
Return-Path: <[EMAIL PROTECTED]>
Received: from murder ([unix socket])
 by twilightandbarking.com (Cyrus v2.2.12-OS X 10.3) with LMTPA;
 Sun, 05 Jun 2005 10:24:23 -0700
X-Sieve: CMU Sieve 2.2
Received: by mail.twilightandbarking.com (Postfix, from userid -2)
 id 1150C27DC807; Sun,  5 Jun 2005 10:24:23 -0700 (MST)
Received: from phxamgw02.aexp.com (phxamgw02.aexp.com [193.32.34.74])
 by mail.twilightandbarking.com (Postfix) with ESMTP id 59B5627DC805
 for <[EMAIL PROTECTED]>; Sun,  5 Jun 2005 10:24:22 -0700 (MST)
Received: by phxamgw02.aexp.com; id KAA18183;
 Sun, 5 Jun 2005 10:23:46 -0700 (MST)
Date: Sun, 5 Jun 2005 10:23:46 -0700 (MST)
Message-Id: <[EMAIL PROTECTED]>
Received: from unknown(148.173.240.35) by phxamgw02.aexp.com via smap (V5.5)
 id xmapb7017; Sun, 5 Jun 05 09:42:53 -0700
To: [EMAIL PROTECTED]
Reply-To: "\"American Express\"
From: "American Express" <[EMAIL PROTECTED]>
Mime-Version: 1.0
Subject: Alert: Payment Reminder
Message-Source: ENG-ALERTS
Content-Type: multipart/alternative;
 boundary="0__=85256B8B0056C1C08f9e8a93df938690918c85256B8B0056C1C0"
X-Spam-Level: ***
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on
 twilightandbarking.com
X-Spam-Status: No, score=3.1 required=7.0 tests=BAYES_99,HTML_90_100,
 HTML_FONT_BIG,HTML_FONT_TINY,HTML_MESSAGE,MSGID_FROM_MTA_HEADER
 autolearn=no version=3.0.2

As you can see, this one got pegged as SPAM by Bayes.  Which is what makes me 
nervous about raising the Bayes scores.  I'll run these all through my 
learn-ham script and see if the scores don't improve (i.e. get lower).
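
For picking out which ham messages Bayes is mis-scoring, a small Python sketch along these lines could pull the fields out of an X-Spam-Status header. It assumes the 3.0.x header layout shown above (verdict, score=, required=, tests=, autolearn=); the function name parse_spam_status and the dictionary keys are made up for illustration.

```python
import re

# Layout assumed from the 3.0.2 header quoted above.
STATUS_RE = re.compile(
    r"X-Spam-Status:\s*(Yes|No),\s*score=(-?[\d.]+)\s+required=(-?[\d.]+)"
    r"\s+tests=(.*?)\s+autolearn=(\S+)"
)


def parse_spam_status(header: str) -> dict:
    """Parse a (possibly folded) X-Spam-Status header into its fields."""
    # Unfold the header: join continuation lines with single spaces.
    flat = " ".join(line.strip() for line in header.splitlines())
    match = STATUS_RE.search(flat)
    if match is None:
        raise ValueError("not an X-Spam-Status header in the expected layout")
    verdict, score, required, tests, autolearn = match.groups()
    return {
        "is_spam": verdict == "Yes",
        "score": float(score),
        "required": float(required),
        "tests": [t.strip() for t in tests.split(",") if t.strip()],
        "autolearn": autolearn,
    }


header = """X-Spam-Status: No, score=3.1 required=7.0 tests=BAYES_99,HTML_90_100,
 HTML_FONT_BIG,HTML_FONT_TINY,HTML_MESSAGE,MSGID_FROM_MTA_HEADER
 autolearn=no version=3.0.2"""

status = parse_spam_status(header)
# A message that was not tagged but still hit BAYES_99 is a candidate for
# ham training if it really is ham.
retrain_candidate = (not status["is_spam"]) and "BAYES_99" in status["tests"]
```

Run over a folder of known ham, a check like retrain_candidate would list the messages worth feeding to the learn-ham script first.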

>> I was previously using a client-side Bayes filtering system and was
>> getting 99.8+% spam identification rates.  SA has been, so far, a bit
>> of a disappointment and I'm sure it's my fault.  :)
>
>My home account probably gets a 5 9's identification rate, with a near 
>zero FP rate.  SARE rulesets, network tests, and a well trained Bayes 
>database make a huge difference in the performance of SA.  Make sure 
>your trusted_networks are set correctly and enable network tests, URIBL 
>tests, and Razor/Pyzor.  Check out the CustomRuleset section of the wiki 
>for info on SARE and other rulesets.

I'm still so confused about how to set up Razor, that I haven't even looked at 
it since I downloaded and compiled it.  Maybe I'll take a stab at Razor again 
next weekend.

-- 
James Bucanek 


Re: Advice for a weekend spam assassin?

2005-06-10 Thread jdow



Why such teeny tiny letters? None of us here seem 
to have teeny tiny eyes.
 
{O.O}

----- Original Message -----
From: Andy Jezierski
To: users@spamassassin.apache.org
Sent: 2005 June, 10, Friday 11:22
Subject: Re: Advice for a weekend spam assassin?

Dimitri Yioulos <[EMAIL PROTECTED]> wrote on 06/10/2005 01:00:51 PM:

> I hope you don't mind my breaking into this thread, but I have a question
> regarding SA helpers.  My mail setup is sendmail with spamassassin, and I'm
> using SARE rules and bayes.  Ours is a small shop (mail vol. about 1000
> msgs./day).  Right now, I get virtually no FPs, and an occasional FN.  As we
> grow, I would expect that we'll be receiving more spam.  What would Razor,
> Pyzor, and DCC do for me?
>
> Sorry for my ignorance.
>
> Dimitri

They'll increase your chance of catching that occasional spam that
might slip through.  The DCC and Razor rules are routinely 3, 4 & 5
in my top 10 most active rules. Right behind HTML_Message & Bayes_99.
YMMV

Andy



Re: Advice for a weekend spam assassin?

2005-06-10 Thread Andy Jezierski

Dimitri Yioulos <[EMAIL PROTECTED]> wrote on 06/10/2005 01:00:51 PM:

> I hope you don't mind my breaking into this thread, but I have a question
> regarding SA helpers.  My mail setup is sendmail with spamassassin, and I'm
> using SARE rules and bayes.  Ours is a small shop (mail vol. about 1000
> msgs./day).  Right now, I get virtually no FPs, and an occasional FN.  As we
> grow, I would expect that we'll be receiving more spam.  What would Razor,
> Pyzor, and DCC do for me?
>
> Sorry for my ignorance.
>
> Dimitri

They'll increase your chance of catching that occasional
spam that might slip through.  The DCC and Razor rules are routinely
3, 4 & 5 in my top 10 most active rules, right behind HTML_MESSAGE
& BAYES_99. YMMV

Andy 

Re: Advice for a weekend spam assassin?

2005-06-10 Thread Dimitri Yioulos
On Friday June 10 2005 11:16 am, Thomas Cameron wrote:
> On Fri, 2005-06-10 at 08:06 -0700, James Bucanek wrote:
> > Greetings,
> >
> > I consider myself a "weekend" spam assassin.  I run my own server
> > (co-located), and have about a dozen users (mostly friends and family,
> > but a few paying customers).  But running a mail server isn't my day job.
> >  I don't run Razor or any of the cooperative spam filters simply because
> > I didn't have the time to figure them out and set them up.
> >
> > I'm running Spamassassin 3.0.2 which I installed a few months ago.
> >
> > SA is still only catching about 50-75% of the spam.  I've set up Bayes
> > learn ham/spam mailboxes, and I regularly feed them 200 to 500 messages a
> > day.  Yet even after months of training, I still get messages like this:
> >
> > Subject: (6/10/05) Mortgage Rate Report
> > X-Spam-Status: No, score=3.6 required=7.0 tests=BAYES_99,HTML_80_90,
> >
> > HTML_FONT_TINY,HTML_IMAGE_RATIO_04,HTML_MESSAGE,NORMAL_HTTP_TO_IP,
> > OPTING_OUT autolearn=no version=3.0.2
> >
> > As you can see, the Bayes filter has nailed it as spam, but it still only
> > gets a score of 3.6.
> >
> > I currently have my threshold set to 7.0.  I've been considering lowering
> > it again (maybe to 5.0), but am paranoid about false positives.  I can go
> > through my mailbox and see ham that has scores of 3 or even 4.
> >
> > I was hoping that someone here could give me some quick advice as to what
> > I might be doing wrong, or point me to a trouble-shooting site for SA.
> >
> > I was previously using a client-side Bayes filtering system and was
> > getting 99.8+% spam identification rates.  SA has been, so far, a bit of
> > a disappointment and I'm sure it's my fault.  :)
>
> I have SA (plus spamass-milter to reject, but that's not important for
> this discussion) on a bunch of servers at various client sites.  All of
> them except one just flat stop spam.  Period.  Those clients are just
> tickled pink with the results.
>
> The one client who does not allow me to use Razor, Pyzor and DCC (they
> won't open their firewall) is very dissatisfied with the solution.  It
> is incredibly frustrating.
>
> So my answer to you would be to install those three helpers and make
> sure that you have a recent Net::DNS installation.  You will see
> accuracy go *way* up.
>
> Thomas

I hope you don't mind my breaking into this thread, but I have a question 
regarding SA helpers.  My mail setup is sendmail with spamassassin, and I'm 
using SARE rules and bayes.  Ours is a small shop (mail vol. about 1000 
msgs./day).  Right now, I get virtually no FPs, and an occasional FN.  As we 
grow, I would expect that we'll be receiving more spam.  What would Razor, 
Pyzor, and DCC do for me?

Sorry for my ignorance.

Dimitri


Re: Advice for a weekend spam assassin?

2005-06-10 Thread jdow
1) You need to visit http://www.rulesemporium.com/ and select at least
   a few of the SARE rules sets. They do really help SA performance.
2) I found best results here if I bucked up the BAYES_99 rule to 5
   points. So far I have not seen that trigger a ham message with per
   user Bayes. That per user Bayes is important. Shared Bayes is not
   nearly as effective and should be banned in Boston - and the rest
   of the world, too. It's a copout. Users MUST be prepared to help
   by training their personal filters. Otherwise they must accept
   increased spam escapes.
3) 3.0.4 is out. It installs nicely. (But give it a lot of time for
   some of its tests. My first shot at a CPAN install I thought it
   had died or locked up on a couple tests.)
4) 5 is a good threshold. NEVER discard messages marked as spam unless
   you do this at a rather high markup level. (SARE rules help make THAT
   happen.) A subject markup that includes the spam score is handy for
   the users. (I use a three digit markup since I have seen really nasty
   messages rack up 100 point scores here - on small score rules.) Then
   the user can feed *** SPAM(099) *** messages into a spam folder by
   sorting on the "*** SPAM" part. They should review the contents before
   discarding. Sort the mailbox alphabetically and look at the low scores
   briefly - a minute suffices for me even when I see something peculiar
   I want to make sure is already properly Bayesed. (You can verb ANY
   noun. {^_-})
5) For children's accounts modify the procedure so that their parent can
   vet the mail and drop any false markups into their children's folders.
   If the parents take a little extra time they can take the false markup
   message and extract the real message attachment to put in the child's
   mailbox. That part is up to them.
6) Do NOT use autolearn or autowhitelist. The idea is intriguing but I
   see too many busted Bayes databases from those abuse tools. I manually
   train rather seldom. About every 6 months I remember to run some
   random batches of ham through the ham training. Every time I see a
   very low score spam (or an escaped spam) with low Bayes I train on
   those messages. Otherwise I just let it perk along doing its thing.
   I do use wetware Bayes phrase filtering better known as the SARE
   rule sets and update them periodically.
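
The three-digit markup in point 4 matters because mail clients sort subjects as text, and only fixed-width, zero-padded scores sort in numeric order. The Python sketch below just illustrates that property; it is not how the markup is produced on the server (the thread does not say), and spam_subject is a made-up helper that truncates the score to an integer.

```python
def spam_subject(score: float, subject: str) -> str:
    """Prefix a subject with a zero-padded, three-digit spam score."""
    return f"*** SPAM({int(score):03d}) *** {subject}"


scores = (99.0, 7.4, 100.2, 12.9)

# Padded tags sort alphabetically in the same order as the numbers.
padded_order = sorted(spam_subject(s, "example") for s in scores)

# Unpadded tags do not: "100" sorts before "12", and "7" before "99".
unpadded_order = sorted(f"*** SPAM({int(s)}) ***" for s in scores)
```

With the padded form, the low-scoring messages that deserve a quick look before deletion all land together at the top of the sorted spam folder.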

Practical results:
   About 1 escaped spam a day out of 300+ spams.
   About 2 mismarks a day chiefly from the Linux Kernel Mailing List.
   (Patch sets and bug reports with dumps confuse the SARE rules.)
   (And sometimes AOL mails come through mismarked because they
   yet again screwed up their server configuration.)
   Specifically: Yesterday out of 700+ messages I had no escaped spam
   and 3 LKML messages mismarked as spam. In the last 9 hours I've already
   had one Mexican language spam get through. That may be my
   escaped spam for the day or I might get another. No ham has been
   mismarked.

{^_^}   Thus be Joanne's configuration du jour. By the way, I use some
43 of the SARE and other rule sets. I go a trifle overboard,
methinks. It's a dangerous job but somebody has to do it -
Super Chicken.

----- Original Message -----
From: "James Bucanek" <[EMAIL PROTECTED]>
To: 
Sent: 2005 June, 10, Friday 08:06
Subject: Advice for a weekend spam assassin?


Greetings,

I consider myself a "weekend" spam assassin.  I run my own server
(co-located), and have about a dozen users (mostly friends and family, but a
few paying customers).  But running a mail server isn't my day job.  I don't
run Razor or any of the cooperative spam filters simply because I didn't
have the time to figure them out and set them up.

I'm running Spamassassin 3.0.2 which I installed a few months ago.

SA is still only catching about 50-75% of the spam.  I've set up Bayes learn
ham/spam mailboxes, and I regularly feed them 200 to 500 messages a day.
Yet even after months of training, I still get messages like this:

Subject: (6/10/05) Mortgage Rate Report
X-Spam-Status: No, score=3.6 required=7.0 tests=BAYES_99,HTML_80_90,
HTML_FONT_TINY,HTML_IMAGE_RATIO_04,HTML_MESSAGE,NORMAL_HTTP_TO_IP,
OPTING_OUT autolearn=no version=3.0.2

As you can see, the Bayes filter has nailed it as spam, but it still only
gets a score of 3.6.

I currently have my threshold set to 7.0.  I've been considering lowering it
again (maybe to 5.0), but am paranoid about false positives.  I can go
through my mailbox and see ham that has scores of 3 or even 4.

I was hoping that someone here could give me some quick advice as to what I
might be doing wrong, or point me to a trouble-shooting site for SA.

I was previously using a client-side Bayes filtering system and was getting
99.8+% spam identification rates.  SA has been, so far, a bit of a
disappointment and I'm sure it's my fault.  :)

-- 
James Bucanek 




Re: Advice for a weekend spam assassin?

2005-06-10 Thread Thomas Cameron
On Fri, 2005-06-10 at 08:06 -0700, James Bucanek wrote:
> Greetings,
> 
> I consider myself a "weekend" spam assassin.  I run my own server 
> (co-located), and have about a dozen users (mostly friends and family, but a 
> few paying customers).  But running a mail server isn't my day job.  I don't 
> run Razor or any of the cooperative spam filters simply because I didn't have 
> the time to figure them out and set them up.
> 
> I'm running Spamassassin 3.0.2 which I installed a few months ago.
> 
> SA is still only catching about 50-75% of the spam.  I've set up Bayes learn 
> ham/spam mailboxes, and I regularly feed them 200 to 500 messages a day.  Yet 
> even after months of training, I still get messages like this:
> 
> Subject: (6/10/05) Mortgage Rate Report
> X-Spam-Status: No, score=3.6 required=7.0 tests=BAYES_99,HTML_80_90,
> HTML_FONT_TINY,HTML_IMAGE_RATIO_04,HTML_MESSAGE,NORMAL_HTTP_TO_IP,
> OPTING_OUT autolearn=no version=3.0.2
> 
> As you can see, the Bayes filter has nailed it as spam, but it still only 
> gets a score of 3.6.
> 
> I currently have my threshold set to 7.0.  I've been considering lowering it 
> again (maybe to 5.0), but am paranoid about false positives.  I can go 
> through my mailbox and see ham that has scores of 3 or even 4.
> 
> I was hoping that someone here could give me some quick advice as to what I 
> might be doing wrong, or point me to a trouble-shooting site for SA.
> 
> I was previously using a client-side Bayes filtering system and was getting 
> 99.8+% spam identification rates.  SA has been, so far, a bit of a 
> disappointment and I'm sure it's my fault.  :)

I have SA (plus spamass-milter to reject, but that's not important for
this discussion) on a bunch of servers at various client sites.  All of
them except one just flat stop spam.  Period.  Those clients are just
tickled pink with the results.

The one client who does not allow me to use Razor, Pyzor and DCC (they
won't open their firewall) is very dissatisfied with the solution.  It
is incredibly frustrating.

So my answer to you would be to install those three helpers and make
sure that you have a recent Net::DNS installation.  You will see
accuracy go *way* up.

Thomas



Re: Advice for a weekend spam assassin?

2005-06-10 Thread Mike Jackson

Running SpamAssassin is more of an art than a science. You'll probably
catch more spam by picking up most of the rules at:

http://www.rulesemporium.com/rules.htm


Or better yet, use RulesDuJour:

http://www.exit0.us/index.php?pagename=RulesDuJour

Configure it up to get the rulesets you want, run it once or twice a week 
from cron, and you'll rarely have to think about it again. Just make sure to 
look at the report mails it sends for notifications of new versions of RDJ, 
or for problems with the update process. 



Re: Advice for a weekend spam assassin?

2005-06-10 Thread Steven Dickenson

James Bucanek wrote:

> Greetings, As you can see, the Bayes filter has nailed it as spam,
> but it still only gets a score of 3.6.

Bayes scores are really quite low in SA 3.0.0 through 3.0.2.  You may want to 
upgrade to 3.0.3 to get the newer Bayes scores, or revert to the v2.6x 
scores in your local.cf.  We've done the latter here with no ill effect, 
by putting the following block in our local.cf.


score BAYES_00 0 0 -1.665 -2.599
score BAYES_05 0 0 -0.925 -0.413
score BAYES_20 0 0 -0.730 -1.951
score BAYES_40 0 0 -0.276 -1.096
score BAYES_50 0 0 1.567 0.001
score BAYES_60 0 0 3.515 1.372
score BAYES_80 0 0 3.608 2.087
score BAYES_95 0 0 3.514 3.063
score BAYES_99 0 0 4.070 4.886
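
A note on reading those lines: a score line with four values gives one value per SpamAssassin score set, in the order (no Bayes, no network tests), (no Bayes, network tests), (Bayes, no network tests), (Bayes and network tests). A line with a single value applies to all four sets. The Python sketch below only illustrates which column ends up in use; parse_scores and active_score are made-up helper names, not part of SpamAssassin.

```python
def parse_scores(block: str) -> dict[str, list[float]]:
    """Parse local.cf-style score lines into {rule name: four per-set scores}."""
    scores = {}
    for line in block.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "score":
            continue
        values = [float(v) for v in parts[2:]]
        if len(values) == 1:
            values = values * 4  # a single score is used for every score set
        if len(values) == 4:
            scores[parts[1]] = values
    return scores


def active_score(values: list[float], use_bayes: bool, use_net: bool) -> float:
    """Pick the score for the score set selected by the Bayes/network settings."""
    return values[(2 if use_bayes else 0) + (1 if use_net else 0)]


block = """
score BAYES_95 0 0 3.514 3.063
score BAYES_99 0 0 4.070 4.886
"""
scores = parse_scores(block)
with_net = active_score(scores["BAYES_99"], use_bayes=True, use_net=True)
without_net = active_score(scores["BAYES_99"], use_bayes=True, use_net=False)
```

So on a server running Bayes without any network tests, it is the third column (4.070 for BAYES_99) that applies, and the fourth column only comes into play once network tests are enabled.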


> I currently have my threshold set to 7.0.  I've been considering
> lowering it again (maybe to 5.0), but am paranoid about false
> positives.  I can go through my mailbox and see ham that has scores
> of 3 or even 4.


I only tag my personal/family accounts, so FP's, while annoying, are 
only a folder away (I tag at 4, everyone else at 5).  However, I've only 
had 2 FP in the last year, and both were from mortgage companies when I 
was going through a refi.  Would you mind posting some of your 
higher-scoring ham, with headers?  It's possible you have a 
misconfiguration in some of your settings.



> I was previously using a client-side Bayes filtering system and was
> getting 99.8+% spam identification rates.  SA has been, so far, a bit
> of a disappointment and I'm sure it's my fault.  :)


My home account probably gets a 5 9's identification rate, with a near 
zero FP rate.  SARE rulesets, network tests, and a well trained Bayes 
database make a huge difference in the performance of SA.  Make sure 
your trusted_networks are set correctly and enable network tests, URIBL 
tests, and Razor/Pyzor.  Check out the CustomRuleset section of the wiki 
for info on SARE and other rulesets.


- S




RE: Advice for a weekend spam assassin?

2005-06-10 Thread Bret Miller
> I consider myself a "weekend" spam assassin.  I run my own
> server (co-located), and have about a dozen users (mostly
> friends and family, but a few paying customers).  But running
> a mail server isn't my day job.  I don't run Razor or any of
> the cooperative spam filters simply because I didn't have the
> time to figure them out and set them up.
>
> I'm running Spamassassin 3.0.2 which I installed a few months ago.
>
> SA is still only catching about 50-75% of the spam.  I've set
> up Bayes learn ham/spam mailboxes, and I regularly feed them
> 200 to 500 messages a day.  Yet even after months of
> training, I still get messages like this:
>
> Subject: (6/10/05) Mortgage Rate Report
> X-Spam-Status: No, score=3.6 required=7.0
> tests=BAYES_99,HTML_80_90,
>
> HTML_FONT_TINY,HTML_IMAGE_RATIO_04,HTML_MESSAGE,NORMAL_HTTP_TO_IP,
> OPTING_OUT autolearn=no version=3.0.2
>
> As you can see, the Bayes filter has nailed it as spam, but
> it still only gets a score of 3.6.
>
> I currently have my threshold set to 7.0.  I've been
> considering lowering it again (maybe to 5.0), but am paranoid
> about false positives.  I can go through my mailbox and see
> ham that has scores of 3 or even 4.
>
> I was hoping that someone here could give me some quick
> advice as to what I might be doing wrong, or point me to a
> trouble-shooting site for SA.
>
> I was previously using a client-side Bayes filtering system
> and was getting 99.8+% spam identification rates.  SA has
> been, so far, a bit of a disappointment and I'm sure it's my
> fault.  :)

Running SpamAssassin is more of an art than a science. You'll probably
catch more spam by picking up most of the rules at:

http://www.rulesemporium.com/rules.htm

Bayes isn't scored terribly high in the default rule scorings. If you're
diligently training bayes, you might find that increasing the score in
local.cf would help you:

For example:

score BAYES_99 4.0


I run a mail server for a non-profit, so I don't get to spend too much
time on it. With SA 2.5x and 2.6x we used a threshold of 8 to drop
spam. With SA 3.x, I had to lower the threshold to 4 to catch the same
amount of spam. It's fairly rare that we have FPs even at 4.
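
To see how the threshold interacts with the numbers in the original post, here is a rough Python sketch. It assumes a message is tagged when its score reaches the required value. The 3.6 is the Mortgage Rate Report spam quoted above; the two ham entries are hypothetical stand-ins for the "ham that has scores of 3 or even 4".

```python
def is_tagged(score: float, required: float) -> bool:
    """Assumed tagging rule: tagged when the score reaches the threshold."""
    return score >= required


# One real score from the thread plus two hypothetical ham scores.
samples = {
    "Mortgage Rate Report (spam)": 3.6,
    "ham scoring 3": 3.0,
    "ham scoring 4": 4.0,
}


def tagged_at(required: float) -> list[str]:
    """Names of the sample messages that would be tagged at this threshold."""
    return [name for name, score in samples.items() if is_tagged(score, required)]


for required in (7.0, 5.0, 4.0, 3.5):
    print(required, tagged_at(required))
```

With these particular scores no threshold catches the 3.6 spam without also tagging the ham at 4, which is why raising the Bayes score is worth trying alongside lowering the threshold.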

HTH

Bret