Re: Bayes is letting too much spam through

2004-12-23 Thread Jeff Ramsey
Per the advice of Loren, I have started my bayes db over. And so far so
good. SA is working like I wanted it to. I have another question about
my learnspam script. Here is the script:

---- START SCRIPT ----

#!/bin/sh
# learnspam v0.34

HAMBOX=~/evolution/local/HamLearn/mbox
SPAMBOX=~/evolution/local/SpamLearn/mbox

SERVER=[EMAIL PROTECTED]
TMPHDIR=~/tmp/ham
TMPSDIR=~/tmp/spam
VERBOSE=1


echo Synchronizing $HAMBOX and $SPAMBOX to $SERVER
rsync --partial --progress -z -e ssh $HAMBOX $SERVER:$TMPHDIR
rsync --partial --progress -z -e ssh $SPAMBOX $SERVER:$TMPSDIR

ssh $SERVER 
echo ; echo 'Learning ham...' ; echo ;
sa-learn --ham --showdots --mbox $TMPHDIR ;
echo 'Unlearning bad ham...' ; echo ;
sa-learn --ham --forget --showdots --mbox $TMPSDIR ;
echo 'Learning spam...' ; echo ;
sa-learn --spam --showdots --mbox $TMPSDIR ;
echo 'Removing spam senders from AWL...' ; echo ;
spamassassin -R --mbox $TMPSDIR
---- END SCRIPT ----

I run this script via a cron event a couple of times per day, and I move
ham to the ham mbox and spam to the spam mbox via Novell Evolution.

Do I have the sa-learn --forget line correct? Do I need it there at all?
I placed it there because I wanted to make sure that all the junkmail
not getting marked spam was not only being learned as spam, but
unlearned as ham, just in case it was auto-learned as ham.

-- 
Jeff Ramsey
MIS Administrator
Tubafor Mill, Inc.



Re: Bayes is letting too much spam through

2004-12-23 Thread Michael Parker
On Wed, Dec 22, 2004 at 07:47:35PM +, Jeff Ramsey wrote:
> ssh $SERVER 
> echo ; echo 'Learning ham...' ; echo ;
> sa-learn --ham --showdots --mbox $TMPHDIR ;

Ok

> echo 'Unlearning bad ham...' ; echo ;
> sa-learn --ham --forget --showdots --mbox $TMPSDIR ;

Ok, unless you consider the next line

> echo 'Learning spam...' ; echo ;
> sa-learn --spam --showdots --mbox $TMPSDIR ;

Ok, but really if you are gonna do this then you don't need the
--forget, bayes will do the right thing.
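
A minimal sketch of the learning step with the separate --forget pass dropped, as suggested above. The function names `run` and `learn_mboxes` and the `DRY_RUN` switch are hypothetical additions for illustration; `TMPHDIR` and `TMPSDIR` are assumed to point at mbox files on the server, as in the original script.

```shell
#!/bin/sh
# Sketch only: learn ham, then spam, with no separate --forget pass.
# The thread's point is that learning a message as spam is enough even if
# it was previously learned as ham.

TMPHDIR=${TMPHDIR:-$HOME/tmp/ham}    # assumed mbox of known ham
TMPSDIR=${TMPSDIR:-$HOME/tmp/spam}   # assumed mbox of known spam

# run CMD ARGS...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# learn_mboxes: feed both mboxes to sa-learn, stopping at the first failure.
learn_mboxes() {
    echo 'Learning ham...'
    run sa-learn --ham --showdots --mbox "$TMPHDIR" || return 1
    echo 'Learning spam...'
    run sa-learn --spam --showdots --mbox "$TMPSDIR" || return 1
}
```

Setting DRY_RUN=1 before calling learn_mboxes prints the two sa-learn command lines instead of running them, which is a cheap way to check the paths before putting the script in cron.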

> echo 'Removing spam senders from AWL...' ; echo ;
> spamassassin -R --mbox $TMPSDIR

You realize that AWL also serves as a blacklist right?  I guess you
could remove it all, but I wouldn't recommend it.  Granted, it is
possible that if you're talking about mail that passed as ham, then
the address might have a positive AWL score that you don't really
want.  If anything here, I would call --add-addr-to-blacklist to give
them 100 points in AWL.

Michael




Re: Bayes is letting too much spam through

2004-12-23 Thread Jeff Ramsey
On Thu, 2004-12-23 at 04:17, Michael Parker wrote:
> On Wed, Dec 22, 2004 at 07:47:35PM +, Jeff Ramsey wrote:
> > ssh $SERVER 
> > echo ; echo 'Learning ham...' ; echo ;
> > sa-learn --ham --showdots --mbox $TMPHDIR ;
>
> Ok
>
> > echo 'Unlearning bad ham...' ; echo ;
> > sa-learn --ham --forget --showdots --mbox $TMPSDIR ;
>
> Ok, unless you consider the next line
>
> > echo 'Learning spam...' ; echo ;
> > sa-learn --spam --showdots --mbox $TMPSDIR ;
>
> Ok, but really if you are gonna do this then you don't need the
> --forget, bayes will do the right thing.

I removed the unlearn line. I was not sure if bayes would do the right
thing.

 
> > echo 'Removing spam senders from AWL...' ; echo ;
> > spamassassin -R --mbox $TMPSDIR
>
> You realize that AWL also serves as a blacklist right?  I guess you
> could remove it all, but I wouldn't recommend it.  Granted, it is
> possible that if you're talking about mail that passed as ham, then
> the address might have a positive AWL score that you don't really
> want.  If anything here, I would call --add-addr-to-blacklist to give
> them 100 points in AWL.

The only problem that I see with doing the '--add-addr-to-blacklist' is
that it would 'blacklist' my email address as well. spamassassin -R
--mbox reads the addresses in the header as well as any that are in the
body. Is there a way that I can add the senders to blacklist, and not
myself in the process?


-- 
Jeff Ramsey
MIS Administrator
Tubafor Mill, Inc.




Re: Bayes is letting too much spam through

2004-12-23 Thread Michael Parker
On Wed, Dec 22, 2004 at 08:26:35PM +, Jeff Ramsey wrote:
> > You realize that AWL also serves as a blacklist right?  I guess you
> > could remove it all, but I wouldn't recommend it.  Granted, it is
> > possible that if you're talking about mail that passed as ham, then
> > the address might have a positive AWL score that you don't really
> > want.  If anything here, I would call --add-addr-to-blacklist to give
> > them 100 points in AWL.
>
> The only problem that I see with doing the '--add-addr-to-blacklist' is
> that it would 'blacklist' my email address as well. spamassassin -R
> --mbox reads the addresses in the header as well as any that are in the
> body. Is there a way that I can add the senders to blacklist, and not
> myself in the process?

Hmmm...first off, obviously I meant --add-to-blacklist, but you knew
that.

I've never had it add my own address to AWL.  I think the docs are a
little liberal in this case.  I just double checked and sure enough it
works the way I suspected.
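
For anyone who would rather not depend on which addresses spamassassin picks out of a message, here is a minimal sketch that reads only the From: header of each message in an mbox, leaves out one given address, and prints a `spamassassin --add-addr-to-blacklist=` command per remaining sender for review. The function names `list_senders` and `blacklist_cmds` are hypothetical, and the header parsing is deliberately simple: folded headers and forms such as "addr (Name)" are not handled.

```shell
#!/bin/sh
# list_senders MBOX: print the unique, lowercased From: header addresses.
# Only the header block of each message is read, i.e. the lines between a
# "From " separator line and the first blank line after it.
list_senders() {
    awk '
        /^From / { in_hdr = 1; seen = 0; next }   # mbox message separator
        /^$/     { in_hdr = 0; next }             # blank line ends the headers
        in_hdr && !seen && tolower($0) ~ /^from:/ {
            seen = 1
            line = $0
            sub(/^[^:]*:[ \t]*/, "", line)         # drop the "From:" label
            if (match(line, /<[^>]+>/))            # "Name <addr>" form
                line = substr(line, RSTART + 1, RLENGTH - 2)
            print tolower(line)
        }
    ' "$1" | sort -u
}

# blacklist_cmds MBOX OWN_ADDR: print one blacklist command per sender,
# skipping OWN_ADDR (compared case-insensitively, whole line only).
blacklist_cmds() {
    list_senders "$1" | grep -v -i -x -F -- "$2" |
    while read -r addr; do
        printf 'spamassassin --add-addr-to-blacklist=%s\n' "$addr"
    done
}
```

The commands are only printed, not run, so the list can be checked by hand first; sender addresses come from untrusted mail and should not be fed to a shell unreviewed.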

Michael





Re: Bayes is letting too much spam through

2004-12-23 Thread Robert Brooks
Jeff Ramsey wrote:
> Per the advice of Loren, I have started my bayes db over. And so far so
> good. SA is working like I wanted it to. I have another question about
> my learnspam script. Here is the script:
>
> ---- START SCRIPT ----
> #!/bin/sh
> # learnspam v0.34
> HAMBOX=~/evolution/local/HamLearn/mbox
> SPAMBOX=~/evolution/local/SpamLearn/mbox
> [EMAIL PROTECTED]
> TMPHDIR=~/tmp/ham
> TMPSDIR=~/tmp/spam
> VERBOSE=1
> echo Synchronizing $HAMBOX and $SPAMBOX to $SERVER
> rsync --partial --progress -z -e ssh $HAMBOX $SERVER:$TMPHDIR
> rsync --partial --progress -z -e ssh $SPAMBOX $SERVER:$TMPSDIR
> ssh $SERVER 
> echo ; echo 'Learning ham...' ; echo ;
> sa-learn --ham --showdots --mbox $TMPHDIR ;
> echo 'Unlearning bad ham...' ; echo ;
> sa-learn --ham --forget --showdots --mbox $TMPSDIR ;

This will be your problem: you've got --ham and --forget pointing at
your temp spam directory. I'm betting the --forget argument is ignored
and all your spam gets learnt as ham.

> echo 'Learning spam...' ; echo ;
> sa-learn --spam --showdots --mbox $TMPSDIR ;
> echo 'Removing spam senders from AWL...' ; echo ;
> spamassassin -R --mbox $TMPSDIR
> ---- END SCRIPT ----
>
> I run this script via a cron event a couple of times per day, and I move
> ham to the ham mbox and spam to the spam mbox via Novell Evolution.
>
> Do I have the sa-learn --forget line correct? Do I need it there at all?
> I placed it there because I wanted to make sure that all the junkmail
> not getting marked spam was not only being learned as spam, but
> unlearned as ham, just in case it was auto-learned as ham.

--
Robert Brooks,   Network Manager,  Cable & Wireless UK
[EMAIL PROTECTED] http://hyperlink-interactive.co.uk/
Tel: +44 (0)20 7339 8600  Fax: +44 (0)20 7339 8601
-  Help Microsoft stamp out piracy.  Give Linux to a friend today!   -


Re: Bayes is letting too much spam through

2004-12-23 Thread Thomas Arend

On Wednesday, 22 December 2004 20:47, Jeff Ramsey wrote:
> Per the advice of Loren, I have started my bayes db over. And so far so
> good. SA is working like I wanted it to. I have another question about
> my learnspam script. Here is the script:

[..]
> echo Synchronizing $HAMBOX and $SPAMBOX to $SERVER
> rsync --partial --progress -z -e ssh $HAMBOX $SERVER:$TMPHDIR
> rsync --partial --progress -z -e ssh $SPAMBOX $SERVER:$TMPSDIR
>
> ssh $SERVER 
> echo ; echo 'Learning ham...' ; echo ;
> sa-learn --ham --showdots --mbox $TMPHDIR ;
> echo 'Unlearning bad ham...' ; echo ;
> sa-learn --ham --forget --showdots --mbox $TMPSDIR ;
> echo 'Learning spam...' ; echo ;

I have tried to look into the sa-learn script, but at the moment I have no
clue which option takes precedence, --ham or --forget. Maybe you are learning
all spam as ham? sa-learn --forget --mbox should do.

> sa-learn --spam --showdots --mbox $TMPSDIR ;

[..]

> Do I have the sa-learn --forget line correct? Do I need it there at all?
> I placed it there because I wanted to make sure that all the junkmail
> not getting marked spam was not only being learned as spam, but
> unlearned as ham, just in case it was auto-learned as ham.

SA keeps track of which messages have been learned as what.
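
One way to watch that bookkeeping is to compare the learned-message counters before and after a run. The sketch below reads `sa-learn --dump magic` output on stdin and prints the spam and ham counts. It assumes the usual column layout, where the count is the third field of the lines ending in `nspam` and `nham`; the function name `bayes_counts` is a hypothetical helper.

```shell
#!/bin/sh
# bayes_counts: read "sa-learn --dump magic" output on stdin and print the
# number of messages currently learned as spam and as ham.
# Assumption: the count is field 3 on the lines whose last field is
# "nspam" or "nham".
bayes_counts() {
    awk '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END { printf "nspam=%s nham=%s\n", nspam, nham }
    '
}

# Typical use (needs SpamAssassin installed):
#   sa-learn --dump magic | bayes_counts
```

If a message that was learned as ham is later learned as spam, the ham count should drop by one and the spam count should rise by one, rather than both growing.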

Thomas
-- 
icq:133073900
aim:tawhv