Re: problems, problems

2006-08-08 Thread Logan Shaw

On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:

> I was kind of shocked when I discovered that there is no SpamAssassin manual
> or tutorial.  For me, it's unimaginable that the world's leading open source
> spam detection software is missing such an important piece of documentation.


Well, it's not entirely true that there isn't a manual.
The various components do have manuals.  Here are the most
commonly useful ones:

perldoc Mail::SpamAssassin
perldoc Mail::SpamAssassin::Conf
man spamassassin
man sa-learn
man sa-update

And some other ones:

perldoc Mail::SpamAssassin::Plugin
perldoc Mail::SpamAssassin::Bayes
perldoc Mail::SpamAssassin::BayesStore
perldoc Mail::SpamAssassin::Plugin::Hashcash

Not all the modules that should have documentation do have
documentation (for instance, Mail::SpamAssassin::BayesStore::DBM
doesn't have any), but there is at least some information.

You can root around in the Mail/SpamAssassin directory (it should
be somewhere inside your site_perl directory) to find more
modules that might have documentation.  There may be a more
elegant way, but this is one way of seeing a list of modules that
have documentation:

cd ./site_perl/./Mail/SpamAssassin
find . -name '*.pm' -print | xargs grep -l '^=head'
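A slightly more robust variant of the same idea, as a minimal sketch: it
takes the directory as an argument and copes with file names that contain
spaces.  The function name list_pod_modules is made up for illustration.

```shell
# Print every .pm file under the directory given as $1 that contains
# POD, detected here as a line starting with "=head".
# list_pod_modules is a hypothetical helper name, not part of SpamAssassin.
list_pod_modules() {
    find "$1" -type f -name '*.pm' -exec grep -l '^=head' {} + | sort
}
```

If perldoc is installed, "perldoc -l Mail::SpamAssassin" prints the path of
the main module file, which should save guessing where site_perl lives, e.g.
list_pod_modules "$(dirname "$(perldoc -l Mail::SpamAssassin)")".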


> The wiki pages are more bits and pieces than coherent documentation and
> often don't explain things in principle but give you finished configuration
> files for procmail & Co.  But what if I don't use procmail?


Well, SpamAssassin doesn't deliver mail, so this question,
which is about delivery methods, isn't really relevant.


> First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
> learn that messages which it already considers spam are spam, and messages
> which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
> is just doing self-affirmation?


The Bayes database needs to be fed training data in order to
be effective.  It needs to see many (preferably hundreds of)
known spam and known ham messages.  sa-learn is the command
that is used to do this manually.  Autolearning does the same
thing as sa-learn, but automatically.

Basically, the other rules work well enough that they can identify
obvious spam and ham.  Those messages can be used to train the
Bayes database.
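
To make the idea concrete, here is a simplified model of the autolearn
decision.  It is a sketch, not SpamAssassin's actual code.  It assumes the
score passed in was computed by the non-Bayes rules only, and it uses 0.1
and 12.0, the documented defaults of bayes_auto_learn_threshold_nonspam and
bayes_auto_learn_threshold_spam.  The function name and the exact boundary
handling are illustrative assumptions.

```shell
# Simplified model of the autolearn decision (illustration only).
# $1 = message score from the non-Bayes rules (hypothetical input).
autolearn_decision() {
    awk -v score="$1" -v ham_max=0.1 -v spam_min=12.0 'BEGIN {
        s = score + 0                      # force numeric comparison
        if (s >= spam_min)      print "learn-as-spam"
        else if (s < ham_max)   print "learn-as-ham"
        else                    print "no-learning"
    }'
}
```

For example, "autolearn_decision 4.0" prints no-learning: a message in the
uncertain middle range is never fed back into the database.  The real check
applies further conditions, so treat this only as a picture of why obvious
cases are learned and borderline ones are not.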


> Second, I often have a message of the following form in my mail log:
>
> courierlocal: […] Cannot open bayes databases
> /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>
> What's the problem here, and how can I get rid of it?


Without any more information than that, I would say that
either something is still using the Bayes database in your
home directory, or it has finished but the lock file hasn't
been removed.  I haven't tried using SpamAssassin with any of
the Courier tools, so I'm not really familiar with how it's
normally invoked.
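
One way to check for a leftover lock, as a rough sketch: list lock-like
files in the directory that have not been touched for a while.  The function
name, the '*lock*' pattern and the 10-minute default are assumptions for
illustration, so check what your installation actually creates.  The
-maxdepth and -mmin options are find extensions (available in GNU find).

```shell
# List lock-like files directly inside directory $1 that were last
# modified more than $2 minutes ago (default: 10).
# stale_locks and the '*lock*' pattern are hypothetical, for illustration.
stale_locks() {
    dir=$1
    minutes=${2:-10}
    find "$dir" -maxdepth 1 -type f -name '*lock*' -mmin +"$minutes" | sort
}
```

Usage would be something like "stale_locks ~/.spamassassin".  Only remove a
file it reports after making sure no spamassassin or sa-learn process is
still running.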

  - Logan

Re: problems, problems

2006-08-08 Thread jdow

man spamassassin is the key to the whole thing beyond the INSTALL files.

Then you have things like "man Mail::SpamAssassin" and its kith and kin
like "man Mail::SpamAssassin::Conf". These will generally be more up to
date than any documentation file that exists. And of course the original
man spamassassin results point to some of the others that are important.

{^_^}
----- Original Message -----
From: "Wolfgang Jeltsch" <[EMAIL PROTECTED]>



Hello,

I was kind of shocked when I discovered that there is no SpamAssassin manual
or tutorial.  For me, it's unimaginable that the world's leading open source
spam detection software is missing such an important piece of documentation.

The wiki pages are more bits and pieces than coherent documentation and
often don't explain things in principle but give you finished configuration
files for procmail & Co.  But what if I don't use procmail?  (I use Courier
maildrop.)

At the moment, I run spamassassin with no arguments as an ordinary user on
every message I receive and decide what to do with the message according to
the X-Spam-Flag: header line.  But I have some problems with this.
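
The setup described above could look roughly like the following sketch.
The helper name is_flagged_spam is made up, and the sketch assumes
LF line endings and a message on standard input.  It only inspects the
header section, so an "X-Spam-Flag" line in the body is ignored.

```shell
# Succeed if the header section of the message on stdin contains
# "X-Spam-Flag: YES".  is_flagged_spam is a hypothetical helper name.
is_flagged_spam() {
    # The headers end at the first empty line, so stop reading there.
    sed '/^$/q' | grep -qi '^X-Spam-Flag:[[:space:]]*YES'
}
```

With spamassassin reading the message on stdin and writing the marked-up
copy to stdout, a test could then be written as
"spamassassin < message | is_flagged_spam && echo spam", where message is
a placeholder file name.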

First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
learn that messages which it already considers spam are spam, and messages
which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
is just doing self-affirmation?

Second, I often have a message of the following form in my mail log:

courierlocal: […] Cannot open bayes databases
/home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists

What's the problem here, and how can I get rid of it?

I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.

Thanks for your help.

Best wishes,
Wolfgang 



RE: problems, problems

2006-08-08 Thread Gary V

> Hello,
>
> I was kind of shocked when I discovered that there is no SpamAssassin manual
> or tutorial.  For me, it's unimaginable that the world's leading open source
> spam detection software is missing such an important piece of documentation.


http://spamassassin.apache.org/doc.html

There are a large number of ways SpamAssassin can be incorporated into 
someone's system. Besides what is provided on the SpamAssassin site and the 
documentation provided with SpamAssassin itself, there are many HOWTOs out 
there that deal with particular setups. Google is your friend.




> The wiki pages are more bits and pieces than coherent documentation and
> often don't explain things in principle but give you finished configuration
> files for procmail & Co.  But what if I don't use procmail?  (I use Courier
> maildrop.)
>
> At the moment, I run spamassassin with no arguments as an ordinary user on
> every message I receive and decide what to do with the message according to
> the X-Spam-Flag: header line.  But I have some problems with this.
>
> First, SpamAssassin seems to do autolearning.  What does this mean?  Does it
> learn that messages which it already considers spam are spam, and messages
> which it already considers ham are ham?  Wouldn't this mean that SpamAssassin
> is just doing self-affirmation?



Bayes builds a database of the tokens in obvious spam, and in obvious ham. 
When a message is received, its tokens are compared to the database to help 
push the score one way or the other (or not). It's not self-affirmation 
because Bayes itself does not influence whether something is autolearned or 
not. The Bayes score tweak happens afterwards. It's more akin to learning 
from experience.



> Second, I often have a message of the following form in my mail log:
>
> courierlocal: […] Cannot open bayes databases
> /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
>
> What's the problem here, and how can I get rid of it?


I would first try setting
lock_method flock
in local.cf

and if that does not help, try

bayes_learn_to_journal 1

http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html#learning_options

Better yet, move Bayes to MySQL. This HOWTO is geared towards amavisd-new, 
but could be used for any other user and would be good for site-wide use, 
simply substitute the user name:


http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html



> I'm using SpamAssassin 3.0.3 on Debian GNU/Linux 3.1.
>
> Thanks for your help.
>
> Best wishes,
> Wolfgang


Gary V





Re: problems, problems

2006-08-08 Thread Wolfgang Jeltsch
On Tuesday, 8 August 2006 23:54, Logan Shaw wrote:
> On Tue, 8 Aug 2006, Wolfgang Jeltsch wrote:
> [...]

> > Second, I often have a message of the following form in my mail log:
> >
> > courierlocal: [...] Cannot open bayes databases
> > /home/wolfgang/.spamassassin/bayes_* R/W: lock failed: File exists
> >
> > What's the problem here, and how can I get rid of it?
>
> Without any more information than that, I would say that
> something is either still using the Bayes database in your
> home directory or it is finished but the lock file hasn't
> been removed.  I haven't tried using SpamAssassin with Courier
> anything, so I'm not really familiar with how it's normally
> invoked.

What I currently do is just pipe each message through spamassassin as an 
ordinary user before the mail is delivered.  How do you normally invoke 
SpamAssassin in conjunction with mail software other than the Courier tools?

>- Logan

Best wishes,
Wolfgang