Re: Bayes Filtering

2015-08-02 Thread Reindl Harald



On 02.08.2015 at 14:57, Roman Gelfand wrote:

Could somebody post a successful bayes configuration?


??

you just need to *train* it for ham *and* spam
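
As a rough illustration of that advice, the sketch below builds the two `sa-learn` invocations that train both classes. The `--ham` and `--spam` options are real `sa-learn` flags. The function name and the Maildir paths are hypothetical placeholders, so substitute folders of mail you have sorted by hand.

```python
import shlex


def sa_learn_commands(ham_path, spam_path):
    """Return the two sa-learn invocations needed to train both classes."""
    return [
        ["sa-learn", "--ham", ham_path],    # known-good mail
        ["sa-learn", "--spam", spam_path],  # known spam
    ]


if __name__ == "__main__":
    # Hypothetical folders; these paths are assumptions, not defaults.
    commands = sa_learn_commands(
        "/home/user/Maildir/.Ham/cur",
        "/home/user/Maildir/.Spam/cur",
    )
    for cmd in commands:
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute.
        print(shlex.join(cmd))
```

The commands are only printed, so the sketch can be run on a machine without SpamAssassin installed. Both classes are trained because, as noted elsewhere in this thread, Bayes stays inactive until enough ham and enough spam have been learned.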







Bayes Filtering

2015-08-02 Thread Roman Gelfand
Could somebody post a successful bayes configuration?

Thanks in advance


Re: Bayes Filtering

2015-08-02 Thread Christian Jaeger
On August 2, 2015 6:40:10 PM CEST, Reindl Harald h.rei...@thelounge.net wrote:
 no idea what you are talking about by saying
 I can't find anything about this in the docs

I'm talking about the bundled docs. The man / perldoc pages of 
Mail::SpamAssassin::Plugin::Bayes / Mail::SpamAssassin::*Bayes* and the default 
config files. That's where I expected this info to be. It's something simple 
and basic, i.e. something that the writer of the software can foresee the need 
for documentation, so it makes sense that it's in the same files that the 
programmers wrote. That's where I start looking. That's where qpsmtpd, which 
I'm configuring around the same time, has its basic docs.

Ch.


Re: Bayes Filtering

2015-08-02 Thread Christian Jaeger
On August 2, 2015 7:36:36 PM CEST, RW rwmailli...@googlemail.com wrote:
 In future start with
 
  man spamassassin 
 
 which will lead you to:
 
 CONFIGURATION
Mail::SpamAssassin::Conf  SpamAssassin configuration files

I think I've actually seen this page recently. I also remember having looked 
through the bayes_* options (about a week ago) to see whether there's one that 
might indicate the number of required messages learnt, but couldn't find any 
(now I've seen bayes_min_ham_num / bayes_min_spam_num). I don't know how that 
happened; perhaps I was looking at another page (perhaps online), or perhaps I 
had too many things on my mind at the same time and was interrupted or 
distracted. When starting to configure these systems (DNS, qmail, ezmlm, 
qpsmtpd, dovecot, SA), there are just too many things to take care of to avoid 
making errors sometimes.

 Normally the main man page has the name of the project or its main
 executable. It's not normal to document how a feature is configured
 in the documentation for the library that implements that feature. 

qpsmtpd is different here since it has a plugin architecture and then it 
definitely makes more sense to document things in the plugins, which are just 
modules. If spamassassin does not have such an architecture then I agree it 
makes sense to document options where they are processed, i.e. the module which 
parses them. 

I know I'm sometimes confusing spamassassin and qpsmtpd. Both are in Perl and 
used together in my setup. I've got into the habit of thinking docs are in the 
modules, and when I was checking the SA docs again before sending my post I 
followed this habit without realizing that it's not the configuration of a 
qpsmtpd plugin in this case. Please don't judge me too harshly; I'm trying to 
get on with things as quickly as I can, like most everybody, and I've got other 
things on my plate, too.

So, I don't have a suggestion for improvement. Hopefully my post still helped 
the OP?

Cheers,
Christian.


Re: Bayes Filtering

2015-08-02 Thread Christian Jaeger
On August 2, 2015 5:15:08 PM CEST, Reindl Harald h.rei...@thelounge.net wrote:
 
 On 02.08.2015 at 14:57, Roman Gelfand wrote:
  Could somebody post a successful bayes configuration?
 
 ??
 
 you just need to *train* it for ham *and* spam

I think I remember from past use of SA that it only uses the bayes database 
once a certain number of messages have been learnt. It has confused me, too, 
now. I can't find anything about this in the docs, though, and neither have I 
found a test in the sources by way of searching for 'number', but that's not a 
thorough check. If I remember this detail correctly, it would be a good idea to 
add it to the docs.

Ch.


Re: Bayes Filtering

2015-08-02 Thread RW
On 2 Aug 2015 18:52:38 +0200
Christian Jaeger wrote:

 On August 2, 2015 6:40:10 PM CEST, Reindl Harald
 h.rei...@thelounge.net wrote:
  no idea what you are talking about by saying
  I can't find anything about this in the docs
 
 I'm talking about the bundled docs. The man / perldoc pages of
 Mail::SpamAssassin::Plugin::Bayes / Mail::SpamAssassin::*Bayes* and
 the default config files. That's where I expected this info to be.

In future start with

 man spamassassin 

which will lead you to:

CONFIGURATION
   Mail::SpamAssassin::Conf  SpamAssassin configuration files




 It's something simple and basic, i.e. something that the writer of
 the software can foresee the need for documentation, so it makes
 sense that it's in the same files that the programmers wrote. That's
 where I start looking. That's where qpsmtpd, which I'm configuring
 around the same time, has its basic docs.


Normally the main man page has the name of the project or its main
executable. It's not normal to document how a feature is configured
in the documentation for the library that implements that feature. 


Re: Bayes Filtering

2015-08-02 Thread Reindl Harald



On 02.08.2015 at 18:36, Christian Jaeger wrote:

On August 2, 2015 5:15:08 PM CEST, Reindl Harald h.rei...@thelounge.net wrote:


On 02.08.2015 at 14:57, Roman Gelfand wrote:

Could somebody post a successful bayes configuration?


??

you just need to *train* it for ham *and* spam


I think I remember from past use of SA that it only uses the bayes database 
once a certain number of messages have been learnt. It has confused me, too, 
now. I can't find anything about this in the docs, though, and neither have I 
found a test in the sources by way of searching for 'number', but that's not a 
thorough check. If I remember this detail correctly, it would be a good idea to 
add it to the docs.


no idea what you are talking about by saying
I can't find anything about this in the docs

https://wiki.apache.org/spamassassin/BayesFaq
https://wiki.apache.org/spamassassin/BayesNotWorking

there's a minimum threshold on how many messages must be in the Bayes 
database, before SA will use it while scanning. By default, there must 
be 200 ham messages and 200 spam messages learned before it will be used.
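
To check where a database stands against that threshold, one option is to read the counters reported by `sa-learn --dump magic`. The sketch below parses such output and compares `nham` and `nspam` against the default of 200 each. The column layout (value in the third column) is an assumption based on typical output, the sample text is invented for illustration, and the function names are hypothetical.

```python
def parse_magic(dump_text):
    """Extract the nspam and nham counters from `sa-learn --dump magic` text."""
    counts = {}
    for line in dump_text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, _, name = line.partition("non-token data:")
        name = name.strip()
        if name in ("nspam", "nham"):
            # Assumed layout: probability, spam count, ham count, atime.
            # For these counters the value sits in the third column.
            counts[name] = int(fields.split()[2])
    return counts


def bayes_active(counts, min_ham=200, min_spam=200):
    """True once both counters reach their minimum (defaults: 200 each)."""
    return counts.get("nham", 0) >= min_ham and counts.get("nspam", 0) >= min_spam


# Invented sample output, for illustration only.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        150          0  non-token data: nspam
0.000          0        420          0  non-token data: nham
0.000          0      51234          0  non-token data: ntokens
"""

if __name__ == "__main__":
    counts = parse_magic(SAMPLE)
    print(counts, "active" if bayes_active(counts) else "not yet active")
```

In the sample, plenty of ham has been learned but only 150 spam, so the check reports that Bayes is not yet active. This matches the symptom where training appears to have no effect on scores.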






Re: Bayes Filtering

2015-08-02 Thread Dave Funk

On Sun, 2 Aug 2015, Christian Jaeger wrote:


On August 2, 2015 6:40:10 PM CEST, Reindl Harald h.rei...@thelounge.net wrote:

no idea what you are talking about by saying
I can't find anything about this in the docs


I'm talking about the bundled docs. The man / perldoc pages of 
Mail::SpamAssassin::Plugin::Bayes / Mail::SpamAssassin::*Bayes* and the default 
config files. That's where I expected this info to be. It's something simple 
and basic, i.e. something that the writer of the software can foresee the need 
for documentation, so it makes sense that it's in the same files that the 
programmers wrote. That's where I start looking. That's where qpsmtpd, which 
I'm configuring around the same time, has its basic docs.

Ch.


In the man page for the spamassassin config file there is a paragraph:

   bayes_min_ham_num (Default: 200)
   bayes_min_spam_num   (Default: 200)
   To be accurate, the Bayes system does not activate until a
   certain number of ham (non-spam) and spam have been learned.
   The default is 200 of each ham and spam, but you can tune
   these up or down with these two settings.

You might argue about the clarity, but the info is there.
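
For anyone who wants to see which thresholds a given configuration ends up with, here is a minimal sketch that scans config text for the two options quoted above and falls back to the documented default of 200. The parser is deliberately simplistic (one `option value` pair per line, `#` starts a comment), the `local.cf` excerpt is hypothetical, and `bayes_thresholds` is not part of SpamAssassin.

```python
DEFAULTS = {"bayes_min_ham_num": 200, "bayes_min_spam_num": 200}


def bayes_thresholds(config_text):
    """Return the Bayes activation thresholds found in config_text."""
    values = dict(DEFAULTS)
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if len(parts) == 2 and parts[0] in values:
            values[parts[0]] = int(parts[1])
    return values


# Hypothetical local.cf excerpt, for illustration only.
EXAMPLE = """\
# local overrides
use_bayes 1
bayes_min_spam_num 100
"""

if __name__ == "__main__":
    print(bayes_thresholds(EXAMPLE))
```

Only the spam threshold is overridden in the excerpt, so the ham threshold keeps its default.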

--
Dave Funk                           University of Iowa
dbfunk (at) engineering.uiowa.edu   College of Engineering
319/335-5751   FAX: 319/384-0549    1256 Seamans Center
Sys_admin/Postmaster/cell_admin     Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{


Re: Bayes Filtering

2015-08-02 Thread RW
On 2 Aug 2015 21:06:40 +0200
Christian Jaeger wrote:


 qpsmtpd is different here since it has a plugin architecture and then
 it definitely makes more sense to document things in the plugins,
 which are just modules. If spamassassin does not have such an
 architecture then I agree it makes sense to document options where
 they are processed, i.e. the module which parses them. 

It does use plugins; it's just that the configuration for the core
plugins is in a single file. The documentation would be too
chaotic if it weren't.