Re: Clustering spamassassin + autolearning
Thank you all for your (quick) answers!

@Kai: MailWatch has a training facility built in, but it only works on messages in quarantine. If a message is passed by MailScanner (for example because of BAYES_00, which is sometimes the case), it is sent on to the mailbox server, and it is then no longer possible to train the message as spam on the MailWatch server.

Peter

On Tue, Nov 25, 2008 at 9:11 PM, Kai Schaetzl [EMAIL PROTECTED] wrote:

> Peter Fastré wrote on Tue, 25 Nov 2008 16:04:19 +0100:
>
> > 2. On my mailbox server I'd like to have a script which goes into the mailfolders, searches for a folder with the name 'Spam', feeds the message to sa-learn (which should be feeding it to the same bayes database of course), and then delete the message. Do you think this is a well-thought approach of having my users train the spam filters this way?
>
> Generally yes, but since you are already using MailScanner+Mailwatch: That's already built-in and users can just train any messages from MailWatch. Why duplicate that?
>
> Kai
>
> --
> Kai Schätzl, Berlin, Germany
> Get your web at Conactive Internet Services: http://www.conactive.com
Clustering spamassassin + autolearning
Hello guys,

I'm running a small hosting provider, and our current setup is as follows:

- 1 mailbox server running exim (local delivery only)
- 1 antispam server running exim + mailscanner + spamassassin + mailwatch, which sends all approved mail to the mailbox server
- 2 mysql servers (master-slave) running the databases for mailwatch + spamassassin (bayes)

I have two questions about this, and I hope someone can help me.

1. Because all the load now goes to the smtp server, and to add some redundancy to our setup, we would like to add another antispam server with the same setup (which is working fine). Will this be possible (as far as spamassassin is concerned) with two nodes sharing the same bayes database on the mysql servers? Is it possible to have two nodes feeding the same bayes database?

2. On my mailbox server I'd like to have a script which goes into the mail folders, searches for a folder named 'Spam', feeds each message to sa-learn (which should of course feed the same bayes database), and then deletes the message. Do you think this is a well-thought-out way of having my users train the spam filters? Maybe such scripts are already available?

Thanks in advance,
Peter
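A minimal sketch of the script described in question 2, not an existing tool. It assumes the mailboxes are stored in Maildir format (one file per message under cur/ and new/), that the users' folder is called '.Spam' on disk, and that sa-learn on the mailbox server is already configured to use the shared bayes database. The function names, the default folder name and the injectable `run` parameter are hypothetical.

```python
import subprocess
from pathlib import Path


def find_spam_folders(mail_root, folder_name=".Spam"):
    """Yield every directory below mail_root that is named folder_name."""
    for path in sorted(Path(mail_root).rglob(folder_name)):
        if path.is_dir():
            yield path


def learn_and_delete(mail_root, folder_name=".Spam", run=subprocess.run):
    """Feed each message in every matching folder to sa-learn, then delete it.

    Only the Maildir subdirectories cur/ and new/ are read (assumption).
    A message is deleted only when sa-learn exits with status 0.
    Returns the list of message paths that were learned and deleted.
    """
    learned = []
    for folder in find_spam_folders(mail_root, folder_name):
        for sub in ("cur", "new"):
            subdir = folder / sub
            if not subdir.is_dir():
                continue
            for msg in sorted(p for p in subdir.iterdir() if p.is_file()):
                result = run(["sa-learn", "--spam", str(msg)])
                if result.returncode == 0:
                    msg.unlink()
                    learned.append(msg)
    return learned
```

Run from cron as something like `learn_and_delete("/var/mail/vhosts")`, where the path is a placeholder for wherever exim delivers. Deleting only after a successful sa-learn run means a database outage leaves the messages in place for the next run.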
Re: Clustering spamassassin + autolearning
On Tue, November 25, 2008 16:04, Peter Fastré wrote:

> approach of having my users train the spam filters this way? Maybe there are already such scripts available?

http://johannes.sipsolutions.net/Projects/dovecot-antispam
http://dovecot.org/ and fully enabled with sieve / managesieve
http://sieve.info/ more info on what sieve is

dovecot-antispam can use sa-learn per message if you like. This way it works in Outlook as well, since it is handled as an IMAP hook.

Sorry, you did not post which LDA you have, but all the rest was there.

--
Benny Pedersen
Need more webspace ? http://www.servage.net/?coupon=cust37098
Re: Clustering spamassassin + autolearning
Hey Peter,

I have been working on this kind of setup last week.

On Nov 25, 2008, at 4:04 PM, Peter Fastré wrote:

> Hello guys, I'm running a small-sized hosting provider and currently our setup is following:
>
> 1 mailbox server running exim (only local delivery)
> 1 antispam server running exim + mailscanner + spamassassin + mailwatch - sends all approved mail to mailbox server
> 2 mysql servers (master-slave) running databases for mailwatch + spamassassin (bayes)
>
> I have two questions about this, hope someone can help me.
>
> 1. Because all the load goes to the smtp server now and to add some redundancy to our setup, we would like to add another antispam server, with the same setup (which is working fine). Will this be possible (concerning spamassassin), with two nodes sharing the same bayes database on the mysql servers? Is it possible to have two nodes feeding the same bayes database?

I'm not 100% sure, but since MySQL is ACID compliant (at least with a transactional storage engine such as InnoDB), it should be quite possible to autolearn from multiple locations into one central database. This is the setup I have made too.

If you have multiple database servers and both are used by scan hosts, make sure one of them replicates the bayes data from the other, which is the one fed by sa-learn.

Afaik, when using an SQL database for bayes, no important bayes data is stored on the host, so there is nothing that can get out of sync.

> 2. On my mailbox server I'd like to have a script which goes into the mailfolders, searches for a folder with the name 'Spam', feeds the message to sa-learn (which should be feeding it to the same bayes database of course), and then delete the message. Do you think this is a well-thought approach of having my users train the spam filters this way? Maybe there are already such scripts available?

With the database already set up, I have made an IMAP box with some dirs (Ham, Spam, Archive/Ham, Archive/Spam). The people I work with can configure this account and drop mail in the right folders.

I have a PHP script that teaches SA by feeding it the contents of the Ham and Spam dirs. It then archives the mail, if told to do so. It could just as well have been a basic shell script, since it is just calls to sa-learn. The results are automatically shared between scan hosts.

We do not use per-user learning (yet), but all that would be needed is some iteration that wraps and repeats the above for all users, using the -u user option.

If you make sure the teaching is done per user, it is sufficiently 'thought-through' imho. If the input of messages is not correct, only that user's database will be soiled.

Samy

> Thanks in advance,
> Peter
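The teaching loop Samy describes, including the per-user iteration with `-u`, could be sketched roughly as follows. His script is PHP and is not shown in the thread, so this is a Python illustration under stated assumptions: each training mailbox is a directory with Ham and Spam subfolders holding one file per message (real IMAP storage may differ), and the names `teach_user`, `teach_all` and `MODES` are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path

# Folder name -> sa-learn mode flag, following the Ham/Spam layout described above.
MODES = {"Ham": "--ham", "Spam": "--spam"}


def teach_user(user, box, archive=True, run=subprocess.run):
    """Teach SpamAssassin from box/Ham and box/Spam on behalf of one user.

    Returns the number of messages that sa-learn accepted.
    """
    taught = 0
    for folder, flag in MODES.items():
        source = Path(box) / folder
        if not source.is_dir():
            continue
        for msg in sorted(p for p in source.iterdir() if p.is_file()):
            result = run(["sa-learn", "-u", user, flag, str(msg)])
            if result.returncode != 0:
                continue  # leave the message in place for a later run
            if archive:
                # Keep a copy under Archive/Ham or Archive/Spam.
                dest = Path(box) / "Archive" / folder
                dest.mkdir(parents=True, exist_ok=True)
                shutil.move(str(msg), str(dest / msg.name))
            else:
                msg.unlink()
            taught += 1
    return taught


def teach_all(boxes, **kwargs):
    """boxes maps a user name to that user's training mailbox directory."""
    return {user: teach_user(user, box, **kwargs) for user, box in boxes.items()}
```

Because every message is learned under the owning user's name, a wrongly sorted message only affects that user's bayes data, which is the point made above.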
Re: Clustering spamassassin + autolearning
Peter Fastré wrote on Tue, 25 Nov 2008 16:04:19 +0100:

> 2. On my mailbox server I'd like to have a script which goes into the mailfolders, searches for a folder with the name 'Spam', feeds the message to sa-learn (which should be feeding it to the same bayes database of course), and then delete the message. Do you think this is a well-thought approach of having my users train the spam filters this way?

Generally yes, but since you are already using MailScanner+Mailwatch: That's already built-in and users can just train any messages from MailWatch. Why duplicate that?

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com