Re: Bayes problems and German Spam
On Monday 16 May 2005 12:15, Ronan McGlue typed:
> I too have all net tests enabled and have started from a fresh clean new
> database Friday, and already I'm seeing the German spams hit BAYES_00...
> I don't want to switch autolearning off because, well, I find it incredibly
> useful. I have spam/ham thresholds at 10/0 respectively and all appears
> well aside from the German bunch of spams...

The prolocation rulesets based on subject seem to be working quite well here. They're slowly dragging Bayes up - currently at _44 after ~50 spams that scored nice and high. At least one server had worked out on its own that the spams should be _99.
Re: Bayes problems and German Spam
Simon Byrnand wrote:
> At 09:53 16/05/2005, Jo wrote:
>> Simon Byrnand wrote:
>>> Hi All,
>>>
>>> After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
>>>
>>> After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
>>>
>>> Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY? If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
>>>
>>> A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
>>>
>>> It seems that Bayes is still exploitable... :(
>>>
>>> Any suggestions?
>>>
>>> Regards,
>>> Simon
>>
>> Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.
>
> Hmm, not very practical when the system has several thousand users/mailboxes. There is no way I would be able to keep current with manual learning just based on my own personal mailbox... (and I can hardly go poking around in other people's mailboxes to gather ham/spam to learn)
>
>> If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.
>
> Are there even any negative scores in 3.0.3? I thought negative scores were pretty much eliminated in recent versions, so with -0.2 it would never learn any ham.
>
>> Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.
>
> Already have all network tests enabled, always have done.
>
> Regards,
> Simon

I too have all net tests enabled and have started from a fresh clean new database Friday, and already I'm seeing the German spams hit BAYES_00...

I don't want to switch autolearning off because, well, I find it incredibly useful. I have spam/ham thresholds at 10/0 respectively and all appears well aside from the German bunch of spams...

Don't know what else I can do... *clutches at straws*

Is there a way to tie in a positive net test... say multi.surbl.org, to sway the Bayes, as generally if the SURBL reports spam you can guarantee that all the other rules are surplus to requirements... IMHO

ronan

--
Regards
Ronan McGlue
Info. Services QUB
Re: Bayes problems and German Spam
At 09:53 16/05/2005, Jo wrote:
> Simon Byrnand wrote:
>> Hi All,
>>
>> After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
>>
>> After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
>>
>> Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY? If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
>>
>> A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
>>
>> It seems that Bayes is still exploitable... :(
>>
>> Any suggestions?
>>
>> Regards,
>> Simon
>
> Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.

Hmm, not very practical when the system has several thousand users/mailboxes. There is no way I would be able to keep current with manual learning just based on my own personal mailbox... (and I can hardly go poking around in other people's mailboxes to gather ham/spam to learn)

> If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.

Are there even any negative scores in 3.0.3? I thought negative scores were pretty much eliminated in recent versions, so with -0.2 it would never learn any ham.

> Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.

Already have all network tests enabled, always have done.

Regards,
Simon
Re: Bayes problems and German Spam
Simon Byrnand wrote:
> Hi All,
>
> After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
>
> After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
>
> Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY? If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
>
> A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
>
> It seems that Bayes is still exploitable... :(
>
> Any suggestions?
>
> Regards,
> Simon

Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.

If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.

Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.

Good luck,
Jo
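
[Editor's sketch] The threshold question debated in this thread (Jo's -0.2 for ham, Ronan's 10/0 for spam/ham) can be illustrated with a toy decision function. This is a simplified model, not SpamAssassin's actual autolearn code: the function name and the exact comparison operators are assumptions, and the real logic applies further conditions that are not modelled here.

```python
from typing import Optional


def autolearn_decision(score: float, ham_threshold: float,
                       spam_threshold: float) -> Optional[str]:
    """Toy model of an autolearn decision (hypothetical helper).

    Returns "ham" when the score falls below the ham threshold,
    "spam" when it reaches the spam threshold, and None when the
    message lies in between and is not learned at all.
    """
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return None


# Simon's objection: if messages rarely or never score below zero,
# a ham threshold of -0.2 means almost nothing is ever learned as ham.
print(autolearn_decision(0.3, -0.2, 10.0))   # not learned
print(autolearn_decision(-0.5, -0.2, 10.0))  # learned as ham
print(autolearn_decision(14.2, 0.0, 10.0))   # learned as spam
```

In this model the gap between the two thresholds is a "no learning" zone. Widening it makes autolearning more conservative, at the cost of feeding Bayes fewer messages.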
Re: Bayes Problems
On 4/14/05, J Thomas Hancock <[EMAIL PROTECTED]> wrote:
> I am having one heck of a time getting Bayes working with SpamAssassin.
>
> I am using postfix 2.2.2 and SA 3.00.2. Postfix is being run as the user
> postfix. SA is being run as postdrop.
>
> The following is the output from the syslog.
>
> spamd[22065]: debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0xa8b6820) implements 'parse_config'
> spamd[22065]: debug: bayes: 22065 tie-ing to DB file R/O /home/postdrop/.spamassassin_toks
> spamd[22065]: debug: bayes: 22065 tie-ing to DB file R/O /home/postdrop/.spamassassin_seen
> spamd[22065]: debug: bayes: found bayes db version 3
> spamd[22065]: debug: bayes: Not available for scanning, only 35 ham(s) in Bayes DB < 200
> spamd[22065]: debug: bayes: 22065 untie-ing
> spamd[22065]: debug: bayes: 22065 untie-ing db_toks
> spamd[22065]: debug: bayes: 22065 untie-ing db_seen
> spamd[22065]: debug: Score set 1 chosen.
> spamd[22065]: debug: MIME PARSER START
> spamd[22065]: debug: main message type: text/plain
> spamd[22065]: debug: parsing normal part
> spamd[22065]: debug: added part, type: text/plain
> spamd[22065]: debug: MIME PARSER END
> spamd[22065]: debug: using "/tmp/spamd-22065-init/.spamassassin" for user state dir
> spamd[22065]: debug: bayes: no dbs present, cannot tie DB R/O: /tmp/spamd-22065-init/.spamassassin/bayes_toks
> spamd[22065]: debug: metadata: X-Spam-Relays-Trusted:
>
> Unfortunately I have tinkered with this too much so I really can not list
> what I have or have not tried.
>
> Any input would be appreciated.
>
> Thank you,
> Tom

Don't worry. This is a spam behaviour change. To get your Bayes database working quickly, you can pick up a Bayes starter database from the site linked below.

http://www.fsl.com/support/

However, it is always recommended that the spam database, which is based on Bayesian logic, learn from your own mail.

Also, to make SpamAssassin learn a file or mailbox as spam or ham, you can use these commands:

    sa-learn --spam file/mail_box_path
    sa-learn --ham file/mail_box_path

--
Crisppy Fernandes
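
[Editor's sketch] For anyone scripting the two commands above, here is a small helper that builds the `sa-learn` argument list without running it. The helper name and the example paths are hypothetical. `--spam`, `--ham` and `--mbox` are standard `sa-learn` options, with `--mbox` telling it to treat the file as an mbox-format mailbox.

```python
import shlex
from typing import List


def sa_learn_command(kind: str, path: str, mbox: bool = False) -> List[str]:
    """Build (but do not run) an sa-learn command line.

    kind -- "spam" or "ham"
    path -- file, directory, or mailbox to learn from
    mbox -- pass --mbox so the path is read as an mbox-format mailbox
    """
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    cmd = ["sa-learn", f"--{kind}"]
    if mbox:
        cmd.append("--mbox")
    cmd.append(path)
    return cmd


# Hypothetical paths, for illustration only.
for kind, path in (("spam", "/var/mail/training/spam.mbox"),
                   ("ham", "/var/mail/training/ham.mbox")):
    print(shlex.join(sa_learn_command(kind, path, mbox=True)))
```

The returned list can be handed to `subprocess.run` as-is. Remember to run it as the same user spamd uses for its Bayes database (postdrop in Tom's setup), otherwise the training lands in a different database.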
RE: Bayes Problems
[clipped for brevity]...

The source of your problem is indicated by:

> spamd[22065]: debug: bayes: Not available for scanning, only 35 ham(s) in Bayes DB < 200

To use Bayes with SA, you need a minimum of 200 ham and 200 spam messages learned into the db. Until both counts are reached, Bayes is not used for scanning.

Hope this helps.

-Joe K.
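
[Editor's sketch] The debug line Joe quotes can be checked mechanically. The snippet below pulls the learned count and the required minimum out of such a line. The function name is hypothetical, and the assumption that a matching "spam(s)" variant of the message exists is mine; only the "ham(s)" form appears in this thread.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "only 35 ham(s) in Bayes DB < 200".
# The "spam" alternative is an assumption; the thread only shows "ham".
SHORTFALL = re.compile(r"only (\d+) (ham|spam)\(s\) in Bayes DB < (\d+)")


def bayes_shortfall(log_line: str) -> Optional[Tuple[str, int, int]]:
    """Return (kind, learned, required) or None if the line does not match."""
    match = SHORTFALL.search(log_line)
    if match is None:
        return None
    return match.group(2), int(match.group(1)), int(match.group(3))


line = ("spamd[22065]: debug: bayes: Not available for scanning, "
        "only 35 ham(s) in Bayes DB < 200")
kind, learned, required = bayes_shortfall(line)
print(f"{required - learned} more {kind} messages needed")
# prints "165 more ham messages needed"
```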