Re: Bayes problems and German Spam
On Monday 16 May 2005 12:15, Ronan McGlue typed:
> I too have all net tests enabled and have started from a fresh clean new
> database Friday, and already I'm seeing the German spams hit BAYES_00...
> I don't want to switch autolearning off because, well, I find it incredibly
> useful. I have spam/ham thresholds at 10/0 respectively and all appears
> well aside from the German bunch of spams...

The prolocation rulesets based on subject seem to be working quite well here. They're slowly dragging Bayes up - currently at _44 after ~50 spams that scored nice and high. At least one server had worked out on its own that the spams should be _99.
Re: Bayes problems and German Spam
Simon Byrnand wrote:
> At 09:53 16/05/2005, Jo wrote:
> > Simon Byrnand wrote:
> > > Hi All,
> > >
> > > After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
> > >
> > > After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
> > >
> > > Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY?
> > >
> > > If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
> > >
> > > A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
> > >
> > > It seems that Bayes is still exploitable... :(
> > >
> > > Any suggestions?
> > >
> > > Regards,
> > > Simon
> >
> > Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.
>
> Hmm, not very practical when the system has several thousand users/mailboxes. There is no way I would be able to keep current with manual learning just based on my own personal mailbox... (and I can hardly go poking around in other people's mailboxes to gather ham/spam to learn)
>
> > If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.
>
> Are there even any negative scores in 3.0.3? I thought negative scores were pretty much eliminated in recent versions, so with -0.2 it would never learn any ham.
>
> > Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.
>
> Already have all network tests enabled, always have done.
>
> Regards,
> Simon

I too have all net tests enabled and have started from a fresh clean new database Friday, and already I'm seeing the German spams hit BAYES_00...

I don't want to switch autolearning off because, well, I find it incredibly useful. I have spam/ham thresholds at 10/0 respectively and all appears well aside from the German bunch of spams...

Don't know what else I can do...

*clutches at straws*

Is there a way to tie in a positive net test, say multi.surbl.org, to sway the Bayes? Generally, if the SURBL reports spam you can guarantee that all the other rules are surplus to requirements... IMHO

ronan

--
Regards
Ronan McGlue
Info. Services
QUB
Re: Bayes problems and German Spam
At 09:53 16/05/2005, Jo wrote:
> Simon Byrnand wrote:
> > Hi All,
> >
> > After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
> >
> > After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
> >
> > Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY?
> >
> > If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
> >
> > A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
> >
> > It seems that Bayes is still exploitable... :(
> >
> > Any suggestions?
> >
> > Regards,
> > Simon
>
> Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.

Hmm, not very practical when the system has several thousand users/mailboxes. There is no way I would be able to keep current with manual learning just based on my own personal mailbox... (and I can hardly go poking around in other people's mailboxes to gather ham/spam to learn)

> If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.

Are there even any negative scores in 3.0.3? I thought negative scores were pretty much eliminated in recent versions, so with -0.2 it would never learn any ham.

> Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.

Already have all network tests enabled, always have done.

Regards,
Simon
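The threshold exchange above (Jo's suggested -0.2 for ham and 10 for spam, and Simon's worry that a negative ham threshold would never fire) can be illustrated with a small Python sketch. It is not SpamAssassin's code: the function name `autolearn_decision`, the sample scores, and the exact boundary comparisons are assumptions made for illustration, and the real autolearner applies further conditions that are not modelled here.

```python
from typing import Optional


def autolearn_decision(score: float,
                       ham_threshold: float,
                       spam_threshold: float) -> Optional[str]:
    """Return "ham", "spam", or None (message is not learnt).

    Simplified model: only the two thresholds are considered.
    The strict/non-strict comparisons are an assumption.
    """
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return None


# Hypothetical message scores, none of them negative.
sample_scores = [0.0, 0.5, 4.2, 12.0, 18.5]

# Thresholds suggested in the thread: -0.2 for ham, 10 for spam.
for score in sample_scores:
    decision = autolearn_decision(score, ham_threshold=-0.2, spam_threshold=10.0)
    print(f"score={score:5.1f} -> {decision}")
```

If no rule can push a message's score below zero, the -0.2 ham threshold in this sketch never matches, which is the concern Simon raises.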
Re: Bayes problems and German Spam
Simon Byrnand wrote:
> Hi All,
>
> After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)
>
> After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(
>
> Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY?
>
> If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(
>
> A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.
>
> It seems that Bayes is still exploitable... :(
>
> Any suggestions?
>
> Regards,
> Simon

Clear your bayes database and start all over again. Switch off auto-learning and rely purely on manual learning in a feedback loop. Grab a mailbox of known ham and another folder of known spam. Preferably use a thousand of each.

If you ever switch on autolearning again, set the threshold at -0.2 for ham and 10 or 15 for spam.

Enable network tests; razor2, pyzor and dcc work wonders on the site I administer.

Good luck,
Jo
Bayes problems and German Spam
Hi All,

After going from 2.64 to 3.0.3 I thought Bayes was working much better - previously certain classes of spam were being consistently reported as ham, scoring BAYES_00 no matter what I did, or how much manual training I did. (Autolearning enabled)

After upgrading to 3.0.3 and clearing the Bayes database everything seemed fine for a week or so, now it's back to its old habits :(

Particularly frustrating is the complete inability of sa-learn to correct the thinking of Bayes - all the recent flood of German spams are scoring BAYES_00, and DESPITE the fact that I have manually learnt well over two dozen of these as spam (which includes all the variations of them I've seen so far) new copies of identical spams STILL score BAYES_00. WHY?

If the autolearn system can't be overridden with some manual learning, it makes it more or less useless :(

A few other spams that were previously getting BAYES_99 are now down to BAYES_00 for no apparent reason. It's highly unlikely that they were autolearnt as ham, as they hit several other tests too.

It seems that Bayes is still exploitable... :(

Any suggestions?

Regards,
Simon
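One possible reading of the symptom described above, offered as a guess and not confirmed anywhere in the thread: if the tokens in these messages already have a large ham history in the database, two dozen manually learnt spams may shift their estimates only slightly. The Python sketch below uses a simplified Robinson-style smoothed token estimate. It is not SpamAssassin's actual implementation, and the function name, the smoothing parameters, and all counts are hypothetical.

```python
def token_spam_probability(spam_hits: int, ham_hits: int,
                           n_spam: int, n_ham: int,
                           strength: float = 1.0,
                           prior: float = 0.5) -> float:
    """Smoothed estimate that a token indicates spam.

    spam_hits / ham_hits: learnt spam / ham messages containing the token.
    n_spam / n_ham: total learnt spam / ham messages.
    strength, prior: smoothing parameters (illustrative defaults).
    """
    spam_ratio = spam_hits / n_spam if n_spam else 0.0
    ham_ratio = ham_hits / n_ham if n_ham else 0.0
    total = spam_ratio + ham_ratio
    if total == 0:
        return prior  # token never seen: fall back to the prior
    raw = spam_ratio / total
    seen = spam_hits + ham_hits
    # Pull the raw ratio towards the prior when there is little evidence.
    return (strength * prior + seen * raw) / (strength + seen)


# Hypothetical database: 1000 ham and 1000 spam learnt so far; the token
# appears in 300 of the ham messages and in none of the spam.
before = token_spam_probability(spam_hits=0, ham_hits=300,
                                n_spam=1000, n_ham=1000)

# After manually learning 24 more spams that all contain the token.
after = token_spam_probability(spam_hits=24, ham_hits=300,
                               n_spam=1024, n_ham=1000)

print(f"before manual training: {before:.4f}")
print(f"after manual training:  {after:.4f}")
```

With these made-up counts the estimate rises after training but stays well below 0.5, so a message made up mostly of such tokens would still look like ham to a classifier of this kind.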