Re: BAYES_00 BODY. Negative score?
Hi,

> *-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
> * [score: 0.]
>
> This indicates a mistrained database, which means you have trained too many
> spams or spam-like messages (commercial messages) as ham.
>
> Proper training of spams should help. Just keep your spam (and optionally
> ham) corpora for retraining in case you would drop the database.
>
> I also recommend to abstain from training commercial mail (notices from
> e-shops, companies you done business with etc) as ham, unless they generate
> BAYES_999 score and you want it lower. I often train them as spam so those
> give uncertain BAYES_50 result.

Is there any way to distinguish a legitimate newsletter from a spam newsletter? In other words, if I train emails from Forbes or Washington Post as ham, then train similar newsletter emails from other, more suspect providers as spam, will Bayes still be able to recognize Forbes and WP as ham?

The problem is that if I avoid training newsletters or bulk email altogether, then I'm also left with spam newsletters still only hitting BAYES_50.

I'm actually in a situation now where Forbes and WP newsletters are being marked as spam, so I'm considering retraining, but I'm wondering what approach and best practices I should be following.

# sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0      97002          0  non-token data: nspam
0.000          0      90173          0  non-token data: nham
0.000          0   11581565          0  non-token data: ntokens
0.000          0 1054224948          0  non-token data: oldest atime
0.000          0 1676433889          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0 1648164856          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count
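For anyone who wants to read the `--dump magic` figures above more easily, here is a minimal Python sketch that parses the pasted output. It assumes that the third numeric column holds the value for each "non-token data" row and that the atime values are Unix epoch seconds; the function names are made up for illustration and are not part of SpamAssassin.

```python
from datetime import datetime, timezone

# Output pasted from the message above (whitespace-separated columns).
DUMP = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 97002 0 non-token data: nspam
0.000 0 90173 0 non-token data: nham
0.000 0 11581565 0 non-token data: ntokens
0.000 0 1054224948 0 non-token data: oldest atime
0.000 0 1676433889 0 non-token data: newest atime
0.000 0 0 0 non-token data: last journal sync atime
0.000 0 1648164856 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
"""


def parse_dump_magic(text):
    """Map each 'non-token data' label to its integer value.

    Assumption: the value sits in the third whitespace-separated column.
    """
    values = {}
    for line in text.splitlines():
        head, sep, label = line.partition("non-token data:")
        if not sep:
            continue  # skip anything that is not a magic row
        columns = head.split()
        values[label.strip()] = int(columns[2])
    return values


def as_utc_date(epoch_seconds):
    """Render an epoch timestamp (assumed to be seconds) as a UTC date."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.date().isoformat()


stats = parse_dump_magic(DUMP)
print("spam/ham ratio:", round(stats["nspam"] / stats["nham"], 2))
print("oldest atime:", as_utc_date(stats["oldest atime"]))
print("newest atime:", as_utc_date(stats["newest atime"]))
```

Under those assumptions the database holds roughly equal numbers of learned spam and ham messages, and the oldest token access time is many years older than the newest one.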
Re: BAYES_00 BODY. Negative score?
Please let this sit for a while. I've discovered a fundamental issue with my scheme of feeding messages to BAYES. Unfortunately I was apparently remiss in setting up logging for some bits, so I have no idea how long this has been failing.

Sorry for the clutter.

joe a.

On 2/14/2023 5:37 PM, joe a wrote:
> On 2/14/2023 2:56 AM, Matus UHLAR - fantomas wrote:
>> On 13.02.23 17:42, joe a wrote:
>>> Have some annoying SPAM that consistently shows a negative score on
>>> BAYES. Is this the default scoring, or influenced by BAYES in some way?
>>>
>>> *-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
>>> * [score: 0.]
>>
>> This indicates a mistrained database, which means you have trained too
>> many spams or spam-like messages (commercial messages) as ham.
>>
>> Proper training of spams should help. Just keep your spam (and optionally
>> ham) corpora for retraining in case you would drop the database.
>>
>> I also recommend to abstain from training commercial mail (notices from
>> e-shops, companies you done business with etc) as ham, unless they
>> generate BAYES_999 score and you want it lower. I often train them as
>> spam so those give uncertain BAYES_50 result.
>>
>> Those mails resemble spam too much to be used for training.
>
> All,
>
> The term "proper training" has always seemed a bit problematic to me.
>
> That aside, I am experiencing an error when attempting:
>
> sa-learn -D --spam /var/mail/spamd/Cabinet.saved-spam
>
> The last line shows:
>
> ***
> Learned tokens from 0 message(s) (1 message(s) examined)
> ERROR: the Bayes learn function returned an error, please re-run with -D
> for more information at /usr/bin/sa-learn line 500.
> ***
>
> Which may be permissions related.
>
> However, there seem to be some errors/warnings at the beginning, starting
> with:
>
> ***
> Feb 14 17:26:14.956 [2855] dbg: plugin: loading Mail::SpamAssassin::Plugin::Razor2 from @INC
> Feb 14 17:26:14.959 [2855] dbg: razor2: razor2 is not available
> Feb 14 17:26:14.959 [2855] dbg: plugin: loading Mail::SpamAssassin::Plugin::SpamCop from @INC
> plugin: failed to parse plugin (from @INC): Can't locate Mail/SpamAssassin/Plugin/SpamCop.pm: lib/Mail/SpamAssassin/Plugin/SpamCop.pm: Permission denied at (eval 44) line 1.
> ***
>
> While this also suggests a permissions issue, the only place I find
> SpamCop.pm (even as root) is at
> "/usr/lib/perl5/vendor_perl/5.26.1/Mail/SpamAssassin/Plugin/SpamCop.pm",
> which is not in the path sa-learn concocted when invoked.
>
> Sorry if the formatting is weird or if this is useless information.
Re: BAYES_00 BODY. Negative score?
On 2/14/2023 2:56 AM, Matus UHLAR - fantomas wrote:
> On 13.02.23 17:42, joe a wrote:
>> Have some annoying SPAM that consistently shows a negative score on
>> BAYES. Is this the default scoring, or influenced by BAYES in some way?
>>
>> *-1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1%
>> * [score: 0.]
>
> This indicates a mistrained database, which means you have trained too
> many spams or spam-like messages (commercial messages) as ham.
>
> Proper training of spams should help. Just keep your spam (and optionally
> ham) corpora for retraining in case you would drop the database.
>
> I also recommend to abstain from training commercial mail (notices from
> e-shops, companies you done business with etc) as ham, unless they
> generate BAYES_999 score and you want it lower. I often train them as
> spam so those give uncertain BAYES_50 result.
>
> Those mails resemble spam too much to be used for training.

All,

The term "proper training" has always seemed a bit problematic to me.

That aside, I am experiencing an error when attempting:

sa-learn -D --spam /var/mail/spamd/Cabinet.saved-spam

The last line shows:

***
Learned tokens from 0 message(s) (1 message(s) examined)
ERROR: the Bayes learn function returned an error, please re-run with -D for more information at /usr/bin/sa-learn line 500.
***

Which may be permissions related.

However, there seem to be some errors/warnings at the beginning, starting with:

***
Feb 14 17:26:14.956 [2855] dbg: plugin: loading Mail::SpamAssassin::Plugin::Razor2 from @INC
Feb 14 17:26:14.959 [2855] dbg: razor2: razor2 is not available
Feb 14 17:26:14.959 [2855] dbg: plugin: loading Mail::SpamAssassin::Plugin::SpamCop from @INC
plugin: failed to parse plugin (from @INC): Can't locate Mail/SpamAssassin/Plugin/SpamCop.pm: lib/Mail/SpamAssassin/Plugin/SpamCop.pm: Permission denied at (eval 44) line 1.
***

While this also suggests a permissions issue, the only place I find SpamCop.pm (even as root) is at "/usr/lib/perl5/vendor_perl/5.26.1/Mail/SpamAssassin/Plugin/SpamCop.pm", which is not in the path sa-learn concocted when invoked.

Sorry if the formatting is weird or if this is useless information.
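One way to narrow down a "Permission denied" like the one above is to walk the path to the module and report the first component the current user cannot traverse or read. The following is only a rough Python sketch of that idea; the helper name `first_blocked_component` is invented here, it would need to be run as the same user that runs sa-learn, and it says nothing about how Perl actually resolves the relative `lib/` entry in @INC.

```python
import os


def first_blocked_component(path):
    """Return the first path component that is missing, not traversable
    (directories) or not readable (files) for the current user, else None."""
    path = os.path.abspath(path)

    # Collect the path and all of its ancestors, deepest first.
    parts = []
    current = path
    while True:
        parts.append(current)
        parent = os.path.dirname(current)
        if parent == current:
            break
        current = parent

    # Check from the filesystem root down to the target itself.
    for part in reversed(parts):
        if not os.path.exists(part):
            return part
        needed = os.X_OK if os.path.isdir(part) else os.R_OK
        if not os.access(part, needed):
            return part
    return None


if __name__ == "__main__":
    # Path taken from the message above.
    target = "/usr/lib/perl5/vendor_perl/5.26.1/Mail/SpamAssassin/Plugin/SpamCop.pm"
    print(first_blocked_component(target))
```

If this prints `None` for the vendor_perl path, the file itself is reachable for that user, and the relative `lib/...` path in the error (resolved against whatever the working directory was) becomes the more likely suspect.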
Re: Seeing big (>1MB) spam
> I started seeing some spam today in the 1-1.5 MB range.

It's been over a year now, but for a while I was getting a huge number of spams that were either 1143 KB or 3831 KB. The 3831 KB variant used the same obfuscation payload as the 1143 KB spams; they just put it in twice in a row.

Loren
Seeing big (>1MB) spam
I started seeing some spam today in the 1-1.5 MB range. I was surprised to see obvious spam in my Inbox, but discovered it had no SA headers. It turned out that my procmailrc rule was only scanning messages smaller than 700k. I boosted it to 2MB:

    :0fw
    * < 200
    | /usr/bin/spamc -s 200
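To illustrate why those messages slipped through, here is a small Python sketch of the size gate. It mirrors a procmail `* < N` condition, which matches only messages strictly smaller than N bytes. The byte values are an assumed reading of "700k" and "2MB", not the figures from the actual recipe.

```python
# Assumed byte values for the limits mentioned above.
OLD_LIMIT = 700 * 1024        # "700k"
NEW_LIMIT = 2 * 1024 * 1024   # "2MB"


def is_scanned(size_bytes, limit_bytes):
    """Mirror a procmail '* < N' size condition: only messages strictly
    smaller than the limit are piped to the scanner."""
    return size_bytes < limit_bytes


# Message sizes at both ends of the 1-1.5 MB range reported above.
for size in (1_000_000, 1_500_000):
    print(size, "old limit:", is_scanned(size, OLD_LIMIT),
          "new limit:", is_scanned(size, NEW_LIMIT))
```

Note that the recipe has two limits: the procmail size condition and the `-s` maximum size passed to spamc. Both have to be raised, since spamc passes through unscanned any message larger than its own limit.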
[Off-Topic] Blog from KAM on Cybersecurity and Looking for Hecklers for my workshop at InboxExpo
Thanks to Inbox Expo for publishing my 2 Secrets to Streamline Cybersecurity Projects. You can read it at https://inboxexpo.com/2-secrets-from-from-kam/ with no registration or silliness required!

I will also be presenting the keynote and a workshop for InboxExpo.com on February 27th. While the onsite venue is full, free virtual tickets are available thanks to Dotdigital. Register today at https://lnkd.in/gATaQaGX.

My workshop will be a facilitated discussion on deliverability, SEO, spam, marketing, branding, etc.

If you are interested in more content from me and want to learn more about CRM, emails, marketing, email security, and using Google Cloud & AI, I will be working with emailexpert.org to give free classes as part of the 2023 membership drive running now. Join today!

Regards,
KAM
--
Kevin A. McGrail
Member, Apache Software Foundation
Chair Emeritus Apache SpamAssassin Project
https://www.linkedin.com/in/kmcgrail - 703.798.0171