Re: Really hard-to-filter spam

2023-08-04 Thread Thomas Cameron via users

On 8/4/23 02:15, Sean Greenslade wrote:
> On Wed, Aug 02, 2023 at 04:17:22PM -0500, Thomas Cameron via users wrote:
>> On 8/2/23 15:52, David B Funk wrote:
>>
>> I have the users move spam to an imap folder, and then run (via the user's
>> cron job):
>>
>> sa-learn --mbox --spam /home/[username]/mail/spam
>>
>> If something is flagged as spam and it's not supposed to be, I have them
>> copy it to the ham folder and I run (also via cron job):
>>
>> sa-learn --mbox --ham /home/[username]/mail/spam
>
> Hopefully this is just a typo in your email, but the above line trains
> your spam folder as if it's ham. That could easily cause your screwed-up
> bayes scores.
>
> --Sean


It was a typo, sorry. I have a cron job that uses --spam against the 
spam folder, and --ham against the ham folder. I just copied and pasted 
poorly. This is the actual script for my account:


[thomas.cameron@mail-east ~]$ cat bin/spamcheck
#!/bin/bash
sa-learn --progress --spam --mbox /home/thomas.cameron/mail/INBOX/spam
sa-learn --progress --ham --mbox /home/thomas.cameron/mail/INBOX/ham
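
A slightly more defensive variant of the script above could refuse to run when the class and the folder name disagree, which is the kind of mix-up Sean spotted. This is only a sketch: learn_mbox and the SA_LEARN variable are made-up names for illustration, and the check assumes the mbox files are literally named "spam" and "ham" as in my layout.

```shell
#!/bin/bash
# Assumption: SA_LEARN names the sa-learn binary; it is overridable for dry runs.
SA_LEARN="${SA_LEARN:-sa-learn}"

# learn_mbox CLASS MBOX  (hypothetical helper, not part of SpamAssassin)
# Trains MBOX as CLASS ("spam" or "ham"), but refuses when the file's base
# name is the opposite class, e.g. training .../spam with --ham.
learn_mbox() {
    local class="$1" mbox="$2"
    local base other

    case "$class" in
        spam) other=ham ;;
        ham)  other=spam ;;
        *)    echo "class must be 'spam' or 'ham'" >&2; return 2 ;;
    esac

    base="$(basename "$mbox")"
    if [ "$base" = "$other" ]; then
        echo "refusing to train '$mbox' as $class" >&2
        return 1
    fi

    if [ ! -r "$mbox" ]; then
        echo "cannot read '$mbox'" >&2
        return 3
    fi

    "$SA_LEARN" --progress "--$class" --mbox "$mbox"
}

# Example calls, using the paths from the script above:
#   learn_mbox spam /home/thomas.cameron/mail/INBOX/spam
#   learn_mbox ham  /home/thomas.cameron/mail/INBOX/ham
```

With this in place, a pasted-wrong line such as "learn_mbox ham .../INBOX/spam" would print an error and return non-zero instead of quietly training spam as ham.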

Bayes tests for other messages, like the one you sent me, look like this:

--
Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
mail-east.camerontech.com
X-Spam-Level:
X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI,SPF_HELO_NONE,
SPF_PASS,T_SCC_BODY_TEXT_LINE shortcircuit=no autolearn=ham
autolearn_force=no version=3.4.6
--

But messages flagged as spam look like this:

--
Return-Path: 
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
mail-east.camerontech.com
X-Spam-Flag: YES
X-Spam-Level: 
X-Spam-Status: Yes, score=36.8 required=5.0 tests=BAYES_99,BAYES_999,
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FROM_FMBLA_NEWDOM,
FROM_SUSPICIOUS_NTLD,FROM_SUSPICIOUS_NTLD_FP,HTML_IMAGE_ONLY_32,
HTML_MESSAGE,PDS_OTHER_BAD_TLD,RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,
RCVD_IN_DNSWL_HI,RDNS_NONE,SH_HELO_DBL,SH_HELO_ZRD_FRESH,
SH_ZRD_HEADERS_FRESH,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,
URIBL_ABUSE_SURBL,URIBL_BLACK,URIBL_ZRD shortcircuit=no autolearn=spam
autolearn_force=no version=3.4.6
--
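
To compare the Bayes verdicts across many messages, something like the following could pull the BAYES_* rule names out of a message's X-Spam-Status header. It is only a sketch: bayes_rules is a made-up helper name, and it assumes the continuation lines of the folded header begin with a space or tab, as they normally do in the raw message.

```shell
#!/bin/bash
# bayes_rules (hypothetical helper): read one mail message on stdin and print
# each BAYES_* rule listed in its X-Spam-Status header, one per line.
bayes_rules() {
    awk '
        /^$/ { exit }                              # blank line ends the headers
        /^X-Spam-Status:/ { in_hdr = 1; buf = $0; next }
        in_hdr && /^[ \t]/ { buf = buf $0; next }  # folded continuation line
        in_hdr { in_hdr = 0 }                      # any other header ends it
        END {
            # Split on anything that cannot be part of a rule name.
            n = split(buf, tok, /[^A-Za-z0-9_]+/)
            for (i = 1; i <= n; i++)
                if (tok[i] ~ /^BAYES_[0-9]+$/) print tok[i]
        }
    '
}

# Example call (message.eml is a placeholder file name):
#   bayes_rules < message.eml
```

Run against the two samples above, this would be expected to report BAYES_00 for the first message and BAYES_99 plus BAYES_999 for the second.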

The previous email I copied headers from as an example was just a bad 
example. Usually Bayes is /pretty/ accurate on my system. I only used 
that one because it was a message which made it through SpamAssassin. I 
was trying to demonstrate that the checks were not failing, as suggested 
in an earlier comment.


Thanks for catching that, though. I have made silly mistakes like that 
so I appreciate you checking me.


--
Thomas


Re: Really hard-to-filter spam

2023-08-04 Thread Sean Greenslade
On Wed, Aug 02, 2023 at 04:17:22PM -0500, Thomas Cameron via users wrote:
> On 8/2/23 15:52, David B Funk wrote:
>
> 
>
> I have the users move spam to an imap folder, and then run (via the user's
> cron job):
> 
> sa-learn --mbox --spam /home/[username]/mail/spam
> 
> If something is flagged as spam and it's not supposed to be, I have them
> copy it to the ham folder and I run (also via cron job):
> 
> sa-learn --mbox --ham /home/[username]/mail/spam

  
Hopefully this is just a typo in your email, but the above line trains
your spam folder as if it's ham. That could easily cause your screwed-up
bayes scores.

--Sean