More Detailed info in SUMMARY report
I've got spamassassin set up to report safe on suspected spam. I like the summary report, but I *really* like it when the report identifies the specific rule trigger. How can I, for example, have the summary report show:

  1.0 RELAYCOUNTRY_AWAY Sent through a non-US/CA server [here: XX]

where XX is the 2-letter code identified by the test? I already added a header with the information, but it sure would be nice to have this in the report.
Re: spamassassin and *compressed* Maildir
I have a mail folder that I put false negatives in (i.e., spam which ends up in my inbox) and another for false positives (ham that ends up in my spam folder). Each night I run sa-learn on each folder (sa-learn will munch on entire Maildirs) and also feed each message to spamassassin -r to report it.

So using zcat or gunzip -c will work for spamassassin -r, but not for sa-learn, unless sa-learn can munch on stdin as well as files.

-CJ

On Fri, May 21, 2021 at 3:28 PM Lucas Rolff wrote:

> You can do `zcat -f` or `gunzip -c -f` and avoid having to have .gz
> extension, that way you can skip the rename step
>
> Best Regards,
> Lucas Rolff
>
> *From:* Clive Jacques
> *Date:* Friday, 21 May 2021 at 21.04
> *To:* "users@spamassassin.apache.org"
> *Subject:* Re: spamassassin and *compressed* Maildir
>
> That's confirmed. sa-learn doesn't like compressed files. I don't know
> if it will dine on compressed files with the correct extension (i.e.,
> .gz). Unfortunately, when using compression with Maildir format, Dovecot
> doesn't seem to like to use extensions. So, I copied the directory to a
> temporary location, decompressed the files and then set sa-learn on them.
> Even getting gunzip to operate on the files was a pain because it only
> wants files with the .gz extension (so I had to rename all 6,000 of them
> first - using a utility like 'rename'). I then did the same thing with
> about 9,000 hams.
>
> There was much good news. Learning proceeded about the same pace, but
> syncing the journal to the database was *much* faster. Maybe the tokens
> were smaller? I verified that it seemed to work with --dump magic.
>
> Then, all by itself, Spamassassin's bayes filtering was instantly much
> better. Stuff that was tripping BAYES_00 was suddenly popping BAYES_99.
>
> Now, I just need to update my nightly learning/reporting script.
>
> Still, a very nice result.
>
> On Fri, May 21, 2021 at 11:30 AM Henrik K wrote:
>
> On Fri, May 21, 2021 at 10:54:54AM -0400, Clive Jacques wrote:
>
> Do spamassassin or sa-learn understand compressed files or compressed
> Maildir?
>
> I believe sa-learn will automatically decompress if the files have .gz or
> .bz2 extension, but yes Maildir files without extension will not work.
>
> Should be easy to detect compressed Maildir files, perhaps file enhancement
> request in bugzilla.
Re: spamassassin and *compressed* Maildir
That's confirmed. sa-learn doesn't like compressed files. I don't know if it will dine on compressed files with the correct extension (i.e., .gz). Unfortunately, when using compression with Maildir format, Dovecot doesn't seem to like to use extensions. So, I copied the directory to a temporary location, decompressed the files and then set sa-learn on them. Even getting gunzip to operate on the files was a pain because it only wants files with the .gz extension (so I had to rename all 6,000 of them first - using a utility like 'rename'). I then did the same thing with about 9,000 hams.

There was much good news. Learning proceeded at about the same pace, but syncing the journal to the database was *much* faster. Maybe the tokens were smaller? I verified that it seemed to work with --dump magic.

Then, all by itself, Spamassassin's bayes filtering was instantly much better. Stuff that was tripping BAYES_00 was suddenly popping BAYES_99.

Now, I just need to update my nightly learning/reporting script.

Still, a very nice result.

On Fri, May 21, 2021 at 11:30 AM Henrik K wrote:

> On Fri, May 21, 2021 at 10:54:54AM -0400, Clive Jacques wrote:
>
> Do spamassassin or sa-learn understand compressed files or compressed
> Maildir?
>
> I believe sa-learn will automatically decompress if the files have .gz or
> .bz2 extension, but yes Maildir files without extension will not work.
>
> Should be easy to detect compressed Maildir files, perhaps file enhancement
> request in bugzilla.
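The copy-then-decompress workaround described in this thread could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not anything SpamAssassin or Dovecot ships: the function names `is_gzip` and `decompress_maildir` are made up here, and the sketch assumes the Maildir files are gzip-compressed, which it detects from the two gzip magic bytes rather than from a file extension, so no rename step is needed.

```python
import gzip
import shutil
from pathlib import Path

# Every gzip stream starts with these two bytes, whatever the file is called.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(path: Path) -> bool:
    """Return True if the file starts with the gzip magic bytes."""
    with path.open("rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def decompress_maildir(src: Path, dest: Path) -> int:
    """Copy every regular file under src into dest, keeping the layout.

    Files that look gzip-compressed are written out decompressed; all
    other files are copied unchanged. dest must not be inside src.
    Returns the number of files that were decompressed.
    """
    decompressed = 0
    for item in sorted(src.rglob("*")):
        if not item.is_file():
            continue
        target = dest / item.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        if is_gzip(item):
            with gzip.open(item, "rb") as fin, target.open("wb") as fout:
                shutil.copyfileobj(fin, fout)
            decompressed += 1
        else:
            shutil.copyfile(item, target)
    return decompressed
```

After something like `decompress_maildir(Path("/path/to/learn-spam"), Path("/tmp/learn-spam-plain"))` (both paths are placeholders), sa-learn could be pointed at the plain copy, and the copy deleted once learning has finished.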
spamassassin and *compressed* Maildir
Do spamassassin or sa-learn understand compressed files or compressed Maildir?

I've been running spamassassin on my Ubuntu mail server for years very successfully. Recently, I've been experiencing a lot of difficulty and I'm trying to figure it out.

Earlier this year we upgraded the server from Trusty Tahr to Xenial (long time coming!) and some other stuff got upgraded as well. We run an IMAP server with Dovecot against a Maildir formatted message store. I noticed the message store was taking a fair amount of space, so I decided to compress it with zlib (gz compression).

Pretty much since the upgrade (and simultaneous switch to compressed Maildir) spamassassin has been doing a much worse job. I upgraded from the distribution version of spamassassin (3.4.2) to the most recent version (3.4.6) but no real joy.

I keep a 'learn spam' folder to put false negatives in (stuff that makes it into my inbox which ought not to), and every night I run sa-learn on it and also spamassassin -r to report it. I started noticing that DCC was complaining on report with "missing message body; fatal error". I ran spamassassin -d -r to see what was happening and noticed that it interacted with DCC using dccproc.

Maybe dccproc doesn't understand compressed mail? Well, if it doesn't then perhaps sa-learn doesn't either. That might explain why my bayes rules don't seem to be working very well despite retraining.

-CJ
Re: Detect Emoticons in Subject
That's fine - I'm not saying all email containing emojis in the subject (or elsewhere) *is* spam - just that it's uncommon and right now, about 90% of the time it is *for me*. I just want to score it as part of the greater constellation of factors (just like DKIM, SPF etc.).

On Thu, May 20, 2021 at 2:48 PM Bill Cole < sausers-20150...@billmail.scconsult.com> wrote:

> People send wanted mail with all sorts of weirdness.
Detect Emoticons in Subject
Hi,

I've been using SA a long time. Lately, I'm getting more and more spam with emoticons in the subject line. I'd say about 90% of my emails with emoticons in the subject are spam. I'd like to create a local rule which scores email with emoticons in the subject. I saw a previous discussion on this in the archive, but it was focused on whether such emails were *always* spam. I think an emoticon rule, in combination with other rules, will help my installation.

I've tried to match as follows, but it won't lint. I'm not really a perl programmer. I've written several other more conventional local rules, but here I'm a bit out of my depth. I'd appreciate some guidance.

  # Local Rule for Emoticons in subject
  subject   EMOTICON_IN_SUBJECT Subject =~ /\p{Emoticons}/
  score     EMOTICON_IN_SUBJECT 3.0
  describe  EMOTICON_IN_SUBJECT Subject Line Has Emoticons

-CJ
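One way to see which sample subjects a rule like this would be expected to hit is to test them outside SpamAssassin first. The Python sketch below is an illustration only, not a SpamAssassin rule. The function names `decode_subject` and `has_emoticon` are made up here. It decodes an RFC 2047 encoded Subject value and checks for code points in the Unicode Emoticons block (U+1F600 to U+1F64F). Many emoji live outside that block, so a rule limited to it would miss them.

```python
from email.header import decode_header, make_header

# Bounds of the Unicode "Emoticons" block.
EMOTICONS_FIRST = 0x1F600
EMOTICONS_LAST = 0x1F64F


def decode_subject(raw_subject: str) -> str:
    """Decode RFC 2047 encoded words (=?UTF-8?B?...?=) into plain text."""
    return str(make_header(decode_header(raw_subject)))


def has_emoticon(raw_subject: str) -> bool:
    """Return True if the decoded subject has a character in the Emoticons block."""
    subject = decode_subject(raw_subject)
    return any(EMOTICONS_FIRST <= ord(ch) <= EMOTICONS_LAST for ch in subject)
```

For example, `has_emoticon("=?UTF-8?Q?=F0=9F=98=80_Win_now?=")` decodes the subject to a grinning face followed by "Win now" and reports a match, while a plain ASCII subject does not match.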