I have one mail folder for false negatives (i.e., spam that ends up in
my inbox) and another for false positives (ham that ends up in my spam
folder).  Each night I run sa-learn on each folder (sa-learn will munch
on entire Maildirs) and also feed each message to spamassassin -r to
report it.  So using zcat or gunzip -c will work for spamassassin -r,
which gets one message at a time, but not for sa-learn, which I point
at a whole directory.

Unless sa-learn can munch on stdin as well as files....
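
In the meantime, here is roughly what I have in mind for the nightly
script: a minimal sketch in Python, not tested against a real Dovecot
store.  It assumes the messages live in the usual cur/ and new/
subdirectories, that compression is gzip or bzip2 (detected from the
first bytes of each file, since there is no extension to go by), and
that sa-learn is on the PATH.  The function names are my own invention.

```python
import bz2
import gzip
import shutil
import subprocess
import tempfile
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip stream
BZIP2_MAGIC = b"BZh"       # first three bytes of any bzip2 stream


def open_message(path):
    """Open a message for reading, decompressing by content, not by name."""
    with open(path, "rb") as probe:
        head = probe.read(3)
    if head.startswith(GZIP_MAGIC):
        return gzip.open(path, "rb")
    if head.startswith(BZIP2_MAGIC):
        return bz2.open(path, "rb")
    return open(path, "rb")  # assume it is already plain text


def unpack_maildir(maildir, dest):
    """Copy every message under cur/ and new/ into dest, decompressed.

    Returns the number of messages written.
    """
    count = 0
    for sub in ("cur", "new"):
        folder = Path(maildir) / sub
        if not folder.is_dir():
            continue
        for msg in sorted(folder.iterdir()):
            if not msg.is_file():
                continue
            with open_message(msg) as src, open(Path(dest) / msg.name, "wb") as out:
                shutil.copyfileobj(src, out)
            count += 1
    return count


def learn(maildir, kind):
    """Unpack maildir into a temp directory and run sa-learn on it.

    kind is "spam" or "ham"; the temp directory is removed afterwards.
    """
    with tempfile.TemporaryDirectory() as tmp:
        n = unpack_maildir(maildir, tmp)
        if n:
            subprocess.run(["sa-learn", f"--{kind}", tmp], check=True)
    return n
```

The nightly job would then call something like learn("/path/to/spam-folder",
"spam") and learn("/path/to/ham-folder", "ham"), with the paths replaced
by the real folders.  Copying into a temporary directory leaves the
original compressed Maildir untouched, which is what I did by hand.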

-CJ

On Fri, May 21, 2021 at 3:28 PM Lucas Rolff <lu...@lucasrolff.com> wrote:

> You can do `zcat -f` or `gunzip -c -f` and avoid having to have .gz
> extension, that way you can skip the rename step
>
>
>
> Best Regards,
>
> Lucas Rolff
>
>
>
> *From: *Clive Jacques <westriverp...@gmail.com>
> *Date: *Friday, 21 May 2021 at 21.04
> *To: *"users@spamassassin.apache.org" <users@spamassassin.apache.org>
> *Subject: *Re: spamassassin and *compressed* Maildir
>
>
>
> That's confirmed.  sa-learn doesn't like compressed files.  I don't know
> if it will dine on compressed files with the correct extension (i.e.,
> .gz).  Unfortunately, when using compression with Maildir format, Dovecot
> doesn't seem to like to use extensions.  So, I copied the directory to a
> temporary location, decompressed the files and then set sa-learn on them.
> Even getting gunzip to operate on the files was a pain because it only
> wants files with the .gz extension (so I had to rename all 6,000 of them
> first - using a utility like 'rename').  I then did the same thing with
> about 9,000 hams.
>
>
>
> There was much good news.  Learning proceeded about the same pace, but
> syncing the journal to the database was *much* faster.  Maybe the tokens
> were smaller?  I verified that it seemed to work with --dump magic.
>
>
>
> Then, all by itself, Spamassassin's bayes filtering was instantly much
> better.  Stuff that was tripping BAYES_00 was suddenly popping BAYES_99.
>
>
>
> Now, I just need to update my nightly learning/reporting script.
>
>
>
> Still, a very nice result.
>
>
>
> On Fri, May 21, 2021 at 11:30 AM Henrik K <h...@hege.li> wrote:
>
> On Fri, May 21, 2021 at 10:54:54AM -0400, Clive Jacques wrote:
> > Do spamassassin or sa-learn understand compressed files or compressed
> > Maildir?
>
> I believe sa-learn will automatically decompress if the files have .gz or
> .bz2 extension, but yes Maildir files without extension will not work.
>
> Should be easy to detect compressed Maildir files, perhaps file an
> enhancement request in Bugzilla.
>
>
