On 13-05-16 18:29, Reindl Harald wrote:
>
> On 13.05.2016 at 18:11, John Hardin wrote:
>> On Fri, 13 May 2016, Reindl Harald wrote:
>>
>>> the problem is blowing out such rules with such scores at all with a
>>> non working auto-QA (non-working in: no correction for days as well as
>>> dangerous scoring of new rules from the start)
>>>
>>> 02-Mai-2016 00:12:34: SpamAssassin: No update available
>>> 03-Mai-2016 01:55:05: SpamAssassin: No update available
>>> 04-Mai-2016 00:43:33: SpamAssassin: No update available
>>> 05-Mai-2016 01:48:15: SpamAssassin: Update processed successfully
>>> 06-Mai-2016 00:53:17: SpamAssassin: No update available
>>> 07-Mai-2016 01:21:23: SpamAssassin: No update available
>>> 08-Mai-2016 01:38:23: SpamAssassin: No update available
>>> 09-Mai-2016 00:02:56: SpamAssassin: No update available
>>> 10-Mai-2016 01:10:29: SpamAssassin: No update available
>>> 11-Mai-2016 00:55:46: SpamAssassin: No update available
>>> 12-Mai-2016 00:21:17: SpamAssassin: Update processed successfully
>>> 13-Mai-2016 00:33:31: SpamAssassin: No update available
>>
>> Perhaps you could help with that by participating in masscheck. You seem
>> to get a lot of FPs on base rules; contributing masscheck results on
>> your ham would reduce those
>
> i can't rsync customer mails to a 3rd party

That is not necessary for masscheck.

>
> if that would be based on some webervice where you just feed local
> samples and only give the rules which hitted and spam/ham flag out it
> would be somehow possible

The process is clearly documented on the wiki:
https://wiki.apache.org/spamassassin/MassCheck

>
> especially you would not have much from the bayes-samples because they
> would trigger all sort of wrong rules after strip most headers and and a
> generic received header (which seems to be needed by the bayes-engine
> for whatever reason since it otherwise scores samples completly different)

This is an assumption: you can't know what your data would contribute to
the masscheck process.

>
> in any case: such a rule with 3.7 must not happen at all, even if it has
> no such bad impact - 3.7 is very high and only deserved when you are
> certain that a mail is spam which is *not* backed by a single header,
> deep inspection or not
>

That is true, but I think you should put your money where your mouth is:
just run the masscheck on your corpus and send the results to the devs
for inspection. If it's not working, you lost nothing. If the data *is*
useful, we all win from your work by getting better scores.

Just my 2 cents.

Regards,
Tom
