On 13-05-16 18:29, Reindl Harald wrote:
> 
> On 13.05.2016 at 18:11, John Hardin wrote:
>> On Fri, 13 May 2016, Reindl Harald wrote:
>>
>>> the problem is pushing out such rules with such scores at all with a
>>> non-working auto-QA (non-working as in: no correction for days, as well
>>> as dangerous scoring of new rules from the start)
>>>
>>> 02-Mai-2016 00:12:34: SpamAssassin: No update available
>>> 03-Mai-2016 01:55:05: SpamAssassin: No update available
>>> 04-Mai-2016 00:43:33: SpamAssassin: No update available
>>> 05-Mai-2016 01:48:15: SpamAssassin: Update processed successfully
>>> 06-Mai-2016 00:53:17: SpamAssassin: No update available
>>> 07-Mai-2016 01:21:23: SpamAssassin: No update available
>>> 08-Mai-2016 01:38:23: SpamAssassin: No update available
>>> 09-Mai-2016 00:02:56: SpamAssassin: No update available
>>> 10-Mai-2016 01:10:29: SpamAssassin: No update available
>>> 11-Mai-2016 00:55:46: SpamAssassin: No update available
>>> 12-Mai-2016 00:21:17: SpamAssassin: Update processed successfully
>>> 13-Mai-2016 00:33:31: SpamAssassin: No update available
>>
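
For reference, the log quoted above can be tallied with a few lines of
Python. This is only a minimal sketch: it assumes the exact line format
shown in the quote, maps only the month abbreviation that occurs there
("Mai"), and all function names are made up for illustration.

```python
import re
from datetime import date

# Month abbreviations as they appear in the quoted log; only "Mai" occurs there.
MONTHS = {"Mai": 5}

# Matches e.g. "05-Mai-2016 01:48:15: SpamAssassin: Update processed successfully"
LINE_RE = re.compile(
    r"(\d{2})-(\w+)-(\d{4}) \d{2}:\d{2}:\d{2}: SpamAssassin: (.+)"
)

LOG = """\
02-Mai-2016 00:12:34: SpamAssassin: No update available
03-Mai-2016 01:55:05: SpamAssassin: No update available
04-Mai-2016 00:43:33: SpamAssassin: No update available
05-Mai-2016 01:48:15: SpamAssassin: Update processed successfully
06-Mai-2016 00:53:17: SpamAssassin: No update available
07-Mai-2016 01:21:23: SpamAssassin: No update available
08-Mai-2016 01:38:23: SpamAssassin: No update available
09-Mai-2016 00:02:56: SpamAssassin: No update available
10-Mai-2016 01:10:29: SpamAssassin: No update available
11-Mai-2016 00:55:46: SpamAssassin: No update available
12-Mai-2016 00:21:17: SpamAssassin: Update processed successfully
13-Mai-2016 00:33:31: SpamAssassin: No update available
"""


def successful_update_dates(lines):
    """Return the dates on which the log reports a processed update."""
    dates = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # skip anything that is not a log line
        day, month, year, message = m.groups()
        if message.strip() == "Update processed successfully" and month in MONTHS:
            dates.append(date(int(year), MONTHS[month], int(day)))
    return dates


def gaps_in_days(dates):
    """Days between consecutive successful updates."""
    return [(b - a).days for a, b in zip(dates, dates[1:])]


if __name__ == "__main__":
    updates = successful_update_dates(LOG.splitlines())
    print("successful updates:", [d.isoformat() for d in updates])
    print("gaps in days:", gaps_in_days(updates))
```

On the twelve lines quoted, that gives two processed updates (5 and 12
May), seven days apart.
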
>> Perhaps you could help with that by participating in masscheck. You seem
>> to get a lot of FPs on base rules; contributing masscheck results on
>> your ham would reduce those.
> 
> i can't rsync customer mails to a 3rd party

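That is not necessary for masscheck: the check runs locally on your own
corpus, and what you send in are the result logs, not the mails themselves.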
> 
> if that were based on some web service where you just feed local
> samples and only give out the rules which hit and the spam/ham flag, it
> would be somehow possible

The process is clearly documented on the wiki:
https://wiki.apache.org/spamassassin/MassCheck
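
To illustrate the general idea only (this is *not* the mass-check tool
or its log format; see the wiki page for those): a local run can reduce
each message to its ham/spam label and the names of the rules that hit,
so the content never has to leave your machine. Everything below,
including `ScanResult`, `to_shareable` and the rule names, is
hypothetical.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ScanResult:
    """Hypothetical result of scanning one message from a local corpus."""
    path: str                 # where the message lives locally
    is_spam: bool             # hand-classified label (ham or spam folder)
    rules_hit: list = field(default_factory=list)  # names of rules that fired
    body: str = ""            # full message text; must stay local


def to_shareable(result):
    """Keep only the label and the rule names; drop path and body."""
    return {
        "class": "spam" if result.is_spam else "ham",
        "rules": sorted(result.rules_hit),
    }


if __name__ == "__main__":
    local = ScanResult(
        path="/var/corpus/ham/0001.eml",          # made-up path
        is_spam=False,
        rules_hit=["EXAMPLE_RULE_B", "EXAMPLE_RULE_A"],  # made-up rule names
        body="confidential customer text",
    )
    # Only this line would ever be shared.
    print(json.dumps(to_shareable(local)))
```

The point of the sketch is just that the reduction happens on your side,
before anything is sent anywhere.
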
> 
> especially you would not get much from the bayes samples, because they
> would trigger all sorts of wrong rules after stripping most headers and
> adding a generic Received header (which seems to be needed by the bayes
> engine for whatever reason, since it otherwise scores samples completely
> differently)

This is an assumption: you can't know what your data would contribute to
the masscheck process.
> 
> in any case: such a rule with 3.7 must not happen at all, even if it has
> no such bad impact - 3.7 is very high and only deserved when you are
> certain that a mail is spam which is *not* backed by a single header,
> deep inspection or not
> 
That is true, but I think you should put your money where your mouth is:
just run masscheck on your corpus and send the results to the devs
for inspection. If it doesn't help, you have lost nothing. If the data *is*
useful, we all win from your work by getting better scores.

Just my 2 cents.
Regards,
        Tom
