Re: Masscheck statistics

2019-05-16 Thread RW
On Wed, 15 May 2019 18:47:01 +0300 Henrik K wrote: > On Wed, May 15, 2019 at 04:14:30PM +0100, RW wrote: > > > > > I think the concept of scoresets is pointless these days anyway. > > > Does someone actually run legit mailserver without bayes and > > > network tests? > > > > But if you do tha

Re: Masscheck statistics

2019-05-15 Thread Paul Stead
I've already moved over to using my personal Gmail. Sadly I don't have control over my corporate email account and the signature keeps reappearing. On Wed, 15 May 2019 at 21:28, @lbutlr wrote: > This garbage is inappropriate for a mailing list. It is also enforceable > BS (I used to post every em

Re: Masscheck statistics

2019-05-15 Thread @lbutlr
This garbage is inappropriate for a mailing list. It is also enforceable BS (I used to post every email with this kind of garbage to a public website) On 15 May 2019, at 10:55, Paul Stead wrote: > This message is private and confidential. If you have received this message > in error, please noti

Re: Masscheck statistics

2019-05-15 Thread Paul Stead
On 15/05/2019, 16:19, "RW" wrote: That's my point. It leaves little incentive to distinguish between network and non-network runs. Good point... I don't know the QA scripts well enough to be able to comment more. It does look like the net contributions during the week from jarif are

Re: Masscheck statistics

2019-05-15 Thread Henrik K
On Wed, May 15, 2019 at 04:14:30PM +0100, RW wrote: > > > I think the concept of scoresets is pointless these days anyway. Does > > someone actually run legit mailserver without bayes and network tests? > > But if you do that you are running a score set that has been optimized > for only network

Re: Masscheck statistics

2019-05-15 Thread RW
On Wed, 15 May 2019 14:59:18 + Paul Stead wrote: > On 15/05/2019, 15:45, "RW" wrote: > > > > >Network rules are only run every Saturday: > >https://ruleqa.spamassassin.org/20190511-r1859108-n > > Why is that necessary when network results should be reused? Most > of them

Re: Masscheck statistics

2019-05-15 Thread RW
On Wed, 15 May 2019 17:51:54 +0300 Henrik K wrote: > > That's not the point. Without taking account of Bayes, the other rules > > get tuned differently. Bayes has a substantial effect on the score > > of almost everything scanned. > > I think the concept of scoresets is pointless these days anyw

Re: Masscheck statistics

2019-05-15 Thread Paul Stead
On 15/05/2019, 15:45, "RW" wrote: > >Network rules are only run every Saturday: >https://ruleqa.spamassassin.org/20190511-r1859108-n Why is that necessary when network results should be reused? Most of them are meaningless if retested after several days. That's the reaso

Re: Masscheck statistics

2019-05-15 Thread Henrik K
On Wed, May 15, 2019 at 03:45:22PM +0100, RW wrote: > On Wed, 15 May 2019 16:41:00 +0300 > Henrik K wrote: > > > On Wed, May 15, 2019 at 02:15:19PM +0100, RW wrote: > > > > > > Why are there no QA statistics for BAYES_* rules? > > > > How do you propose to generate such statistics, when all co

Re: Masscheck statistics

2019-05-15 Thread RW
On Wed, 15 May 2019 16:41:00 +0300 Henrik K wrote: > On Wed, May 15, 2019 at 02:15:19PM +0100, RW wrote: > > > > Why are there no QA statistics for BAYES_* rules? > > How do you propose to generate such statistics, when all contributors > already are supposed to have fully sorted ham/spam corp

Re: Masscheck statistics

2019-05-15 Thread Paul Stead
I noticed the jarif ruleset contributing net scores during nightlies a few weeks ago - I've asked jarif about this but couldn't see an immediate problem/solution. I've also raised a potential issue on the ruleqa RE some potential problems. On 15/05/2019, 14:16, "RW" wrote: Also why do al

Re: Masscheck statistics

2019-05-15 Thread Paul Stead
On 15/05/2019, 14:41, "Henrik K" wrote: jarif has some flags wrong if doing it every day.. https://lists.apache.org/thread.html/ff734261cb1d8ec9dea9df42f314a60ec20c1919b8bd21c71b38553f@%3Cruleqa.spamassassin.apache.org%3E -- Paul Stead Senior Engineer Zen Internet Direct: 01706 902018

Re: Masscheck statistics

2019-05-15 Thread Henrik K
On Wed, May 15, 2019 at 02:15:19PM +0100, RW wrote: > > Why are there no QA statistics for BAYES_* rules? How do you propose to generate such statistics, when all contributors already are supposed to have fully sorted ham/spam corpuses? Seems kind of redundant as all spam would hit BAYES_99 etc.

Masscheck statistics

2019-05-15 Thread RW
Why are there no QA statistics for BAYES_* rules? Also why do all the network rule statistics come from a single contributor labelled 'jarif'? A corpus with only 484 ham in it. If this is genuinely what is being contributed, how is it possible to generate all four score sets?
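
For readers unfamiliar with the "four score sets" mentioned above: a SpamAssassin score line can carry four values, chosen according to whether Bayes and network tests are enabled (set 0: neither, set 1: network only, set 2: Bayes only, set 3: both). A minimal sketch of that selection, assuming nothing about how the QA scripts themselves work; the function names and the example score values are hypothetical and only illustrate the indexing:

```python
from typing import Sequence


def score_set_index(bayes_enabled: bool, net_enabled: bool) -> int:
    """Return the score set number (0-3) for a given configuration.

    Set 0: no Bayes, no network tests
    Set 1: no Bayes, network tests
    Set 2: Bayes, no network tests
    Set 3: Bayes and network tests
    """
    return (2 if bayes_enabled else 0) + (1 if net_enabled else 0)


def pick_score(scores: Sequence[float], bayes_enabled: bool, net_enabled: bool) -> float:
    """Pick the applicable value from a four-value score list."""
    if len(scores) != 4:
        raise ValueError("expected exactly four scores, one per score set")
    return scores[score_set_index(bayes_enabled, net_enabled)]


# Hypothetical four-value score for some rule (values are made up).
example_scores = [1.0, 0.8, 1.2, 0.9]

# A server running both Bayes and network tests uses the fourth value.
print(pick_score(example_scores, bayes_enabled=True, net_enabled=True))
```

This is why the question matters: sets 1 and 3 can only be tuned from corpora that were mass-checked with network tests on.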