Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread John Hardin
On Sat, 5 Apr 2014, John Hardin wrote: On Sat, 5 Apr 2014, Axb wrote: On 04/05/2014 07:33 PM, John Hardin wrote: > The masscheck spam corpus isn't pathetically small, but at the moment > it's *strongly* biased towards the traffic *you* are seeing. Your spam > is 490k+ of the 510k total

Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread John Hardin
On Sat, 5 Apr 2014, Axb wrote: On 04/05/2014 07:33 PM, John Hardin wrote: The masscheck spam corpus isn't pathetically small, but at the moment it's *strongly* biased towards the traffic *you* are seeing. Your spam is 490k+ of the 510k total corpus. Should I feel guilty for only masscheck

Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread Axb
On 04/05/2014 07:33 PM, John Hardin wrote: The masscheck spam corpus isn't pathetically small, but at the moment it's *strongly* biased towards the traffic *you* are seeing. Your spam is 490k+ of the 510k total corpus. Should I feel guilty for only masschecking the last 21 days? That was on

Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread John Hardin
On Sat, 5 Apr 2014, Axb wrote: On 04/05/2014 06:42 PM, John Hardin wrote: I'd rather not have to resort to hitting the masscheck system over the head with the "tflags publish" cluebat, but I will if it keeps ignoring these rules. this would be very unwise and would create rule bloat as obv

Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread Axb
On 04/05/2014 06:59 PM, Axb wrote: If Darxus sees so much of this type, why isn't he running a masschecker? Oops, sorry. I hadn't seen that he is indeed participating.

Re: Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread Axb
On 04/05/2014 06:42 PM, John Hardin wrote: I'd rather not have to resort to hitting the masscheck system over the head with the "tflags publish" cluebat, but I will if it keeps ignoring these rules. this would be very unwise and would create rule bloat as obviously the corpus isn't seeing much

Perceptron/GA logic w/r/t low-scoring high-S/O rules?

2014-04-05 Thread John Hardin
Could someone who understands the scoring logic used by the perceptron or GA please comment on why this rule (and others like it) are only being scored at 0.01? http://ruleqa.spamassassin.org/20140404-r1584563-n/T_DX_TEXT_02/detail I would think that a rule which hits nothing but spam (S/O 1.0

[Bug 6444] tok_touch_all update forces full table scan, kills performance.

2014-04-05 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6444 --- Comment #27 from peter gervai --- Sorry, it seems only the latest release contains the patch and I didn't find it in the svn right away. 3.4.x contains the patch. I still see a few deadlocks, but not that many: 2014-04-05 11:40:55 C

Rule updates are too old - 2014-04-05

2014-04-05 Thread darxus
20140404: Spam or ham is below threshold of 150,000: http://ruleqa.spamassassin.org/?daterev=20140404 20140404: Spam: 516670, Ham: 144507
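The check reported above can be worked through with the two counts it gives. This is only a minimal sketch: it assumes that both spam and ham must reach the 150,000 threshold, which the wording "Spam or ham is below threshold" implies, and `corpus_shortfalls` is a hypothetical helper, not part of the actual rule-update tooling.

```python
# Hypothetical helper for illustration; not the real rule-update script.
# The 150,000 threshold and both counts are taken from the 20140404 report.
THRESHOLD = 150_000


def corpus_shortfalls(spam, ham, threshold=THRESHOLD):
    """Map each class that is under the threshold to the number of messages it is short."""
    counts = {"spam": spam, "ham": ham}
    return {name: threshold - n for name, n in counts.items() if n < threshold}


# Spam (516670) clears the threshold; ham (144507) is 5493 messages short.
print(corpus_shortfalls(516670, 144507))  # {'ham': 5493}
```

With these numbers the ham corpus alone blocks the update, even though the spam corpus is more than three times the threshold.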

[Bug 6444] tok_touch_all update forces full table scan, kills performance.

2014-04-05 Thread bugzilla-daemon
https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6444 --- Comment #26 from peter gervai --- Well, I was sent here (based on the duplicate at the bottom) and yes, this is still a problem, and an additional problem is that the suggested code to be renamed uses a nonexistent SQL function: 2014-04