I thought a bit more about the --reuse problem. While there are pros and
cons to reuse, I think there is more benefit to running with --reuse than
without it. So I now recommend it for all masschecks.
On Fri, Dec 24, 2010 at 1:58 PM, Warren Togami Jr. wtog...@gmail.com wrote:
This does remind me
On Fri, 24 Dec 2010, Warren Togami Jr. wrote:
Also, "current" refers to the nightly masscheck snapshot of svn trunk,
including the latest rules.
Sorry, I realize now that was unclear. What does "current" in "current
emails" mean? What time window? Since the last masscheck? A week? Six
months?
On 12/25, John Hardin wrote:
Sorry, I realize now that was unclear. What does "current" in
"current emails" mean? What time window? Since the last masscheck? A
week? Six months?
Since the last mass check of that type (network / nightly), yes.
And how do you ensure a sufficiently large corpus
In general, please stop worrying about your corpus being ideal. Our sample
size right now is so small that even non-ideal corpora would be helpful.
Get started with nightly masschecks run from cron, then work on improving
your corpus later.
I personally include:
* The last 4 weeks of spam. I use
I am one of the editors of the dnswl.org database. While it is tempting
to participate in the mass-checks, whether or not that would affect the
dnswl tests, I think it's better not to risk that skew. I like having the
QA test results to independently evaluate dnswl.
I wonder
On Fri, 24 Dec 2010, dar...@chaosreigns.com wrote:
And it still disturbs me that mass checks use anything but the test
results at the time the email is originally scored (like from the
tests value of the X-Spam-Status header). Since I'm sure the time
variance improves the accuracy of things
On 12/24, John Hardin wrote:
If there was some way to capture the score of RBL tests separately
from non-RBL tests and use them in place of the current RBL results
I might agree you have a point; but if the mass checks ignore the
scores that the current ruleset generates against historical
http://www.mail-archive.com/users@spamassassin.apache.org/msg69546.html
Whitelists have almost zero impact on spamassassin's determination of ham vs
spam. Believe me. This is not harmful.
If you have any ham corpus it would be extremely useful to spamassassin. We
have a severe lack of variety
On Fri, 24 Dec 2010, dar...@chaosreigns.com wrote:
On 12/24, John Hardin wrote:
If there was some way to capture the score of RBL tests separately
from non-RBL tests and use them in place of the current RBL results
I might agree you have a point; but if the mass checks ignore the
scores that
I think what he is failing to understand is the scores are irrelevant, as
the masscheck is only determining yes or no for each rule across a corpus.
Also, "current" refers to the nightly masscheck snapshot of svn trunk,
including the latest rules.
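
To illustrate the point above that masscheck only records yes or no for each rule across a corpus, here is a rough Python sketch of that kind of tally. It is not the actual masscheck code. The function names, the ("ham"/"spam", set-of-rule-names) input shape, and the rule names RULE_A and RULE_B are all made up for the example. Note that no score appears anywhere in it.

```python
from collections import Counter


def rule_hit_counts(messages):
    """Tally, per class, how many messages each rule fired on.

    `messages` is an iterable of (label, rules) pairs, where label is
    "ham" or "spam" and rules is the collection of rule names that hit.
    This input shape is an assumption made for the sketch.
    """
    totals = Counter()
    hits = {"ham": Counter(), "spam": Counter()}
    for label, rules in messages:
        totals[label] += 1
        # A rule either hit a message or it did not; scores play no part.
        for rule in set(rules):
            hits[label][rule] += 1
    return totals, hits


def hit_rates(totals, hits):
    """Return the fraction of ham and of spam that each rule hit."""
    rules = set(hits["ham"]) | set(hits["spam"])
    return {
        rule: {
            label: (hits[label][rule] / totals[label]) if totals[label] else 0.0
            for label in ("ham", "spam")
        }
        for rule in sorted(rules)
    }


if __name__ == "__main__":
    # Hypothetical four-message corpus with made-up rule names.
    corpus = [
        ("spam", {"RULE_A", "RULE_B"}),
        ("spam", {"RULE_A"}),
        ("ham", {"RULE_B"}),
        ("ham", set()),
    ]
    totals, hits = rule_hit_counts(corpus)
    for rule, rates in hit_rates(totals, hits).items():
        print(rule, rates)
```

A rule that hits most of the spam and almost none of the ham is a useful one, whatever score it happens to carry at the moment.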
This does remind me however that there is a