Randal, Phil wrote:
Arik Raffael Funke wrote:
Matthias Haegele wrote:
Arik Raffael Funke wrote:
I.e. what about expiring tags, etc. Sa-learn would routinely
re-encounter 5 year-old spam...
Q: Would it be useful (regarding CPU and I/O performance) to learn
only messages (copied from a maildir) that are new (e.g. not older
than a week), or would checking this (the date of the file) be almost
as bad as copying it for sa-learn?
I would have thought that relearning age-old ham & spam would have the
effect of polluting the Bayes database, not enhancing it, because both
ham and spam characteristics change over time.

Thanks, everybody. Opinion on whether this training procedure is counter-productive seems divided. There is quite a lot of anecdotal evidence that it has no negative effects on one side, and "theoretical objections" on the other.

The effect Phil mentioned was actually what prompted me to ask my question. Phrased more clearly, the question is whether previously seen, old messages _really_ pollute the Bayes database. In my opinion this depends on the actual implementation of the learning function in SpamAssassin or, more precisely, on how sa-learn implements the "skipping" of previously seen messages.
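
To make the distinction concrete, here is a minimal sketch of the two possible behaviours. It is a toy model under stated assumptions, not SpamAssassin's code: the class `ToyBayesStore`, the `skip_seen` flag, the SHA-1 digest used as a message key, and the whitespace tokenizer are all hypothetical and chosen only for illustration.

```python
import hashlib


class ToyBayesStore:
    """Toy token-count store with a 'seen' set (hypothetical, not sa-learn)."""

    def __init__(self, skip_seen=True):
        self.skip_seen = skip_seen
        self.seen = set()        # digests of messages already learned
        self.spam_counts = {}    # token -> number of spam messages containing it
        self.ham_counts = {}     # token -> number of ham messages containing it

    def learn(self, message, is_spam):
        """Learn one message; return False if it was skipped as already seen."""
        # Assumption: a message is identified by a digest of its full text.
        digest = hashlib.sha1(message.encode("utf-8")).hexdigest()
        if self.skip_seen and digest in self.seen:
            return False
        self.seen.add(digest)
        counts = self.spam_counts if is_spam else self.ham_counts
        # Count each distinct token once per message.
        for token in set(message.lower().split()):
            counts[token] = counts.get(token, 0) + 1
        return True


old_spam = "cheap pills from 2003"
with_skip = ToyBayesStore(skip_seen=True)
without_skip = ToyBayesStore(skip_seen=False)

# Simulate five training runs over the same old message.
for _ in range(5):
    with_skip.learn(old_spam, is_spam=True)
    without_skip.learn(old_spam, is_spam=True)

print(with_skip.spam_counts["pills"])     # 1
print(without_skip.spam_counts["pills"])  # 5
```

In this toy model, re-feeding an old message only inflates the counts when nothing is skipped. With skipping, repeated runs leave the counts unchanged, although the message's single original contribution stays in the store.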

Is anybody familiar with the inner workings of SpamAssassin and thus able to answer the question on that basis?

Best regards,
Arik
