--As of September 27, 2006 5:43:28 PM -0700, Kelson is alleged to have said:

Daniel T. Staal wrote:
True.  So...  Optimal is obviously to train, once and correctly, on all
messages.  Sending a message through that has been trained will consume
*some* resources, but less than one that still needs to be learned.

So the exact balance is a complicated question.  ;)

I just train on everything.  If it's already learned from a message, it
takes a few resources for it to recognize that, but almost certainly less
time than it would have taken me to separate them out.

--As for the rest, it is mine.

Depends on the setup. For instance, given the explanations above, I'll start a system to automatically learn from my 'checkspam' folder, but not my 'highspam' folder. I have procmail automatically sort my spam by score, so I can pay extra attention to low-scoring spam, which is more likely than high-scoring spam to be misfiled ham.

So, since I *already* have them separated out, I can avoid the double-check. ;)
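
As a rough illustration of that split, here is a minimal Python sketch (not my actual procmail recipe). It assumes the SpamAssassin 3.x style "X-Spam-Status: Yes, score=N ..." header; the cutoff of 10.0 and the function names are made up for the example, and only the folder names come from my setup.

```python
import re

# Hypothetical cutoff between low- and high-scoring spam (an assumption,
# not the value used in the setup described above).
HIGH_SCORE = 10.0

# Matches the start of an X-Spam-Status value such as
# "Yes, score=7.3 required=5.0 ..."; older releases wrote "hits=" instead.
_STATUS = re.compile(r"^(Yes|No),\s+(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_folder(status_value, high_score=HIGH_SCORE):
    """Return 'checkspam' or 'highspam' for tagged spam, None otherwise."""
    match = _STATUS.match(status_value.strip())
    if match is None or match.group(1) != "Yes":
        return None
    score = float(match.group(2))
    return "highspam" if score >= high_score else "checkspam"


def needs_training(folder):
    """Only the manually reviewed low-score folder is fed to the learner."""
    return folder == "checkspam"
```

With this, a message scoring 7.3 would land in 'checkspam' and be queued for learning, while one scoring 22.1 would go to 'highspam' and be skipped.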

Anyway, I just knew that there was an automatic system, and at the very least there is *some* load to re-learning, even if a full analysis is skipped. It would be interesting to see how much it actually is, compared to an easy filter. If I find time, I may try to figure out a good test.
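
One possible shape for such a test, as a sketch only: time a first pass over a set of messages, then a second pass over the same set, and compare. The `train` callable is a placeholder I am assuming here; in practice it might wrap a call to `sa-learn --spam` on each file, so that the second pass only hits already-learned messages.

```python
import time


def time_passes(train, messages, passes=2):
    """Call train(msg) on every message, once per pass.

    Returns a list with the wall-clock seconds each pass took. `train`
    is a hypothetical callable supplied by the caller.
    """
    durations = []
    for _ in range(passes):
        start = time.perf_counter()
        for msg in messages:
            train(msg)
        durations.append(time.perf_counter() - start)
    return durations
```

The difference between the first and second entries would give a rough idea of what re-learning costs; comparing the second entry against the time to run a simple filter over the same messages would answer the original question.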

Daniel T. Staal

---------------------------------------------------------------
This email copyright the author.  Unless otherwise noted, you
are expressly allowed to retransmit, quote, or otherwise use
the contents for non-commercial purposes.  This copyright will
expire 5 years after the author's death, or in 30 years,
whichever is longer, unless such a period is in excess of
local copyright law.
---------------------------------------------------------------
