On 12/25, John Hardin wrote:
> Sorry, I realize now that was unclear. What does "current" in
> "current emails" mean? What time window? Since the last masscheck? A
> week? Six months? 

Since the last mass check of that type (network / nightly), yes.  

> And how do you ensure a sufficiently large corpora
> if you tightly restrict that time window?

I can see how that would be a problem. My first thought is: how old
is the average email that SA test scores are currently based on?  Spam
changes over time, so the age of the corpus matters.

And my second thought is:  I think it would be best to run two sets
of mass checks.  The first would use the test results recorded at the
time each email was received, and only emails received since the last
mass check of the same type, to get more useful data on potentially
time-sensitive tests.  The second would re-run all current tests on
the entire available corpus, to have a "sufficiently large corpora".
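
To make the two sets concrete, here is a rough sketch of how the
messages could be selected for each run.  This is not how the mass-check
tooling actually works; the Message record, its field names, and
split_for_mass_checks() are all made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Message:
    # Hypothetical record, not SpamAssassin's real corpus format.
    msg_id: str
    received: datetime
    # Rule names that hit when the mail originally arrived (assumed to be stored).
    hits_at_receipt: list = field(default_factory=list)


def split_for_mass_checks(corpus, last_check):
    """Return (recent, full) message lists for the two proposed runs."""
    # Set 1: only mail received since the last mass check of the same type.
    # These would be evaluated from the hits recorded at receipt time.
    recent = [m for m in corpus if m.received > last_check]
    # Set 2: everything available, to be re-run against all current tests.
    full = list(corpus)
    return recent, full
```

The first list stays small but reflects what time-sensitive tests saw
when the mail arrived; the second keeps the corpus large.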

-- 
"Life is either a daring adventure or it is nothing at all."
- Helen Keller
http://www.ChaosReigns.com
