On 12/25, John Hardin wrote:
> Sorry, I realize now that was unclear. What does "current" in
> "current emails" mean? What time window? Since the last masscheck? A
> week? Six months?

Since the last mass check of that type (network / nightly), yes.

> And how do you ensure a sufficiently large corpora
> if you tightly restrict that time window?

I can see how that would be a problem, and my first thought is... how old is the average email that SA test scores are currently based on? This stuff changes.

And my second thought is: I think it would be best to run two sets of mass checks.

  1. One using the test results at the time each email was received, using only emails since the last mass check of the same type, to get more useful data on potentially time sensitive tests.

  2. One re-running all current tests on the entire available corpora, to have a "sufficiently large corpora".

-- 
"Life is either a daring adventure or it is nothing at all." - Helen Keller
http://www.ChaosReigns.com
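
P.S. To make the two-pass idea concrete, here is a rough sketch in Python. It is only an illustration, not actual mass-check code: the `Email` record, the `recent_pass` and `full_pass` functions, the `run_tests` callback, and the rule name are all hypothetical, and it assumes the rule hits were stored when each message arrived.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Email:
    # Hypothetical record, not SpamAssassin's corpus format.
    received: datetime
    is_spam: bool
    # Names of the rules that hit when the message was first received.
    hits_at_receipt: set = field(default_factory=set)


def recent_pass(corpus, last_masscheck):
    """Pass 1: only mail newer than the last mass check of this type,
    using the results recorded at receipt time (no re-scan)."""
    return [(m.is_spam, m.hits_at_receipt)
            for m in corpus if m.received > last_masscheck]


def full_pass(corpus, run_tests):
    """Pass 2: re-run the current tests over the entire corpus."""
    return [(m.is_spam, run_tests(m)) for m in corpus]


if __name__ == "__main__":
    corpus = [
        Email(datetime(2000, 12, 1), True, {"HYPOTHETICAL_RULE"}),
        Email(datetime(2000, 12, 20), True, {"HYPOTHETICAL_RULE"}),
        Email(datetime(2000, 12, 21), False),
    ]
    last_check = datetime(2000, 12, 15)
    print(len(recent_pass(corpus, last_check)), "messages in the recent pass")
    print(len(full_pass(corpus, lambda m: set())), "messages in the full pass")
```

The first pass trades corpus size for freshness on time-sensitive tests; the second keeps the large corpus but sees each message only as the tests would score it today.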