Daryl C. W. O'Shea writes:
> [to the dev@ list]
>
> Justin Mason wrote:
> > Daryl C. W. O'Shea writes:
> >> We're lacking
> >> data.  We really need to do nightly net enabled checks for the updates
> >> to be really useful.
> >
> > urgh.  that'd be tricky.  I don't know if you've noticed, but the
> > --net mass-check corpus is a *lot* smaller than the set0 one,
> > purely because it takes so much longer :(
>
> That's dependent on whether or not people have already scanned their
> corpus messages.  If they're all already scanned it runs at the same speed.
>
> How about extending mass-check to either markup corpus messages that it
> scans (while net-enabled) that have never been scanned before or caching
> (to disk) the net rule hits that it gets when it does the (net-enabled)
> scan.  In either case eliminating ever having to do the net checks on
> the message again.
>
> If for some reason that's not favoured, I'd settle for a --reuse-only
> run that includes all of your messages for set0 results and only
> reusable messages for set1 results... all done in a single mass-check.

+1

OK, I like that.  We should not be attempting to use non-reused results
for rescoring, at all, given the temporal sensitivity of net-rule lookups.
We should keep the "full" --net run at the weekends, which can do net
lookups against non-reused messages, to measure new dev rules.

mass-check logs the status of reuse in the output lines, btw, logging
either "reuse=yes" or "reuse=no", so we should be able to estimate
usability of this now...

> >> If you're running with set0 only your detection
> >> rate already sucks, and if you're running with set1 you'll only get the
> >> new rules once a week.
> >
> > Can we not just assume that it's safe to copy the set0 scores for
> > the rest of the week?
>
> I don't believe that it is safe.  Often the set1 scores are a *lot*
> lower than the set0 scores.  The set0 scores are weighted a lot heavier
> (by the GA) to move the spam TP rate from 46% to 80% (seriously, check
> out the scores/stats-set0 file) while set1 only moves from 88% to 96%.
>
> If we had to just use the set0 scores I don't think I'd be comfortable
> with an adjustment factor of more than 25% (that is the set1 scores
> would only be a quarter of the set0 scores).

wow.  those are big differences :(

ok, if we can get the --reuse-only trick working, I think that'll work
fine -- allowing nightly set1 mass-checks without taking forever.

> >> Additionally, I think we should re-use bayes results so we can more
> >> accurately generate scores for set2 and 3.  Otherwise I think I'm going
> >> to just copy them over from sets0 and 1 and lower them with some random
> >> adjustment factor.
> >
> > Either of those options make sense for me.
> >
> > I think we need to come up with some kind of extrapolation algorithm for
> > these, to be honest; I don't think 4 mass-checks are at all possible. :(
>
> The only reason we would need 4 mass-checks is if there are meta rules
> that fire in the non-net or non-bayes scoresets that won't fire if a net
> or bayes rule does fire.
> I'm not aware of any such rules, but it's
> possible for it to happen (although I'd rather just let the GA decide
> whether or not the rule should be used by the net or bayes scoreset
> rather than the meta rule).  Otherwise, we can extract everything we
> need from a single mass-check.

yeah, I'm not worried about those cases.

--j.
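
PS: on the "reuse=yes" / "reuse=no" point above, here is a rough sketch of how one might estimate how much of a corpus is already reusable from existing mass-check logs. It assumes only that each result line contains one of those two tokens, and that header lines start with "#"; the rest of the line layout is not relied upon. The function names are made up for illustration and are not part of mass-check.

```python
import re
from collections import Counter

# Assumption: every result line carries a "reuse=yes" or "reuse=no" token.
# Nothing else about the line layout is assumed.
REUSE_RE = re.compile(r"\breuse=(yes|no)\b")


def count_reuse(lines):
    """Count result lines by reuse status ("yes", "no" or "unknown")."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and (assumed) "#"-prefixed header lines.
        if not line or line.startswith("#"):
            continue
        match = REUSE_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts


def reusable_fraction(counts):
    """Fraction of classified lines that were marked reuse=yes."""
    total = counts["yes"] + counts["no"]
    return counts["yes"] / total if total else 0.0
```

Running count_reuse() over an open log file and passing the result to reusable_fraction() gives a rough idea of how large a --reuse-only set1 corpus would be compared with the full set0 one.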
