Hello Justin,

Wednesday, October 5, 2005, 6:11:26 PM, you wrote:

JM> "Daryl C. W. O'Shea" writes:
>> Sounds good, but I think the limited (and relatively static?) corpus may
>> be an issue for rule development aimed at catching new spam signs.

JM> A static-ish ham corpus isn't a big problem, but we may need to supplement
JM> the spam corpus with fresh feeds of new spam. It should be possible to do
JM> this either from trap feeds, or via submissions from the nightly corpus
JM> submitters (rsync up bits of your corpus as you see fit). Traps is
JM> probably easier.

Yes, a ham corpus has some privacy concerns, but a spam corpus, especially spam sent to nonexistent addresses and captured via a catch-all account, has almost none. I believe it'd be feasible for me to commit 5k or so spam weekly.

Given the almost instant feedback of the preflight buildbot, I can easily see a SARE Ninja like (well, no, let's not mention any names) submitting a rules test at 6:00, modifying the rule and resubmitting at 7:00, again at 8:00, again at noon, again at 4:00 pm, and again shortly before the nightly corpus run.

It'd be good to be able to specify that the nightly corpus run should test only the last version. It'd also be good to be able to specify that the nightly corpus run should test more than one version. Any way to do that?

I'm thinking that not only should we be able to submit rules, but also submit control parameters such as "preflight run 05-10-08-04-44 rule TEST_STOCK_EXPLODES -- don't bother" to remove replaced/useless rules from the nightly queue.

Finally, according to http://wiki.apache.org/spamassassin/RulesProjBuildBot, the BuildBot provides "Good web UI for "builds in progress"; you can monitor progress as it happens", but I can't see where to review that progress. Any pointers?

Bob Menschel
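
To make the queue-control idea above a little more concrete, here is a rough Python sketch of the selection logic. It is only an illustration under assumptions, not anything the BuildBot is known to provide: `Submission`, `nightly_queue`, and the `keep` and `skip` controls are all hypothetical names, `TEST_RULE_A` is a made-up rule, and the run IDs merely imitate the "05-10-08-04-44" style used above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    run_id: str  # hypothetical preflight run ID, e.g. "05-10-08-04-44"
    rule: str    # name of the rule that was submitted for testing


def nightly_queue(submissions, keep=None, skip=()):
    """Pick which preflight submissions the nightly corpus run should test.

    submissions: list in submission order (oldest first).
    keep: dict mapping rule name -> how many of its most recent versions
          to test; rules not listed default to 1 (last version only).
    skip: (run_id, rule) pairs flagged "don't bother".
    """
    keep = keep or {}
    skip = set(skip)

    # Group the surviving submissions by rule, preserving order.
    by_rule = {}
    for sub in submissions:
        if (sub.run_id, sub.rule) in skip:
            continue
        by_rule.setdefault(sub.rule, []).append(sub)

    # Keep only the newest n versions of each rule.
    queue = []
    for rule, versions in by_rule.items():
        n = keep.get(rule, 1)
        if n > 0:
            queue.extend(versions[-n:])
    return queue


# Hypothetical day of submissions: one abandoned rule, one rule revised twice.
subs = [
    Submission("05-10-08-04-44", "TEST_STOCK_EXPLODES"),
    Submission("05-10-08-06-00", "TEST_RULE_A"),
    Submission("05-10-08-07-00", "TEST_RULE_A"),
    Submission("05-10-08-08-00", "TEST_RULE_A"),
]
queue = nightly_queue(
    subs,
    keep={"TEST_RULE_A": 2},  # test the two most recent versions
    skip={("05-10-08-04-44", "TEST_STOCK_EXPLODES")},  # "don't bother"
)
```

With these inputs only the 7:00 and 8:00 versions of the made-up rule would reach the nightly run; leaving out `keep` would test just the last version of each rule.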
