Daryl C. W. O'Shea writes:
> Justin Mason wrote:
> > is this not reinventing a bit of the stuff in "masses"?
>
> The bulk of work is done by the existing masses code.  The stuff I added
> deals with making sure the scores generated for the new rules don't
> affect the existing scores for the old rules.
>
> merge-scores does something similar to rewrite-cf-with-new-scores but
> rewrite-cf-with-new-scores didn't work (I can't remember why).
>
> It's all quite quick and dirty right now... I need to clean it up a lot
> (and merge where possible with existing masses code).  After putting off
> doing this since October I just needed to get it working so I can deploy
> 3.2 on production systems.

ok cool, sounds sensible!

> > Also, are you using the perceptron?  don't ;)  the GA produces better
> > results with current spam and rules, I've found.  That would explain
> > the poor results on set0, I'd guess.
>
> I'm using the GA.  The ~52% hit rate is with the scores you generated
> two months ago (with the GA).  The ~94% hit rate is with the new rules
> (and new scores) along with the old rules and scores.

wow, that's pretty bad...  it might be worth investigating this to make
sure it's not biased data in the nightly logs.  mind you, set 0 is always
pretty bad nowadays....

--j.
