So I was thinking about how to assign good scores to new rules in rule updates.
It occurred to me that we can do it reasonably simply. Typically there will be a very small number of new rules, and we can probably assume that existing scores can be reused safely -- in fact, I think we *want* to keep the existing scores the same, or roughly the same, if possible, since that reduces user confusion.

I don't think picking scores based solely on freqs is safe, btw. That doesn't take overlap into account, which has proven to be an issue in the 3.1.x updates (esp. when 3 people write the same rule ;).

Here are the two approaches I'm thinking of:

1. Perceptron. If we lock the scores of all the existing rules and run the perceptron on just the small number (let's say 10) of new rules, it should be pretty fast. However, I'm slightly worried that the perceptron's idea of "good" may not match ours -- it could decide that increased FPs from the new rules are acceptable if the existing scores already have a low FP rate, for instance. I'm also not quite sure how the current perceptron code can be controlled to hit a given FP/FN%. It looks like the best approach currently is to run it multiple times, tweaking the input config, and pick the run whose results look nicest, rather than knowing that a certain input will have a certain output effect. I may be missing something ;)

2. Brute force. If the number of new rules is small enough, we can quickly brute-force-search the score-space instead of having to search all possibilities with the perceptron or a GA. This may work nicely, since we can aim for a specific target FP/FN% ratio and search for optimal scores for the new rules; something like "for each rule, find the lowest score that (a) decreases FN% (b) without increasing FP% by more than 0.0X%". (A rough sketch of this is in the PS below.)

It might be worth doing both, since #2 probably won't work well if/when we add, let's say, 100 new rules in an update.

Assuming the commercial vendors have looked into this -- anyone care to comment on how they solved it, if they have?

--j.
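
PS: here's a very rough, untested Python sketch of what I mean by #2, just to make the idea concrete. It assumes we've already boiled the mass-check results down to (is-spam, score-from-existing-rules, new-rules-hit) tuples; all the names, thresholds and candidate score ranges are made up for illustration.

#!/usr/bin/env python
# Rough sketch of the brute-force idea in #2 (names/values are hypothetical).
# Each corpus entry: (is_spam, score from the locked existing rules,
#                     list of new rules the message hits).

THRESHOLD = 5.0          # spam/ham cutoff
CANDIDATE_SCORES = [round(0.1 * i, 1) for i in range(1, 41)]  # 0.1 .. 4.0
MAX_FP_INCREASE = 0.05   # the "0.0X%" -- max extra FP% we'll tolerate

def fp_fn_percent(messages, new_scores):
    """Return (FP%, FN%) over the corpus for the candidate new-rule scores."""
    fp = fn = ham = spam = 0
    for is_spam, base_score, new_hits in messages:
        total = base_score + sum(new_scores.get(r, 0.0) for r in new_hits)
        if is_spam:
            spam += 1
            if total < THRESHOLD:
                fn += 1
        else:
            ham += 1
            if total >= THRESHOLD:
                fp += 1
    return 100.0 * fp / max(ham, 1), 100.0 * fn / max(spam, 1)

def pick_scores(messages, new_rules):
    """For each rule, find the lowest score that (a) decreases FN%
    (b) without increasing FP% by more than MAX_FP_INCREASE."""
    scores = {r: 0.0 for r in new_rules}
    base_fp, base_fn = fp_fn_percent(messages, scores)
    for rule in new_rules:
        for s in CANDIDATE_SCORES:           # lowest score first
            trial = dict(scores, **{rule: s})
            fp, fn = fp_fn_percent(messages, trial)
            if fn < base_fn and fp <= base_fp + MAX_FP_INCREASE:
                scores[rule] = s
                base_fp, base_fn = fp, fn
                break
    return scores

if __name__ == "__main__":
    # Tiny fake corpus: (is_spam, score_from_existing_rules, new_rules_hit)
    corpus = [
        (True,  4.2, ["NEW_RULE_A"]),
        (True,  6.1, []),
        (False, 1.0, ["NEW_RULE_A"]),
        (False, 4.9, ["NEW_RULE_B"]),
    ]
    print(pick_scores(corpus, ["NEW_RULE_A", "NEW_RULE_B"]))

Note this goes rule-by-rule, greedily taking the lowest acceptable score, as in the quoted constraint, rather than exhaustively searching the joint score-space; a full joint search would be the obvious extension if the new rules overlap each other a lot, but that's also where the "100 new rules" blowup kicks in.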
