On Tue, Aug 30, 2005 at 06:29:20AM -0700, Dale Luck wrote:
> In my study of where SA is spending most of its time, it became
> quickly apparent the do_body_tests is by far the largest cpu hog.
> Indeed i've seen just a single file (sare_fraud) can use up half of
> the cpu cycles for every spam scan.
>
> I was wondering if anyone investigating flipping inside out the
> algorithm used to apply the rules to the body.

I believe I looked at this once, but it got pretty messy to hack in and I didn't have enough time to spend on it. Any speedup seemed to be minimal, but it might be worth looking into in greater detail.

Also, I'm not convinced study helps a whole lot. Having said that, some of our regular expressions could probably be tuned better so that study helps more.

Matching case-insensitively can be a very large speedup; however, we do have many tests that rely on capitalization. We'd need a way of splitting the rules into case-sensitive and case-insensitive groups, since we definitely need some case-sensitive rules.

-- 
Duncan Findlay
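
To make the splitting idea concrete, here is a rough sketch of one way it could look. It is in Python rather than SpamAssassin's Perl, and it is not how do_body_tests works. The rule names and patterns are made up for illustration, and the Rule type, build_group and scan_body are hypothetical helpers. Rules are partitioned by case sensitivity, and each group gets a single combined alternation that is used only as a prefilter, so a body line is checked against the individual rules only when some rule in that group can match it. Whether this actually saves time would have to be measured.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    name: str
    pattern: str
    ignore_case: bool


# Hypothetical rules for illustration only; not taken from any real rule file.
RULES = [
    Rule("FREE_MONEY", r"free money", True),
    Rule("SHOUTING_URGENT", r"URGENT", False),
    Rule("BANK_TRANSFER", r"bank\s+transfer", True),
    Rule("ACT_NOW_CAPS", r"ACT NOW", False),
]


def build_group(rules, flags):
    """Compile each rule, plus one alternation of all of them as a prefilter.

    Assumes the patterns contain no backreferences or inline flags, since
    those may not survive being joined into a single alternation.
    """
    if not rules:
        return None, []
    compiled = [(r.name, re.compile(r.pattern, flags)) for r in rules]
    prefilter = re.compile("|".join(f"(?:{r.pattern})" for r in rules), flags)
    return prefilter, compiled


def scan_body(lines, rules=RULES):
    """Return the set of rule names that match at least one body line."""
    groups = [
        build_group([r for r in rules if r.ignore_case], re.IGNORECASE),
        build_group([r for r in rules if not r.ignore_case], 0),
    ]
    hits = set()
    for line in lines:
        for prefilter, compiled in groups:
            # If no rule in this group can match the line, skip the group.
            if prefilter is None or not prefilter.search(line):
                continue
            for name, rx in compiled:
                if name not in hits and rx.search(line):
                    hits.add(name)
    return hits


if __name__ == "__main__":
    body = ["Get Free Money today", "this is urgent, act now", "URGENT: reply"]
    print(sorted(scan_body(body)))
```

The prefilter cannot miss a hit: if any single rule matches somewhere in a line, the alternation of all rules in that group matches there too. The case-sensitive rules stay in their own group, so capitalization-dependent tests keep their meaning.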
