From: Duncan Findlay [mailto:[EMAIL PROTECTED]
Sent: Tue 8/30/2005 3:04 PM
To: [email protected]
Subject: Re: body rule speed
On Tue, Aug 30, 2005 at 06:29:20AM -0700, Dale Luck wrote:
> In my study of where SA is spending most of its time, it became
> quickly apparent that do_body_tests is by far the largest CPU hog.
> Indeed, I've seen that just a single file (sare_fraud) can use up half
> of the CPU cycles for every spam scan.
>
> I was wondering if anyone has investigated flipping inside out the
> algorithm used to apply the rules to the body.

I believe I tried to look at this one time, but it got pretty messy to
hack that in and I didn't have enough time to spend on it. Any speedup
seemed to be minimal, but it might be worth looking into in greater
detail. Also, I'm not convinced study helps a whole lot. Having said
that, some of our regular expressions could probably be tuned better
so that study helps more.
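
For anyone wanting to experiment with the "inside out" idea, here is a rough sketch of the two loop orders in Python. It is not SpamAssassin's code: the rule names, patterns, and body text are made up, and it assumes the existing order is rule-by-rule with the flipped order being line-by-line. It only shows that both orders report the same hits; which one is faster would have to be measured.

```python
import re

# Hypothetical rules; names and patterns are invented for illustration.
RULES = {
    "FRAUD_A": re.compile(r"million dollars"),
    "FRAUD_B": re.compile(r"next of kin", re.IGNORECASE),
}


def rules_outer(lines, rules):
    """For each rule, scan the body lines until that rule hits."""
    hits = set()
    for name, rx in rules.items():
        for line in lines:
            if rx.search(line):
                hits.add(name)
                break  # one hit is enough for this rule
    return hits


def lines_outer(lines, rules):
    """Flipped order: for each body line, try every rule not yet hit."""
    hits = set()
    pending = dict(rules)
    for line in lines:
        for name in list(pending):
            if pending[name].search(line):
                hits.add(name)
                del pending[name]  # stop testing a rule once it has hit
        if not pending:
            break  # every rule has already hit
    return hits


BODY = [
    "I am the Next Of Kin of a late minister.",
    "He left ten million dollars.",
]
```

Calling `rules_outer(BODY, RULES)` and `lines_outer(BODY, RULES)` gives the same set of rule names; the only difference is the order in which lines and patterns are visited.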

The case-insensitive thing can be a very large speedup; however, we do
have many tests that rely on capitalization. We'd need a way of
splitting them up or something, since we definitely need some
case-sensitive rules.
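
One way the splitting could look, sketched in Python rather than in SpamAssassin itself: each rule carries a flag saying whether it depends on capitalization, and the case-insensitive group is matched against a copy of each line that is lowercased once. The `Rule` class, the `case_sensitive` flag, and the sample rules are all hypothetical, and the sketch assumes case-insensitive patterns are already written in lowercase. Whether this is actually faster than a case-insensitive match would need measuring.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    pattern: str
    case_sensitive: bool  # hypothetical flag, not an existing rule option


def compile_split(rules):
    """Partition rules into case-sensitive and case-insensitive groups."""
    sensitive = [(r.name, re.compile(r.pattern)) for r in rules if r.case_sensitive]
    insensitive = [(r.name, re.compile(r.pattern)) for r in rules if not r.case_sensitive]
    return sensitive, insensitive


def scan(lines, rules):
    """Return the names of all rules that hit at least one line."""
    sensitive, insensitive = compile_split(rules)
    hits = set()
    for line in lines:
        lowered = line.lower()  # lowercased once per line, not once per rule
        # Case-sensitive rules see the original text.
        hits.update(name for name, rx in sensitive if rx.search(line))
        # Case-insensitive rules (assumed written in lowercase) see the copy.
        hits.update(name for name, rx in insensitive if rx.search(lowered))
    return hits


SAMPLE_RULES = [
    Rule("SHOUTING_FREE", r"\bFREE\b", True),
    Rule("DRUG_NAME", r"viagra", False),
]
```

With these sample rules, `scan(["Get your FREE sample", "ViAgRa at low prices"], SAMPLE_RULES)` hits both rules, while a lowercase "free sample" does not trigger the capitalization-dependent one.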
--
Duncan Findlay
