http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4860
------- Additional Comments From [EMAIL PROTECTED] 2006-04-15 18:31 -------
I've had a chance to benchmark this now. Using this test directory
of rules, with just the DNSBLs and URIBLs active (no Razor/Pyzor/DCC
or non-net rules):
: jm 493...; l tstrules/
total 36
-rw-rw-r-- 1 jm jm 5516 Apr 15 18:58 10_default_prefs.cf
-rw-r--r-- 1 jm jm 14312 Apr 15 18:58 20_dnsbl_tests.cf
-rw-r--r-- 1 jm jm 7536 Apr 15 18:58 25_uribl.cf
-rw-rw-r-- 1 jm jm 50 Apr 15 18:58 plugins.pre
A 30-message corpus of spam, and this command line:
sudo /etc/init.d/bind9 restart; time ./mass-check -C tstrules --net -n -o spam:dir:/tmpfs/tstcor/ > ~/DL/o1a
These are the timings:
trunk:
real 0m55.442s user 0m4.662s sys 0m0.157s
real 1m02.561s user 0m4.506s sys 0m0.158s
real 0m56.902s user 0m4.283s sys 0m0.152s
with patch:
real 0m47.561s user 0m3.888s sys 0m0.157s
real 0m50.360s user 0m4.872s sys 0m0.169s
real 0m48.711s user 0m5.367s sys 0m0.206s
So avg real time of 58.302s vs 48.877s -- that's a 19% speedup (about 16% less wall-clock time)!
In my opinion that's a *very* nice result ;)
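The averages can be rechecked with a short script. This is only a sketch: the `parse_real` helper and the variable names are mine, not part of mass-check, and the values are copied from the `real` column of the timings above.

```python
import re
from statistics import mean

def parse_real(value):
    """Convert a `time`-style value such as '1m02.561s' into seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# "real" timings copied from the three runs of each build
trunk = ["0m55.442s", "1m02.561s", "0m56.902s"]
patched = ["0m47.561s", "0m50.360s", "0m48.711s"]

trunk_avg = mean(parse_real(v) for v in trunk)
patched_avg = mean(parse_real(v) for v in patched)

# Two ways of stating the same gain:
speedup = trunk_avg / patched_avg - 1      # extra work done per unit of time
time_saved = 1 - patched_avg / trunk_avg   # fraction of wall-clock time removed

print(f"trunk avg      {trunk_avg:.3f}s")
print(f"with patch avg {patched_avg:.3f}s")
print(f"speedup        {speedup:.1%}")
print(f"time saved     {time_saved:.1%}")
```

The two percentages differ because one is measured against the patched time and the other against the trunk time; both describe the same pair of averages.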
Next, check out the scan time distribution (message count, then scantime in seconds).
trunk:
: jm 489...; perl -ne '/scantime=(\d+)/ and print "$1\n"' ~/DL/o1* |sort -n|uniq -c
15 0
22 1
27 2
15 3
8 4
2 5
1 6
with patch:
: jm 490...; perl -ne '/scantime=(\d+)/ and print "$1\n"' ~/DL/o2* |sort -n|uniq -c
13 0
34 1
31 2
5 3
4 4
3 5
Quite an improvement: messages taking 3 seconds or more drop from 26 to 12 (out of 90).
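The two histograms can also be boiled down to totals and means. A rough sketch: the dictionaries are transcribed from the `uniq -c` output above (scantime value mapped to message count), taking the `o1*` logs as trunk and the `o2*` logs as the patched runs, and `summarise` is a helper name of my own.

```python
# scantime (whole seconds, as captured by the regex) -> number of messages
trunk = {0: 15, 1: 22, 2: 27, 3: 15, 4: 8, 5: 2, 6: 1}
patched = {0: 13, 1: 34, 2: 31, 3: 5, 4: 4, 5: 3}

def summarise(hist):
    """Return (message count, total scantime, mean scantime) for a histogram."""
    count = sum(hist.values())
    total = sum(seconds * n for seconds, n in hist.items())
    return count, total, total / count

for label, hist in (("trunk", trunk), ("with patch", patched)):
    count, total, avg = summarise(hist)
    print(f"{label:<10} messages={count} total={total}s mean={avg:.2f}s")
```

Each histogram covers 90 results, which is consistent with three runs over the 30-message corpus.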
Score distribution --
trunk:
: jm 498...; perl -ne 'next if /^#/; /^(\S+\s+\S+) / and print "$1\n"' ~/DL/o1*| sort -n | uniq -c
3 . 0
16 . 2
14 . 3
3 . 4
6 Y 5
9 Y 6
11 Y 7
4 Y 8
12 Y 9
6 Y 10
3 Y 11
3 Y 12
with patch:
: jm 499...; perl -ne 'next if /^#/; /^(\S+\s+\S+) / and print "$1\n"' ~/DL/o2*| sort -n | uniq -c
3 . 0
15 . 2
15 . 3
3 . 4
7 Y 5
12 Y 6
11 Y 7
12 Y 9
6 Y 10
3 Y 11
3 Y 12
Not a whole lot of excitement there, which is a good result ;)
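The same comparison can be made bucket by bucket. Another sketch, with the counts transcribed from the two listings above; the keys are the (flag, score) pairs the regex prints, and `flagged` is a helper name I made up. I am assuming the `Y` in the first column marks results flagged as spam.

```python
from collections import Counter

# (first column, score) -> number of results, copied from the listings
trunk = Counter({(".", 0): 3, (".", 2): 16, (".", 3): 14, (".", 4): 3,
                 ("Y", 5): 6, ("Y", 6): 9, ("Y", 7): 11, ("Y", 8): 4,
                 ("Y", 9): 12, ("Y", 10): 6, ("Y", 11): 3, ("Y", 12): 3})
patched = Counter({(".", 0): 3, (".", 2): 15, (".", 3): 15, (".", 4): 3,
                   ("Y", 5): 7, ("Y", 6): 12, ("Y", 7): 11,
                   ("Y", 9): 12, ("Y", 10): 6, ("Y", 11): 3, ("Y", 12): 3})

def flagged(hist):
    """Count results whose first column is 'Y'."""
    return sum(n for (flag, _score), n in hist.items() if flag == "Y")

# Per-bucket change (patched minus trunk), skipping buckets that did not move.
# A Counter returns 0 for a missing key, so the absent "Y 8" bucket is handled.
delta = {key: patched[key] - trunk[key]
         for key in sorted(set(trunk) | set(patched), key=lambda k: k[1])
         if patched[key] != trunk[key]}

print("flagged Y:", flagged(trunk), "->", flagged(patched))
for (flag, score), change in delta.items():
    print(f"{flag} {score:>2}: {change:+d}")
```

On these numbers the Y total is the same for both builds, and only five buckets change at all.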
Given these results, and the lack of any negative comments, I think it's
check-in-able. There is still the issue with test_log(), but
I think that is quite minor and can be taken care of later.
Attached is a bugfix patch for a duplicated "poll_responses()" call,
and a tarball with the test files -- mass-check logs and test corpus
messages -- from the above benchmarking run.