http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4860





------- Additional Comments From [EMAIL PROTECTED]  2006-04-15 18:31 -------
I've had a chance to benchmark this now.  Using this test directory
of rules, with just the DNSBLs and URIBLs active (no Razor/Pyzor/DCC
or non-net rules):

: jm 493...; l tstrules/
total 36
-rw-rw-r--  1 jm jm  5516 Apr 15 18:58 10_default_prefs.cf
-rw-r--r--  1 jm jm 14312 Apr 15 18:58 20_dnsbl_tests.cf
-rw-r--r--  1 jm jm  7536 Apr 15 18:58 25_uribl.cf
-rw-rw-r--  1 jm jm    50 Apr 15 18:58 plugins.pre

A 30-message corpus of spam, and this command line:

sudo /etc/init.d/bind9 restart; time ./mass-check -C tstrules --net -n -o \
    spam:dir:/tmpfs/tstcor/ > ~/DL/o1a

These are the timings:

  trunk:

  real    0m55.442s user    0m4.662s sys     0m0.157s
  real    1m02.561s user    0m4.506s sys     0m0.158s
  real    0m56.902s user    0m4.283s sys     0m0.152s

  with patch:

  real    0m47.561s user    0m3.888s sys     0m0.157s
  real    0m50.360s user    0m4.872s sys     0m0.169s
  real    0m48.711s user    0m5.367s sys     0m0.206s

So avg real time of 58.302s vs 48.877s -- that's a 19% speedup (about 16%
less wall-clock time)!
In my opinion that's a *very* nice result ;)
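
For anyone rechecking the arithmetic, here is a minimal sketch that averages
the "real" times listed above and derives the speedup from the two averages.
The helper names avg and speedup are made up for this illustration and are not
part of mass-check; the inputs are the timings above, converted to seconds.

```shell
# avg: print the mean of its numeric arguments, to 3 decimals.
# (hypothetical helper, not part of SpamAssassin)
avg() {
    printf '%s\n' "$@" | awk '{ s += $1 } END { printf "%.3f\n", s / NR }'
}

# speedup OLD NEW: print the OLD/NEW time ratio and the percentage of
# wall-clock time saved. (hypothetical helper as well)
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        printf "%.2fx faster, %.1f%% less time\n", old / new, (old - new) / old * 100
    }'
}

trunk=$(avg 55.442 62.561 56.902)      # 1m02.561s is 62.561s
patched=$(avg 47.561 50.360 48.711)
echo "trunk=$trunk patched=$patched"
speedup "$trunk" "$patched"
```

The two percentages describe the same measurement: the ratio of the averages
gives the speedup figure, while the difference relative to the trunk time
gives the share of wall-clock time saved.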



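Next, check out the scan time distribution (message count on the left,
scantime on the right; the o1* logs are trunk, the o2* logs are with the patch):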

: jm 489...; perl -ne '/scantime=(\d+)/ and print "$1\n"' ~/DL/o1* | sort -n | uniq -c
     15 0
     22 1
     27 2
     15 3
      8 4
      2 5
      1 6
: exit=0 Sat Apr 15 19:20:11 IST 2006; cd /home/jm/ftp/spamassassin/masses
: jm 490...; perl -ne '/scantime=(\d+)/ and print "$1\n"' ~/DL/o2* | sort -n | uniq -c
     13 0
     34 1
     31 2
      5 3
      4 4
      3 5

Quite an improvement.
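
To put a single number on each histogram, a rough sketch like the following
computes the count-weighted mean scantime from "uniq -c" style output. The
function name hist_mean is hypothetical (it is not part of mass-check), and
the input lines are simply the two histograms above typed back in.

```shell
# hist_mean: read "count value" lines (the `uniq -c` output format) on stdin
# and print the total count and the count-weighted mean of the values.
# (hypothetical helper name, not part of mass-check)
hist_mean() {
    awk 'NF >= 2 { n += $1; total += $1 * $2 }
         END { if (n > 0) printf "%d messages, mean %.2f\n", n, total / n }'
}

# scantime histogram from the ~/DL/o1* logs shown above
printf '%s\n' '15 0' '22 1' '27 2' '15 3' '8 4' '2 5' '1 6' | hist_mean

# scantime histogram from the ~/DL/o2* logs shown above
printf '%s\n' '13 0' '34 1' '31 2' '5 3' '4 4' '3 5' | hist_mean
```

The same function can be appended to the original pipelines, e.g.
"... | sort -n | uniq -c | hist_mean", since awk ignores the leading
whitespace that uniq -c emits.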




Score distribution --

trunk:
: jm 498...; perl -ne 'next if /^#/; /^(\S+\s+\S+) / and print "$1\n"' ~/DL/o1*|
sort -n | uniq -c
      3 .  0
     16 .  2
     14 .  3
      3 .  4
      6 Y  5
      9 Y  6
     11 Y  7
      4 Y  8
     12 Y  9
      6 Y 10
      3 Y 11
      3 Y 12

with patch:
: exit=0 Sat Apr 15 19:24:27 IST 2006; cd /home/jm/ftp/spamassassin/masses
: jm 499...; perl -ne 'next if /^#/; /^(\S+\s+\S+) / and print "$1\n"' ~/DL/o2*|
sort -n | uniq -c
      3 .  0
     15 .  2
     15 .  3
      3 .  4
      7 Y  5
     12 Y  6
     11 Y  7
     12 Y  9
      6 Y 10
      3 Y 11
      3 Y 12

Not a whole lot of excitement there, which is a good result ;)
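
One quick way to confirm that the two distributions agree overall is to total
the counts per flag. This is only a sketch: flag_totals is a made-up helper
name, and it assumes the first log field is "Y" for a message scored as spam
and "." otherwise, which is how the listings above appear to be laid out.

```shell
# flag_totals: read "count flag score" lines on stdin (the `uniq -c` output
# of the first two log fields) and total the counts per flag.
# Assumption: "Y" marks a message scored as spam, anything else is a miss.
flag_totals() {
    awk 'NF >= 3 { if ($2 == "Y") y += $1; else n += $1 }
         END { printf "Y=%d .=%d\n", y, n }'
}

# trunk score distribution, copied from the listing above
printf '%s\n' '3 . 0' '16 . 2' '14 . 3' '3 . 4' '6 Y 5' '9 Y 6' \
    '11 Y 7' '4 Y 8' '12 Y 9' '6 Y 10' '3 Y 11' '3 Y 12' | flag_totals

# with-patch score distribution, copied from the listing above
printf '%s\n' '3 . 0' '15 . 2' '15 . 3' '3 . 4' '7 Y 5' '12 Y 6' \
    '11 Y 7' '12 Y 9' '6 Y 10' '3 Y 11' '3 Y 12' | flag_totals
```

Matching totals for both runs would mean the patch moved a few messages
between adjacent score buckets without changing how many were flagged.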


Given these results, and the lack of any negative comments, I think it's
check-in-able. There is still the issue with test_log(), but I think
that is quite minor and can be taken care of later.

Attached is a bugfix patch for a duplicated "poll_responses()" call,
and a tarball with the test files -- mass-check logs and test corpus
messages -- from the above benchmarking run.





