Michael Parker wrote to Ryan Thompson:
I've done a little benchmarking, probably several hundred different benchmarks. I'm currently in the middle of nowhere with really bad dial-up, so I can't elaborate much. For a quick look at some very recent benchmarks, see Bug 3331 in Bugzilla. The base case is very close to the code in 3.0.0-rc1. Bug 3331 also briefly discusses the benchmark methodology.
I had half followed that thread already, but had missed the benchmarks in the noise. Thanks for the pointer.
The short answer, keeping in mind SQL is optimized for scan operations, is:
SQL is twice as slow for learn operations
SQL is a few % faster for scanning via spamd
SQL is ~7 times faster for expire
SQL is pretty slow for forgetting
SQL is a few % faster for scanning via spamassassin
So, that more or less agrees with what I saw, then. (Bearing in mind that I only tested scanning, as that's the one that matters :-)).
I'll happily answer questions about bayes in SQL, it's not for everyone but a lot of work was put in to make it as fast and useful as possible.
"Useful", I'd definitely agree with, and "fast" seems to be about the same as DB_File (a few % on Bayes doesn't make a difference to the overall scanning process), so credit is due for the more scalable alternative to DB_File.
- Ryan
-- Ryan Thompson <[EMAIL PROTECTED]>
