On Fri, 29 Jul 2011 11:26:57 -0400
Michael Scheidell <michael.scheid...@secnap.com> wrote:

> if you use mysql.pm for other things (sql params, user's, etc), it
> still doesn't seem to make sense to use sdbm AND mysql.

We use PostgreSQL for a number of things, but we found that CDB is
much faster than all competitors for Bayes.  (CDB is weird in that you
can't incrementally update it.  You have to rewrite the entire
database each time.  However, this is pretty fast and the huge
increase in read speed more than makes up for the rewriting.)
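
To illustrate the rebuild-and-swap pattern described above, here is a
minimal sketch in Python.  It is not our actual code and it does not
write the real CDB format: a JSON file stands in for the database, and
the function name rebuild_store and the token-count layout are
assumptions made up for this example.  The point is only the shape of
the update: build a complete new file, then rename it over the old one.

```python
import json
import os
import tempfile


def rebuild_store(path, updates):
    """Merge updates into the store at path by rewriting the whole file.

    Hypothetical stand-in for a CDB rebuild: JSON is used instead of the
    real CDB format so the example stays self-contained.
    """
    # Load the current contents, if any.
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except FileNotFoundError:
        data = {}

    # Apply the whole batch of changes in memory.
    for token, count in updates.items():
        data[token] = data.get(token, 0) + count

    # Write a complete new file next to the old one, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".rebuild-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        # On POSIX the rename is atomic, so readers see either the old
        # file or the new one, never a half-written mix.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return data
```

Batching many updates into one call is what keeps the cost of the full
rewrite tolerable; rewriting once per message learned would be much
more expensive.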

My colleague has benchmarks at
http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/

Has anyone investigated writing a CDB backend for SpamAssassin's Bayes
implementation?  I'm guessing the need to rewrite the DB each time makes
it a bit complex.

Regards,

David.
