On Fri, 29 Jul 2011 11:59:14 -0400
Michael Scheidell <michael.scheid...@secnap.com> wrote:

> in mysql, we don't journal.  what does that journaling time do to SA 
> processing times? Id hate to think we go from 1 s/email processing
> time to 60 seconds or something while journal is locked.

Journalling *improves* training speed.  Our system works like this: Whenever
someone wants to train a message, we just append a note to a journal table
saying "Please train message xxx as {spam,nonspam}".  Because this is an
INSERT-only operation, it does not wait on readers or on other inserts
under PostgreSQL's MVCC.
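
A minimal sketch of that append-only step, for illustration only.  The
table name train_journal, its columns, and the helper names are all
hypothetical (the real schema is not shown in this message), and the
standard-library sqlite3 module stands in for PostgreSQL so the example
runs on its own:

```python
import sqlite3


def open_journal(path=":memory:"):
    """Open the database and create the (hypothetical) journal table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS train_journal ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " bayes_db TEXT NOT NULL,"      # which Bayes database to train
        " message_id TEXT NOT NULL,"    # the message to train on
        " verdict TEXT NOT NULL CHECK (verdict IN ('spam', 'nonspam')))"
    )
    return conn


def request_training(conn, bayes_db, message_id, verdict):
    """Append one training request; no Bayes data is touched here."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO train_journal (bayes_db, message_id, verdict)"
            " VALUES (?, ?, ?)",
            (bayes_db, message_id, verdict),
        )
```

The point of the design is that the request path only ever appends a
row; the expensive Bayes update happens later, elsewhere.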

Periodically, the journal runner pulls the list of pending training
requests from the journal table and actually runs the Bayes trainings.
We batch them up so that each Bayes database is rewritten at most once
per journal run, rather than once per message trained.
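
The batching idea can be sketched roughly as follows.  This is not the
actual journal runner: run_journal, the tuple layout, and the
train_batch callback are assumptions made up for the example.

```python
from collections import defaultdict


def run_journal(pending, train_batch):
    """Group pending requests by Bayes database and train each one once.

    pending:     iterable of (journal_id, bayes_db, message_id, verdict)
    train_batch: hypothetical callable(bayes_db, items) that applies all
                 items and rewrites that Bayes database a single time
    Returns the journal ids that were handled, in the order received.
    """
    by_db = defaultdict(list)
    handled = []
    for journal_id, bayes_db, message_id, verdict in pending:
        by_db[bayes_db].append((message_id, verdict))
        handled.append(journal_id)

    # One call (and so one rewrite) per database, however many messages
    # were queued for it.
    for bayes_db, items in by_db.items():
        train_batch(bayes_db, items)

    return handled
```

A caller would typically delete the returned ids from the journal only
after run_journal has finished without raising an exception.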

Regards,

David.
