Today I had an interesting situation.

This is more of an FYI in case anyone else has run into similar problems. (Cross-posted to the MIMEDefang list as well.)

I use SpamAssassin with MIMEDefang.

One of my users notified me that they were suddenly unable to send mail. After checking the logs, I determined that MIMEDefang was timing out and returning errors. The cause was very unclear, which is why I'm sharing my findings with all of you.

After digging around (with some assistance from David Skoll on the MIMEDefang list), I determined that the problem was caused by SpamAssassin being unable to connect to the database server where the Bayes database is stored (MySQL on a remote host).

This caused all sorts of "weirdness for no apparently good reason" and was initially very confusing to diagnose.

The symptoms were:

* mimedefang started returning "busy timeout" errors.
* When restarting MIMEDefang (with embedded Perl enabled), the multiplexor wouldn't finish loading and mimedefang wouldn't create its socket, causing sendmail to spit out "file /path/to/mimedefang/socket/file unsafe" errors.
* Turning off embedded Perl allowed mimedefang to start and create the socket, but it then spawned multiple instances of mimedefang.pl, which just hung.
* mimedefang.pl -test and/or mimedefang.pl -features would hang indefinitely with no output.
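
The hang in the last symptom is easy to miss because the command simply never returns. Here is a minimal sketch in Python (not part of MIMEDefang) that runs a command under a time limit and reports whether it hung. The 30-second limit and the assumption that mimedefang.pl is on the PATH are illustrative only:

```python
import subprocess
import sys


def hangs(cmd, timeout_s=30.0):
    """Return True if cmd does not exit within timeout_s seconds."""
    try:
        # subprocess.run kills the child process if the timeout expires.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return True
    return False


if __name__ == "__main__":
    # Assumption: mimedefang.pl is on the PATH; adjust for your install.
    sys.exit(1 if hangs(["mimedefang.pl", "-test"]) else 0)
```

A non-zero exit status here only says the test command did not finish in time; it does not say why.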



The workaround:

After determining that the problem was the connection to the SQL server, simply setting "use_bayes 0" in sa-mimedefang.cf and restarting mimedefang resolved it. However, this obviously means the Bayes facilities are not used at all.
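
Since the root cause was an unreachable SQL server, a quick reachability probe can help tell this failure apart from other causes before applying the workaround. This is a rough sketch in Python, not anything provided by SpamAssassin or MIMEDefang. The host name db.example.com is hypothetical, port 3306 is the usual MySQL default, and the check only confirms that a TCP connection can be opened within the timeout, not that MySQL is answering queries:

```python
import socket


def db_reachable(host, port=3306, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical host name; replace with the server holding the Bayes database.
    print(db_reachable("db.example.com"))
```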


The questions:

I understand that the SQL code for SA is still 'experimental'. Is there currently any way to set a forced timeout for connecting to the SQL server?

Is this something I should open a BZ ticket about?

Since I'm definitely not an SQL guru, does anyone have suggestions for a high-availability MySQL configuration that could fail over to a backup server should the primary one become incapacitated by a low-level hard drive failure?

Currently I have one MySQL database server with the Bayes databases on it (among other databases), and my primary and secondary mail servers both connect to it to check the Bayes database.

This may be somewhat specific to the MIMEDefang implementation, but I suspect that this type of behavior could have a negative impact on other types of SA implementations as well.
Again, this is mostly an FYI, but any suggestions are welcome.


Thanks,

Alan
