On Tue, 14 May 2002 00:23:07 -0500 Dustin Puryear <[EMAIL PROTECTED]> wrote:
> This is extremely interesting. Michael, do you find this happens at
> seemingly random times though? We can go a week or two with no
> problems, and then bam, I get a 911. Of course, our volume is
> considerably lower than yours. Another issue, and one that may
> differentiate our problems from yours (but hopefully not, as you at
> least have a work-around), is that I can sometimes restart Cyrus, and
> even after a restart, no new connections are serviced. (They connect,
> but get no service.) I've found that when this happens Cyrus will
> often appear to work for a VERY short while, and then revert back to
> the point where connections occur but no service (pop3d) responds.
>
> Shouldn't a restart completely fix the problem? If so we may be
> fighting something different. A reboot also doesn't always clear up
> the problem. Again, Cyrus will come up, but then fail shortly
> thereafter.

I've seen exactly the behaviour you're describing here on Linux when
there's a lack of available entropy. Pop3d is more susceptible to this
problem because it calls some SASL functions that read some bits from
/dev/random, and a read from /dev/random blocks until the kernel has
enough entropy. I didn't check exactly why these calls are necessary or
why imapd does not use them.

Linux gathers entropy from disk I/O events and keyboard keystrokes,
iirc, but I saw the pool drain even on an extremely busy mail server.
Imapd goes on, but pop3d processes just pile up, each one accepting a
connection but not starting any service. During peak hours I've seen
this effect numerous times, resulting in "too many open files" errors
and thus blocking other processes.

One solution is to recompile the kernel with the netdev-random patches
(http://www.tech9.net/rml/linux/). If this does not help, find the
/dev/random references in the SASL library and replace them with
/dev/urandom. Iirc it's only defined in one place.

--
Jure Pecar
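
PS: To check whether the entropy pool really is the culprit while pop3d
is hanging, one option is to read the kernel's own counter. This is a
minimal sketch in Python, assuming a Linux kernel that exposes
/proc/sys/kernel/random/entropy_avail. The function names and the
128-bit threshold are made up for illustration and are not taken from
Cyrus or SASL.

```python
from pathlib import Path

# Linux-specific file; reports the kernel's current entropy estimate in bits.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def parse_entropy_avail(text):
    """Convert the file contents (e.g. "3072\n") to an integer bit count."""
    return int(text.strip())


def pool_looks_starved(bits, threshold=128):
    """Return True if the estimate is below a threshold.

    The default of 128 bits is an arbitrary, hypothetical cut-off.
    """
    return bits < threshold


def read_entropy_avail(path=ENTROPY_AVAIL):
    """Return the current estimate, or None if it cannot be read or parsed."""
    try:
        return parse_entropy_avail(Path(path).read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    bits = read_entropy_avail()
    if bits is None:
        print("entropy_avail not readable on this system")
    elif pool_looks_starved(bits):
        print(f"only {bits} bits available; /dev/random readers may block")
    else:
        print(f"{bits} bits available")
```

If the number sits near zero while the pop3d processes pile up, that
points at /dev/random; if it stays high, the cause is probably
something else.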