Re: [Mimedefang] Freezing/hanging MIMEDefang 2.56
Bernd Petrovitsch wrote:

> Does anyone else experience a freezing mimedefang-multiplexor parent
> process with up to 200 child processes?

Not I. (I develop on Sarge, and we have some Sarge-based CanIt appliances running MIMEDefang/CanIt that experience pretty heavy load.)

> This happens on Linux Debian/Sarge if we try to stress-test the
> installation. Is there a known maximum number of child processes so that
> it works?

The maximum number of child processes is typically limited by the number of file descriptors in an "fd_set" structure. This number (FD_SETSIZE) is typically 1024, and each slave consumes 4 or 5 descriptors (depending on whether or not you use the "-Z" flag), limiting you to around 200-256 slaves.

However, you shouldn't see hangs if you exceed that number. Instead, you should see error messages about bad file descriptors in your logs.

> The mimedefang-multiplexor process hangs in a futex(2) SysCall - so it
> seems to be some locking problem.

There's no explicit locking. The multiplexor is a single-threaded, event-driven process. All synchronization is implicit, and happens in a select() call.

> But what I found was in activateSlave() around line 2313:
> snip
> sigemptyset(&sigs);
> sigprocmask(SIG_BLOCK, &sigs, NULL);
> snip

Doh... that IS a no-op. It should be SIG_SETMASK. But that shouldn't be the cause of the problem.

Regards,

David.
___
NOTE: If there is a disclaimer or other legal boilerplate in the above message, it is NULL AND VOID. You may ignore it.
Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list
MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang
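For anyone who wants to check the descriptor arithmetic in David's reply, here is a minimal sketch in C. It is not MIMEDefang code: the helper names max_slaves() and max_slaves_here() are made up for illustration, and the "reserved" parameter (descriptors the parent itself keeps open) is an assumption that the reply does not mention.

```c
#include <assert.h>
#include <sys/select.h>   /* FD_SETSIZE */

/*
 * Hypothetical helper (not part of MIMEDefang): upper bound on the number
 * of slaves when each one needs fds_per_slave descriptors and all of them
 * must fit below fd_setsize. "reserved" is an assumed allowance for
 * descriptors the parent itself holds open.
 */
static int max_slaves(int fd_setsize, int fds_per_slave, int reserved)
{
    assert(fds_per_slave > 0);
    if (reserved >= fd_setsize)
        return 0;
    return (fd_setsize - reserved) / fds_per_slave;
}

/* Same bound using this platform's FD_SETSIZE and no reserved descriptors. */
static int max_slaves_here(int fds_per_slave)
{
    return max_slaves(FD_SETSIZE, fds_per_slave, 0);
}
```

With FD_SETSIZE at 1024, max_slaves(1024, 4, 0) gives 256 and max_slaves(1024, 5, 0) gives 204, which is where the "around 200-256 slaves" range comes from. Any descriptors the parent needs for itself push the real limit slightly lower.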
[Mimedefang] Freezing/hanging MIMEDefang 2.56
Does anyone else experience a freezing mimedefang-multiplexor parent process with up to 200 child processes?

This happens on Linux Debian/Sarge if we try to stress-test the installation. Is there a known maximum number of child processes so that it works?

/var/spool/MIMEDefang is on a tmpfs. EMB_PERL is enabled. We also use SpamAssassin with bayes_auto_learn (with the default DBM backend) but disabled the occasional --sync (and do it once in the night). But since the parent process just manages the children, this should IMHO not matter.

There is almost no swapping activity during the test (which is pretty short, since the freeze occurs within a few minutes).

The mimedefang-multiplexor process hangs in a futex(2) SysCall - so it seems to be some locking problem. AFAICS (reading mimedefang-multiplexor.c and friends) there is no explicit synchronization/locking in mimedefang-multiplexor which may use futex(2) below.

But what I found was in activateSlave() around line 2313:

snip
    sigemptyset(&sigs);
    sigprocmask(SIG_BLOCK, &sigs, NULL);
snip

Reading the manual pages, and unless I'm missing something, this is basically a no-op: SIG_BLOCK adds the given set to the current signal mask, and the set is empty. So either this should have been "sigprocmask(SIG_SETMASK, ...)" (which would unblock all signals) or the two lines can be deleted completely.

Any hints or ideas where to look into?

Bernd
--
Firmix Software GmbH http://www.firmix.at/
mobil: +43 664 4416156 fax: +43 1 7890849-55
Embedded Linux Development and Services
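The difference between SIG_BLOCK and SIG_SETMASK with an empty set can be checked outside the multiplexor with a small sketch like the one below. It uses only standard POSIX calls. The helper names is_blocked(), block_signal() and apply_empty_mask() are hypothetical and do not appear in mimedefang-multiplexor.c, and SIGUSR1 in the usage note is just an arbitrary test signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Returns 1 if sig is currently blocked, 0 if not, -1 on error. */
static int is_blocked(int sig)
{
    sigset_t cur;

    assert(sig > 0);
    /* With a NULL new set, "how" is ignored and only the mask is read. */
    if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, sig);
}

/* Adds sig to the blocked mask. Returns 0 on success, -1 on error. */
static int block_signal(int sig)
{
    sigset_t set;

    if (sigemptyset(&set) != 0 || sigaddset(&set, sig) != 0)
        return -1;
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Same two calls as the quoted snippet, with "how" left to the caller. */
static int apply_empty_mask(int how)
{
    sigset_t sigs;

    if (sigemptyset(&sigs) != 0)
        return -1;
    return sigprocmask(how, &sigs, NULL);
}
```

If SIGUSR1 is first blocked with block_signal(SIGUSR1), then apply_empty_mask(SIG_BLOCK) leaves is_blocked(SIGUSR1) at 1, because adding an empty set changes nothing. apply_empty_mask(SIG_SETMASK) replaces the whole mask with the empty set, so is_blocked(SIGUSR1) drops to 0.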