Re: [Mailman-Users] OutgoingRunner Failing
> Hi all, I have the OutgoingRunner process failing quite regularly, and this has resulted in a build up of mail in the qfiles/out directory. How do I find out what is going on in this process for it to be failing like this?

I experienced the same problem running Mailman 2.1.5 on a FreeBSD 4.6 box. I couldn't find anything telling in the logs. I raised the number of times the qrunner would restart from 10 to 50, but the problem still occurred. Now I have a cron job restart Mailman every 4 hours. Nothing backs up in the outgoing directory now, and I haven't noticed any residual problems from this change.

This doesn't get to the source of the problem though. :( I figured it might be something in the content of the messages, but with hundreds of lists running on 4 different servers it isn't something easily tracked down. Additionally, no other Mailman users complained about it until now.

Sean

--
Mailman-Users mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Re: [Mailman-Users] OutgoingRunner Failing (fwd)
Oh, I forgot to mention: I'm not running multiple slices, so Greenberg's patch doesn't apply.

Sean

-- Forwarded message --
Date: Wed, 15 Sep 2004 08:03:57 -0400 (EDT)
From: Sean [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [Mailman-Users] OutgoingRunner Failing

> Hi all, I have the OutgoingRunner process failing quite regularly, and this has resulted in a build up of mail in the qfiles/out directory. How do I find out what is going on in this process for it to be failing like this?

I experienced the same problem running Mailman 2.1.5 on a FreeBSD 4.6 box. I couldn't find anything telling in the logs. I raised the number of times the qrunner would restart from 10 to 50, but the problem still occurred. Now I have a cron job restart Mailman every 4 hours. Nothing backs up in the outgoing directory now, and I haven't noticed any residual problems from this change.

This doesn't get to the source of the problem though. :( I figured it might be something in the content of the messages, but with hundreds of lists running on 4 different servers it isn't something easily tracked down. Additionally, no other Mailman users complained about it until now.

Sean
Re: Fwd: Re: [Mailman-Users] OutgoingRunner Failing
David Richards wrote:

> Hi Brian,
>
> I have seen the same problem as you described in your post. I was running a shared installation via NFS, with OutgoingRunners running on each box. Would you expect this to have the same results?

While the idea of running MM on multiple machines in itself kind of makes my head hurt, I'm pretty sure that you're seeing a similar if not the same problem that I encountered. The logs you've posted look identical.

From what I could tell from the code, if you are running more than one instance of OutgoingRunner (or any qrunner, for that matter), you *will* have regular crashes. This is because (assuming 4 slices, numbered 0 through 3) each slice should manage 1/4 of the queue hash space. However, as coded, slice 0 will grab files from the *entire* queue, not just the first quarter. This results in a race condition. The qrunner crash is a result of both slice 0 and another slice seeing a file in the last 3/4 of the hash space, and both beginning to process it -- one will finish and erase the file, the other slice will crash.

Try making the following change. In mailman-2.1.5/Mailman/Queue/Switchboard.py, change line 167 from

    if not lower or (lower <= long(digest, 16) < upper):

to

    if (lower == upper) or (lower <= long(digest, 16) < upper):

This completely eliminated my problem.

Brian.

> Date: Tue, 14 Sep 2004 14:25:42 +0900
> From: Jim Tittsler [EMAIL PROTECTED]
> Subject: Re: [Mailman-Users] OutgoingRunner Failing
> To: David Richards [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
>
> On Sep 14, 2004, at 09:04, David Richards wrote:
>
>> I have the OutgoingRunner process failing quite regularly, and this has resulted in a build up of mail in the qfiles/out directory. How do I find out what is going on in this process for it to be failing like this?
>
> Are there any clues in your logs/error log? If you are lucky, there will be a traceback showing why OutgoingRunner is crashing.
>
> (Have you configured Mailman to run with multiple OutgoingRunners in your mm_cfg file? If so, check for Brian Greenberg's recent problem report and fix.)

--
Brian Greenberg
University of Manitoba
[EMAIL PROTECTED]
ACN -- Unix Software Admin
Tasklist and PGP key at http://home.cc.umanitoba.ca/~grnbrg
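The slice overlap Brian describes can be reproduced with a small model. The sketch below is not Mailman's code: `slice_bounds`, the 160-bit SHA-1 hash space, and the made-up message names are assumptions for illustration, and `int` stands in for Python 2's `long`. Only the two `if` conditions are taken from the post. The point it shows is that `not lower` is true both when `lower` is unset and when it is 0, so slice 0 skips the range check entirely.

```python
import hashlib

HASH_SPACE = 1 << 160  # assumed: digests are 160-bit SHA-1 values


def slice_bounds(slice_no, numslices):
    """Hypothetical bounds: [lower, upper) per slice, (None, None) if unsliced."""
    if numslices == 1:
        return None, None
    lower = HASH_SPACE * slice_no // numslices
    upper = HASH_SPACE * (slice_no + 1) // numslices
    return lower, upper


def accepts_original(lower, upper, digest):
    # Condition as posted before the change; `not lower` is also true for lower == 0.
    return (not lower) or (lower <= int(digest, 16) < upper)


def accepts_patched(lower, upper, digest):
    # Condition as posted after the change; only the unsliced case skips the range check.
    return (lower == upper) or (lower <= int(digest, 16) < upper)


# Stand-in queue entries: made-up message names hashed to real SHA-1 hex digests.
digests = [hashlib.sha1(("msg-%d" % i).encode()).hexdigest() for i in range(200)]
bounds = [slice_bounds(i, 4) for i in range(4)]


def claim_counts(accepts):
    """For each digest, count how many of the 4 slices would pick it up."""
    return [sum(accepts(lo, hi, d) for lo, hi in bounds) for d in digests]


original = claim_counts(accepts_original)
patched = claim_counts(accepts_patched)
print("claimed by two slices, original test:", sum(c == 2 for c in original))
print("claimed by two slices, patched test:", sum(c == 2 for c in patched))
```

With the original condition, every digest outside the first quarter is claimed by slice 0 and by its rightful slice, which is the race. With the patched condition, each digest is claimed exactly once, and an unsliced runner (both bounds unset) still accepts everything.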
[Mailman-Users] OutgoingRunner Failing
Hi all,

I have the OutgoingRunner process failing quite regularly, and this has resulted in a build up of mail in the qfiles/out directory. How do I find out what is going on in this process for it to be failing like this?

Mailman version: 2.1.5
Platform: Linux copperhead.qut.edu.au 2.4.9-e.27smp #1 SMP Tue Aug 5 15:49:54 EDT 2003 i686 unknown (Redhat Enterprise)

See logs:

Sep 13 17:18:04 2004 (12790) OutgoingRunner qrunner started.
(pid: 30230, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:11 2004 (11267) OutgoingRunner qrunner started.
(pid: 12790, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:15 2004 (14127) OutgoingRunner qrunner started.
(pid: 11267, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:22 2004 (11356) OutgoingRunner qrunner started.
(pid: 14127, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:23:51 2004 (16171) OutgoingRunner qrunner started.
(pid: 11356, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:23:56 2004 (12702) OutgoingRunner qrunner started.
(pid: 16171, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:24:09 2004 (16339) OutgoingRunner qrunner started.
(pid: 12702, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:24:13 2004 (2321) Qrunner OutgoingRunner reached maximum restart limit of 10, not restarting.
(pid: 16339, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 18:29:21 2004 (20319) OutgoingRunner qrunner started.
(pid: 20319, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 18:59:32 2004 (3045) OutgoingRunner qrunner started.
(pid: 3045, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:14:34 2004 (9761) OutgoingRunner qrunner started.
(pid: 9761, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:33:24 2004 (18412) OutgoingRunner qrunner started.
(pid: 18412, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:48:36 2004 (7608) Qrunner OutgoingRunner reached maximum restart limit of 10, not restarting.

Thanks,

David Richards
Senior Network Programmer
Information Technology Services
Queensland University of Technology
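The log excerpt above records restarts but not their cause. As a rough aid for seeing how often a runner restarts and when it hits the limit, a log of this shape could be tallied as sketched below. The two regular expressions are assumptions derived only from the wording of the lines quoted in this post, and `summarize` is a hypothetical helper, not part of Mailman.

```python
import re
from collections import Counter

# Patterns assumed from the log lines quoted in this thread.
STARTED = re.compile(r"\(\d+\) (\w+) qrunner started\.")
GAVE_UP = re.compile(r"Qrunner (\w+) reached maximum restart limit of (\d+)")


def summarize(lines):
    """Count 'qrunner started' lines per runner class and collect give-up events."""
    starts = Counter()
    gave_up = []
    for line in lines:
        match = STARTED.search(line)
        if match:
            starts[match.group(1)] += 1
        match = GAVE_UP.search(line)
        if match:
            gave_up.append((match.group(1), int(match.group(2))))
    return starts, gave_up


# Two lines copied from the excerpt above.
sample = [
    "Sep 13 17:24:09 2004 (16339) OutgoingRunner qrunner started.",
    "Sep 13 17:24:13 2004 (2321) Qrunner OutgoingRunner reached maximum "
    "restart limit of 10, not restarting.",
]
starts, gave_up = summarize(sample)
print(dict(starts), gave_up)
```

A tally like this only confirms the restart pattern. The reason for each exit would still have to come from a traceback in the error log, if one was written.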
Re: [Mailman-Users] OutgoingRunner Failing
On Sep 14, 2004, at 09:04, David Richards wrote:

> I have the OutgoingRunner process failing quite regularly, and this has resulted in a build up of mail in the qfiles/out directory. How do I find out what is going on in this process for it to be failing like this?

Are there any clues in your logs/error log? If you are lucky, there will be a traceback showing why OutgoingRunner is crashing.

(Have you configured Mailman to run with multiple OutgoingRunners in your mm_cfg file? If so, check for Brian Greenberg's recent problem report and fix.)

--
Jim Tittsler http://www.OnJapan.net/ GPG: 0x01159DB6
Python Starship http://Starship.Python.net/
Ringo MUG Tokyo http://www.ringo.net/rss.html