Re: [Mailman-Users] OutgoingRunner Failing

2004-09-15 Thread Sean

> Hi all,
>
> I have the OutgoingRunner process failing quite regularly, and this has
> resulted in a build up of mail in the qfiles/out directory.  How do I
> find out what is going on in this process for it to be failing like
> this?

I experienced the same problem running Mailman 2.1.5 on a FreeBSD 4.6 box.
I couldn't find anything telling in the logs.  I raised the number of times
that the qrunner would restart from 10 to 50, but the problem still
occurred.  Now I have a cron job restart Mailman every 4 hours.  Nothing
backs up in the outgoing directory now, and I haven't noticed any residual
problems from this change.
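
For anyone who would rather watch the backlog than restart blindly, here is a rough sketch of a check that a cron job could run.  It only counts files in a queue directory.  The function names, the ".pck" extension and the threshold are assumptions for illustration, not something taken from Mailman itself.

```python
import os


def count_queued(queue_dir, extension=".pck"):
    """Return how many entries in queue_dir end with the given extension."""
    # The ".pck" default is an assumption about how queue entries are named.
    return sum(1 for name in os.listdir(queue_dir) if name.endswith(extension))


def backlog_exceeded(queue_dir, threshold):
    """True when the number of queued entries is above the threshold."""
    return count_queued(queue_dir) > threshold
```

Calling backlog_exceeded() on your qfiles/out path with a threshold that suits your traffic would show whether mail is actually piling up before anything gets restarted.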

This doesn't get to the source of the problem though. :(  I figured it
might be something with the content of the messages, but with hundreds of
lists running on 4 different servers it isn't something easily tracked
down.  Additionally, no other Mailman users complained about it until now.

Sean

--
Mailman-Users mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://www.python.org/cgi-bin/faqw-mm.py
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/


Re: [Mailman-Users] OutgoingRunner Failing (fwd)

2004-09-15 Thread sean

Oh, I forgot to mention.  I'm not running multiple slices, so Greenberg's
patch doesn't apply.

Sean



-- Forwarded message --
Date: Wed, 15 Sep 2004 08:03:57 -0400 (EDT)
From: Sean [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: [Mailman-Users] OutgoingRunner Failing


> Hi all,
>
> I have the OutgoingRunner process failing quite regularly, and this has
> resulted in a build up of mail in the qfiles/out directory.  How do I
> find out what is going on in this process for it to be failing like
> this?

I experienced the same problem running Mailman 2.1.5 on a FreeBSD 4.6 box.
I couldn't find anything telling in the logs.  I raised the number of times
that the qrunner would restart from 10 to 50, but the problem still
occurred.  Now I have a cron job restart Mailman every 4 hours.  Nothing
backs up in the outgoing directory now, and I haven't noticed any residual
problems from this change.

This doesn't get to the source of the problem though. :(  I figured it
might be something with the content of the messages, but with hundreds of
lists running on 4 different servers it isn't something easily tracked
down.  Additionally, no other Mailman users complained about it until now.

Sean



Re: Fwd: Re: [Mailman-Users] OutgoingRunner Failing

2004-09-14 Thread Brian Greenberg
David Richards wrote:
> Hi Brian,
> I have seen the same problem you described in your post.  I was running a
> shared installation via NFS, with OutgoingRunners running on each box.
>
> Would you expect this to have the same results?

While the idea of running MM on multiple machines in itself kind of makes
my head hurt, I'm pretty sure that you're seeing a similar if not the same
problem that I encountered.  The logs you've posted look identical.

From what I could tell from the code, if you are running more than one
instance of OutgoingRunner (or any qrunner, for that matter), you *will*
have regular crashes.  This is because (assuming 4 slices, numbered 0
through 3) each
slice should manage 1/4 of the queue hash space.  However, as coded, slice 
0 will grab files from the *entire* queue, not just the first quarter. 
This results in a race condition.  The qrunner crash is a result of both 
slice 0 and another slice seeing a file in the last 3/4 of the hash space, 
and both beginning to process it -- one will finish and erase the file, the 
other slice will crash.

Try making the following change:

In mailman-2.1.5/Mailman/Queue/Switchboard.py, change line 167 from

    if not lower or (lower <= long(digest, 16) < upper):

to

    if (lower == upper) or (lower <= long(digest, 16) < upper):

This completely eliminated my problem.
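
To make the overlap concrete, here is a small sketch in modern Python (using int where the 2.1.5 code uses long).  The helper names and the way slice_bounds() divides the hash space are assumptions made for illustration, not a copy of Switchboard.py.  Only the two conditions are taken from the change described above.

```python
SHAMAX = (1 << 160) - 1  # largest value of a 160-bit SHA-1 digest


def slice_bounds(slice_no, numslices):
    """Assumed partitioning: equal ranges, (None, None) for a single slice."""
    if numslices == 1:
        return None, None
    lower = ((SHAMAX + 1) * slice_no) // numslices
    upper = ((SHAMAX + 1) * (slice_no + 1)) // numslices
    return lower, upper


def original_match(lower, upper, digest):
    # `not lower` is true for None *and* for 0, so slice 0 matches everything.
    return (not lower) or (lower <= int(digest, 16) < upper)


def patched_match(lower, upper, digest):
    # Only the single-slice case (lower == upper == None) matches everything.
    return (lower == upper) or (lower <= int(digest, 16) < upper)


def claimants(match, digest, numslices):
    """Return the slice numbers whose test accepts this digest."""
    return [s for s in range(numslices)
            if match(*slice_bounds(s, numslices), digest)]


digest = "f" * 40  # a digest in the last quarter of the hash space
print(claimants(original_match, digest, 4))  # [0, 3]
print(claimants(patched_match, digest, 4))   # [3]
```

With the original test, two slices claim the same file, which is the race described above.  With the patched test, each file has exactly one owner, and a single-slice setup still sees the whole queue.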
Brian.

> Date: Tue, 14 Sep 2004 14:25:42 +0900
> From: Jim Tittsler [EMAIL PROTECTED]
> Subject: Re: [Mailman-Users] OutgoingRunner Failing
> To: David Richards [EMAIL PROTECTED]
> Cc: [EMAIL PROTECTED]
>
> On Sep 14, 2004, at 09:04, David Richards wrote:
>
> > I have the OutgoingRunner process failing quite regularly, and this has
> > resulted in a build up of mail in the qfiles/out directory.  How do I
> > find out what is going on in this process for it to be failing like this?
>
> Are there any clues in your logs/error log?  If you are lucky, there
> will be a traceback showing why OutgoingRunner is crashing.
>
> (Have you configured Mailman to run with multiple OutgoingRunners in
> your mm_cfg file?  If so, check for Brian Greenberg's recent problem
> report and fix.)

--
+++
+  Brian Greenberg   + University of Manitoba +
+   [EMAIL PROTECTED]   +   ACN -- Unix Software Admin   +
+-+
+ Tasklist and PGP key at http://home.cc.umanitoba.ca/~grnbrg +



[Mailman-Users] OutgoingRunner Failing

2004-09-13 Thread David Richards
Hi all,

I have the OutgoingRunner process failing quite regularly, and this has 
resulted in a build up of mail in the qfiles/out directory.  How do I find out 
what is going on in this process for it to be failing like this?

Mailman version: 2.1.5
Platform: Linux copperhead.qut.edu.au 2.4.9-e.27smp #1 SMP Tue Aug 5 15:49:54 
EDT 2003 i686 unknown  (Redhat Enterprise)

See logs:

Sep 13 17:18:04 2004 (12790) OutgoingRunner qrunner started.
(pid: 30230, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:11 2004 (11267) OutgoingRunner qrunner started.
(pid: 12790, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:15 2004 (14127) OutgoingRunner qrunner started.
(pid: 11267, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:20:22 2004 (11356) OutgoingRunner qrunner started.
(pid: 14127, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:23:51 2004 (16171) OutgoingRunner qrunner started.
(pid: 11356, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:23:56 2004 (12702) OutgoingRunner qrunner started.
(pid: 16171, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:24:09 2004 (16339) OutgoingRunner qrunner started.
(pid: 12702, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 17:24:13 2004 (2321) Qrunner OutgoingRunner reached maximum restart 
limit of 10, not restarting.
(pid: 16339, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 18:29:21 2004 (20319) OutgoingRunner qrunner started.
(pid: 20319, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 18:59:32 2004 (3045) OutgoingRunner qrunner started.
(pid: 3045, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:14:34 2004 (9761) OutgoingRunner qrunner started.
(pid: 9761, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:33:24 2004 (18412) OutgoingRunner qrunner started.
(pid: 18412, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1) [restarting]
Sep 13 19:48:36 2004 (7608) Qrunner OutgoingRunner reached maximum restart 
limit of 10, not restarting.
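
Every exit record above shows sig: None and sts: 1.  As a rough aid for reading longer logs, the sketch below tallies those records by runner class, slice, signal and status.  The regular expression is written against the lines shown here only, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches records such as:
# (pid: 30230, sig: None, sts: 1, class: OutgoingRunner, slice: 1/1)
EXIT_RE = re.compile(
    r"\(pid: (?P<pid>\d+), sig: (?P<sig>\w+), sts: (?P<sts>\w+), "
    r"class: (?P<cls>\w+), slice: (?P<slice>\d+/\d+)\)"
)


def summarize_exits(lines):
    """Count exit records per (class, slice, sig, sts) tuple."""
    counts = Counter()
    for line in lines:
        m = EXIT_RE.search(line)
        if m:
            counts[(m["cls"], m["slice"], m["sig"], m["sts"])] += 1
    return counts
```

Feeding it the lines of the log file would show at a glance whether a runner always exits with the same status or is sometimes killed by a signal.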

Thanks, 


David Richards
Senior Network Programmer
Information Technology Services
Queensland University of Technology


Re: [Mailman-Users] OutgoingRunner Failing

2004-09-13 Thread Jim Tittsler
On Sep 14, 2004, at 09:04, David Richards wrote:
> I have the OutgoingRunner process failing quite regularly, and this has
> resulted in a build up of mail in the qfiles/out directory.  How do I
> find out what is going on in this process for it to be failing like this?

Are there any clues in your logs/error log?  If you are lucky, there
will be a traceback showing why OutgoingRunner is crashing.
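
If the error log is large, a few lines of Python can pull out the tracebacks for you.  This is only a sketch: it looks for the standard Python traceback header and returns that line plus a fixed number of following lines.  The function name and the context size are arbitrary choices for illustration.

```python
def find_tracebacks(lines, context=20):
    """Return (line_number, chunk) pairs for each traceback header found.

    chunk is the header line plus up to `context` following lines.
    """
    marker = "Traceback (most recent call last):"
    found = []
    for i, line in enumerate(lines):
        if marker in line:
            found.append((i + 1, lines[i:i + 1 + context]))
    return found
```

Reading the error log into a list of lines and passing it to find_tracebacks() gives the line number of each traceback, so you can jump straight to it.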

(Have you configured Mailman to run with multiple OutgoingRunners in 
your mm_cfg file?  If so, check for Brian Greenberg's recent problem 
report and fix.)

--
Jim Tittsler http://www.OnJapan.net/  GPG: 0x01159DB6
Python Starship  http://Starship.Python.net/
Ringo MUG Tokyo  http://www.ringo.net/rss.html