Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-11-03 Thread Richard Barrett
Scott

Further to my earlier post on this topic, I have taken a look at the  
pipermail archiver code.

I concluded that there is a bug (or is it a feature?) which bloats the
size of the -article file in the pipermail "database" for each list.
This bloat will affect archiving performance, particularly for lists
with large amounts of traffic and/or those that receive large text
postings.

I think the bug has been around for a number of releases and it would  
explain why I had previously found shortening the archive period  
improved matters.

This may or may not be part of the problem you reported. I have posted
a patch to correct this problem, which you might like to try if you are
feeling particularly brave:

http://www.openinfo.co.uk/mailman/patches/835332/index.html

and here:

http://sourceforge.net/tracker/?func=detail&aid=835332&group_id=103&atid=100103

Feedback either +ve or -ve would be appreciated if you try the patch.
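
If you want to check whether your own lists are affected before trying
the patch, something along the following lines will show how large the
pipermail database files have grown (a minimal sketch; the prefix and
list name are my assumptions, so adjust for your installation):

import os

# Assumed paths for a stock /usr/local/mailman install; change to suit.
PREFIX = '/usr/local/mailman'
LISTNAME = 'mylist'
dbdir = os.path.join(PREFIX, 'archives', 'private', LISTNAME, 'database')

# Print the size of each pipermail database file; a -article file that
# keeps growing much faster than the list's traffic would suggest the
# bloat described above.
names = os.listdir(dbdir)
names.sort()
for name in names:
    path = os.path.join(dbdir, name)
    print '%10d  %s' % (os.path.getsize(path), name)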

Richard

On Friday, October 31, 2003, at 08:52  pm, Scott Lambert wrote:

On Fri, Oct 31, 2003 at 09:40:11AM -0500, Jon Carnes wrote:
On Fri, 2003-10-31 at 09:26, Jay West wrote:
I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the port.
MTA is sendmail 8.12.8p1

Very frequently I will see the ArchRunner process using 99+ % of cpu.
I have searched the archives and found lots of messages about qrunners
using large percentages of cpu, but they all seem to talk about the
fixes being related to actual mail processing (sendmail), not
archRunner. I am assuming that if the problem was mail delivery or
reception I would be seeing the large cpu use on a different qrunner
process. My issue is specific to the archrunner process which I don't
find much on in the archives/faq.

Well you've pegged it.  That was a bug in version 2.1.2 which is fixed
in 2.1.3.  The patch for 2.1.2 should still be available - you could
probably patch your running system and just leave it at that (an
upgrade will bring the patch in anyway).

I still see this problem with Mailman 2.1.3 for a high-volume list.

  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
66428 mailman   64   0   168M   147M CPU1   0 376.7H 99.02% 99.02% python2.3

That's the archiver process.  There are 1318 messages in the archive
queue...
12:00:28 Fri Oct 31 # truss -p 66428
break(0x114f6000)= 0 (0x0)
break(0x1302c000)= 0 (0x0)
break(0x114f8000)= 0 (0x0)
break(0x13030000)= 0 (0x0)
break(0x114fa000)= 0 (0x0)
break(0x13034000)= 0 (0x0)
break(0x114fc000)= 0 (0x0)
break(0x13038000)= 0 (0x0)
break(0x114fe000)= 0 (0x0)
break(0x1303c000)= 0 (0x0)
break(0x11500000)= 0 (0x0)
break(0x13040000)= 0 (0x0)
break(0x11502000)= 0 (0x0)
break(0x13044000)= 0 (0x0)
break(0x11504000)= 0 (0x0)
break(0x13048000)= 0 (0x0)
break(0x11506000)= 0 (0x0)
break(0x1304c000)= 0 (0x0)
Once I kill off the mailman queue runners and clean up the several lock
files for this mailing list, it runs just fine and manages to empty the
archive queue.

Two days' worth of mailman cron jobs were still stuck in the process
list.

Supposition: Maybe they were blocked by the list's lockfile?

So, it seems that the archRunner process went off the deep end somewhere
between two and three days ago.

I have the htdig patches for 2.1.3 installed.  Which might be germane...

--
Scott Lambert    KC5MLE    Unix SysAdmin
[EMAIL PROTECTED]


---
Richard Barrett   http://www.openinfo.co.uk


Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-11-01 Thread Brad Knowles
At 12:52 AM +0000 2003/11/01, Richard Barrett wrote:

 Rather than just theorize, feel free to make specific suggestions
 about the deficiencies and appropriate remedies based on the code
 being executed. Dare I say it, you could even submit a patch to
 fix any obvious errors in the code.
	I have said before, and I will say again, that I am not a 
programmer.  The last time I did any "real" programming was when I 
was a senior in college, before I graduated -- 1989.

	I can talk intelligently about mechanisms and techniques that are 
known to have specific flaws, but don't ask me to write or comment on 
code.  If you do, please restrict your languages to Bourne shell or 
maybe a bit of Perl (not too obfuscated, please).

--
Brad Knowles, <[EMAIL PROTECTED]>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.


Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-11-01 Thread Brad Knowles
At 9:29 PM -0500 2003/10/31, Scott Lambert wrote:

 If we were talking about more than 10,000 files, I might buy it.  But we
 are talking about 1300 files.
	Many filesystems start significantly slowing down around 1,000 
files, not 10,000.  Moreover, are you sure that this is the largest 
number of files you've ever had in that directory?

 Also the processing goes something like O(n), in reverse, slower as it
 processes the files in the directory.
	That is a bit strange, but might be explained by holes in the 
directory structure that need to be skipped.

 I might buy it staying slow if it started slow but it doesn't.
	I've seen mail servers at large freemail providers that had 
previously grown to very large sizes, and worked reasonably well for 
numbers of files in the low thousands, but seriously flaked out when 
pushed much beyond that.

	Move the directory aside, move the files to a new directory, and 
restart -- suddenly everything works like magic again.
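
For what it's worth, that rotation is easy to script; a minimal sketch,
assuming the spool lives at /var/spool/mqueue and the daemon that uses
it has already been stopped (both assumptions, adjust for your setup):

import os, shutil

# Assumed spool location; stop the MTA (or Mailman's qrunners) first.
SPOOL = '/var/spool/mqueue'
OLD = SPOOL + '.old'

st = os.stat(SPOOL)                  # remember ownership and permissions
os.rename(SPOOL, OLD)                # move the bloated directory aside
os.mkdir(SPOOL, 0700)
os.chown(SPOOL, st.st_uid, st.st_gid)
os.chmod(SPOOL, st.st_mode & 07777)  # copy the old mode bits across

# Move whatever queue files remain back into the fresh directory.
for name in os.listdir(OLD):
    shutil.move(os.path.join(OLD, name), os.path.join(SPOOL, name))
os.rmdir(OLD)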

	Unless you know the filesystem code intimately, as well as the 
code that is using the filesystem, it can be difficult to predict how 
or when things will break or how badly they will break.

--
Brad Knowles, <[EMAIL PROTECTED]>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.


Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Jon Carnes
On Fri, 2003-10-31 at 21:29, Scott Lambert wrote:
> On Sat, Nov 01, 2003 at 12:59:24AM +0100, Brad Knowles wrote:
> > At 6:21 PM -0500 2003/10/31, Scott Lambert wrote:
> > > I haven't looked at the code yet, and probably won't (ENOTIME), but
> > > it almost sounds to me like it's not pruning its list of handled
> > > messages and has to walk all of them each time.  I would have
> > > expected queue handling to get faster as the queue got smaller due
> > > to fewer files in the directory that it needs to search through.
> > > Maybe it's just a function of the python datastructure being used.
> >
> >   If it's using files as the queue mechanism, then deleting a file
> > simply marks the entry in the directory as "available", and it still
> > takes just as long to scan the directory afterwards as it did before.
> 
> If we were talking about more than 10,000 files, I might buy it.  But we
> are talking about 1300 files.  Also the processing goes something like
> O(n), in reverse, slower as it processes the files in the directory.  I
> might buy it staying slow if it started slow but it doesn't.
>  
To me it sounds like a memory problem.

I wonder how fast we can fix it?




Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Scott Lambert
On Sat, Nov 01, 2003 at 12:59:24AM +0100, Brad Knowles wrote:
> At 6:21 PM -0500 2003/10/31, Scott Lambert wrote:
> > I haven't looked at the code yet, and probably won't (ENOTIME), but
> > it almost sounds to me like it's not pruning its list of handled
> > messages and has to walk all of them each time.  I would have
> > expected queue handling to get faster as the queue got smaller due
> > to fewer files in the directory that it needs to search through.
> > Maybe it's just a function of the python datastructure being used.
>
>   If it's using files as the queue mechanism, then deleting a file
> simply marks the entry in the directory as "available", and it still
> takes just as long to scan the directory afterwards as it did before.

If we were talking about more than 10,000 files, I might buy it.  But we
are talking about 1300 files.  Also the processing goes something like
O(n), in reverse, slower as it processes the files in the directory.  I
might buy it staying slow if it started slow but it doesn't.
 
-- 
Scott Lambert    KC5MLE    Unix SysAdmin
[EMAIL PROTECTED]  




Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Richard Barrett
On Friday, October 31, 2003, at 11:59  pm, Brad Knowles wrote:

At 6:21 PM -0500 2003/10/31, Scott Lambert wrote:

 I haven't looked at the code yet, and probably won't (ENOTIME), but it
 almost sounds to me like it's not pruning its list of handled messages
 and has to walk all of them each time.  I would have expected queue
 handling to get faster as the queue got smaller due to fewer files
 in the directory that it needs to search through.  Maybe it's just a
 function of the python datastructure being used.
	If it's using files as the queue mechanism, then deleting a file  
simply marks the entry in the directory as "available", and it still  
takes just as long to scan the directory afterwards as it did before.

	This is a known problem with many MTAs handling large amounts of  
messages, and is one reason why you should use a hashed directory  
scheme for your mail queue (a la postfix), or you should periodically  
stop the MTA, move the mail queue directory aside, create a new mail  
queue directory (with appropriate ownership and permissions), then  
move what messages may remain from the old queue back into the new one  
(or fire up queue runners to clear the old queue while the new one is  
being used for new mail).

In MM 2.1.3, the relevant code is the files() function in
$prefix/Mailman/Queue/Switchboard.py, starting at line 204, which is
called from $prefix/Mailman/Queue/Runner.py at line 89 when subclassed
by $prefix/Mailman/Queue/ArchRunner.py.
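
In outline, that function has to rescan the queue directory on every
pass and order the entries; an illustrative sketch only (not the actual
Switchboard code; the .pck extension and the timestamp-prefixed file
names are my assumptions about the 2.1 qfiles layout):

import os

def queued_entries(qdir, extension='.pck'):
    # Scan the queue directory and return entry base names oldest-first.
    times = {}
    for filename in os.listdir(qdir):
        base, ext = os.path.splitext(filename)
        if ext == extension:
            # Assumption: the base name starts with a float timestamp,
            # so sorting on it gives FIFO order.
            when = float(base.split('+')[0])
            times[when] = base
    keys = times.keys()
    keys.sort()
    return [times[k] for k in keys]

The cost of each pass is at least one full scan of the directory, which
ought to be trivial at 1300 entries unless the directory itself has
degraded.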

Rather than just theorize, feel free to make specific suggestions about  
the deficiencies and appropriate remedies based on the code being  
executed. Dare I say it, you could even submit a patch to fix any  
obvious errors in the code.

	Mailman could very easily be suffering from the same sort of problem  
-- once you get a directory with a large number of entries in it, it  
takes a long time to scan it even if there are only a few files that  
are currently visible.  Same problem, perhaps the same solution?

-- Brad Knowles, <[EMAIL PROTECTED]>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.



Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Brad Knowles
At 6:21 PM -0500 2003/10/31, Scott Lambert wrote:

 I haven't looked at the code yet, and probably won't (ENOTIME), but it
 almost sounds to me like it's not pruning its list of handled messages
 and has to walk all of them each time.  I would have expected queue
 handling to get faster as the queue got smaller due to fewer files
 in the directory that it needs to search through.  Maybe it's just a
 function of the python datastructure being used.
	If it's using files as the queue mechanism, then deleting a file 
simply marks the entry in the directory as "available", and it still 
takes just as long to scan the directory afterwards as it did before.
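
That behaviour is easy enough to demonstrate; a rough sketch (the
temporary directory and file counts are arbitrary choices, and how much
the effect shows depends on the filesystem) that builds a large
directory, deletes almost everything, and times the scans before and
after:

import os, time

# Arbitrary test location and size; this only illustrates that scan time
# can track the directory's historical size, not its current contents.
TESTDIR = '/tmp/dirscan-test'
COUNT = 20000

os.mkdir(TESTDIR)
for i in range(COUNT):
    open(os.path.join(TESTDIR, 'f%06d' % i), 'w').close()

def time_scan():
    start = time.time()
    names = os.listdir(TESTDIR)
    return time.time() - start, len(names)

print 'full directory:  %.3fs for %d entries' % time_scan()

# Delete all but a handful of files; the directory itself does not
# necessarily shrink, so the scan may stay nearly as slow.
for i in range(COUNT - 10):
    os.unlink(os.path.join(TESTDIR, 'f%06d' % i))

print 'after deletions: %.3fs for %d entries' % time_scan()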

	This is a known problem with many MTAs handling large amounts of 
messages, and is one reason why you should use a hashed directory 
scheme for your mail queue (a la postfix), or you should periodically 
stop the MTA, move the mail queue directory aside, create a new mail 
queue directory (with appropriate ownership and permissions), then 
move what messages may remain from the old queue back into the new 
one (or fire up queue runners to clear the old queue while the new 
one is being used for new mail).
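
The hashed layout is simple enough to sketch; this is my own
illustration (a single level keyed on the first character of the queue
ID), not postfix's actual scheme, which goes deeper and is configurable:

import os

def hashed_path(spool, queue_id):
    # Put each queue file in a subdirectory named after the first
    # character of its queue ID, so no single directory grows huge.
    subdir = os.path.join(spool, queue_id[0])
    if not os.path.isdir(subdir):
        os.makedirs(subdir)
    return os.path.join(subdir, queue_id)

# e.g. hashed_path('/var/spool/postfix/incoming', '7A3F2B19D4')
#      -> '/var/spool/postfix/incoming/7/7A3F2B19D4'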

	Mailman could very easily be suffering from the same sort of 
problem -- once you get a directory with a large number of entries in 
it, it takes a long time to scan it even if there are only a few 
files that are currently visible.  Same problem, perhaps the same 
solution?

--
Brad Knowles, <[EMAIL PROTECTED]>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.


Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Scott Lambert
On Fri, Oct 31, 2003 at 03:52:34PM -0500, Scott Lambert wrote:
> Once I kill off the mailman queue runners and clean up the several lock
> files for this mailing list, it runs just fine and manages to empty the
> archive queue.

Well, the above statement is not entirely accurate.  It was working
quickly immediately after restart but went downhill.  I logged out and
took care of other things after seeing it move a good number of messages
in a short amount of time.  Five hours later, it still had 377 messages
in the archive queue and was taking several minutes per message.  I
trussed it again and saw more of the incredibly long series of breaks,
but watched it longer than I did this morning.  After a lot of breaks it
goes to a lot of writes then does some file stuff quickly and repeats for 
the next message.

I restarted the queue runners again and it processed forty or so
messages quickly, then began the downward spiral again.  By the time the
queue was down to 177 entries, it was back to 3 minutes per message and
climbing.  Restarting knocked it down pretty quickly for a while, then
it started taking longer again.  I was watching more closely this time.
After a couple more restart cycles, the queue was cleaned out quickly
and all is well.

I haven't looked at the code yet, and probably won't (ENOTIME), but it
almost sounds to me like it's not pruning its list of handled messages
and has to walk all of them each time.  I would have expected queue
handling to get faster as the queue got smaller due to fewer files
in the directory that it needs to search through.  Maybe it's just a
function of the Python data structure being used.

The fact that it runs fast right after a restart makes me doubt that the
size of the archive is what's at issue.

The server we are using is a dual PIII450 machine.  I would guess this
would not show as such a big problem on a more modern system, but other
than the archiver, this box is more than enough for the load on it.

The dual processor aspect of this box is what allows us to miss the
archiver running off the deep end until someone complains that the
archive search feature is broken.  The mail passes through the system
just fine using the other processor. 
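
A cheap way to catch this earlier is to watch the archive queue depth
over time; a minimal sketch (the qfiles path below matches a stock
/usr/local/mailman install and is an assumption for anything else):

import os, time

# Assumed location of ArchRunner's queue for a default port install.
QDIR = '/usr/local/mailman/qfiles/archive'

# Log the queue depth once a minute; a depth that only ever grows while
# ArchRunner is burning CPU is the signature of the problem in this thread.
while 1:
    depth = len(os.listdir(QDIR))
    print time.strftime('%H:%M:%S'), depth
    time.sleep(60)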

 38M    2003-October.txt
 13M    2003-October.txt.gz
 48M    portsidelist.mbox

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
[EMAIL PROTECTED]  




Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Richard Barrett
On Friday, October 31, 2003, at 08:52  pm, Scott Lambert wrote:

On Fri, Oct 31, 2003 at 09:40:11AM -0500, Jon Carnes wrote:
On Fri, 2003-10-31 at 09:26, Jay West wrote:
I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the port.
MTA is sendmail 8.12.8p1

Very frequently I will see the ArchRunner process using 99+ % of cpu.
I have searched the archives and found lots of messages about qrunners
using large percentages of cpu, but they all seem to talk about the
fixes being related to actual mail processing (sendmail), not
archRunner. I am assuming that if the problem was mail delivery or
reception I would be seeing the large cpu use on a different qrunner
process. My issue is specific to the archrunner process which I don't
find much on in the archives/faq.

Well you've pegged it.  That was a bug in version 2.1.2 which is fixed
in 2.1.3.  The patch for 2.1.2 should still be available - you could
probably patch your running system and just leave it at that (an
upgrade will bring the patch in anyway).

I still see this problem with Mailman 2.1.3 for a high-volume list.

  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
66428 mailman   64   0   168M   147M CPU1   0 376.7H 99.02% 99.02% python2.3

That's the archiver process.  There are 1318 messages in the archive
queue...
12:00:28 Fri Oct 31 # truss -p 66428
break(0x114f6000)= 0 (0x0)
break(0x1302c000)= 0 (0x0)
break(0x114f8000)= 0 (0x0)
break(0x13030000)= 0 (0x0)
break(0x114fa000)= 0 (0x0)
break(0x13034000)= 0 (0x0)
break(0x114fc000)= 0 (0x0)
break(0x13038000)= 0 (0x0)
break(0x114fe000)= 0 (0x0)
break(0x1303c000)= 0 (0x0)
break(0x11500000)= 0 (0x0)
break(0x13040000)= 0 (0x0)
break(0x11502000)= 0 (0x0)
break(0x13044000)= 0 (0x0)
break(0x11504000)= 0 (0x0)
break(0x13048000)= 0 (0x0)
break(0x11506000)= 0 (0x0)
break(0x1304c000)= 0 (0x0)
Once I kill off the mailman queue runners and clean up the several lock
files for this mailing list, it runs just fine and manages to empty the
archive queue.
Two days' worth of mailman cron jobs were still stuck in the process
list.

Supposition: Maybe they were blocked by the list's lockfile?

So, it seems that the archRunner process went off the deep end somewhere
between two and three days ago.

I have the htdig patches for 2.1.3 installed.  Which might be germane...

If you are referring to patch #444884 then, while I would never say
never, it is not very likely to be the cause. The code inserted by
patch #444884 impinges very little on the execution path taken when
mail is being archived and archive pages are being generated by
pipermail. If you discover anything different, let me know and I'll take
another look at the htdig integration patch.

You say you have the problem with a high-volume list.  What sort of
message sizes and traffic volume is the list handling? Do the messages
tend to have large attachments? I have found that the internal
pipermail archiver starts to choke on high-volume lists, and on at
least one of the lists I run the solution I adopted was to reduce the
archiving period from a month to a week, which seemed to alleviate the
problem. I suspect the problem is partially related to the pickled data
structures that pipermail uses to control archiver operation and index
generation.
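
If anyone wants to try the same workaround, the archive period is the
per-list archive_volume_frequency setting, which can also be changed
from the web admin GUI; a minimal bin/withlist sketch, assuming the 2.1
value ordering in which 3 means weekly (check that mapping against your
own install before trusting it):

# set_weekly.py -- run as: bin/withlist -l -r set_weekly <listname>
# Hypothetical helper; the value 3 == "Weekly" is an assumption to
# verify against your Mailman 2.1 GUI/Defaults.py before use.

def set_weekly(mlist):
    mlist.archive_volume_frequency = 3
    mlist.Save()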

I'm now using a fairly tight Mailman/MHonArc integration for such
lists; I developed it because MHonArc has a reputation for handling
large archives better than pipermail but I still wanted MM list archive
privacy, my htdig integration, etc. A patch for this is available at
http://www.openinfo.co.uk/mailman/patches/mhonarc/index.html or as MM
patch #820723 on sourceforge. It subcontracts MHonArc to generate the
message and period index pages in the normal
$prefix/archives/private/<listname>/ directory structure while the
pipermail/MM code looks after the top level index, archive control and
access control. The integration makes the choice of pipermail or
MHonArc a per-list option, so if you change your mind or decide it was
all a big mistake it is not a disaster; select the archiver of choice
and run $prefix/bin/arch --wipe to have it regenerate the list archive
from its mbox file.

So far this MM/MH integration has worked OK for me but that's a single 
data point.

Enough over-selling of a free product and the usual caveat emptor :) 
but if you give it a try let me know how you get on.

--

Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Scott Lambert
On Fri, Oct 31, 2003 at 09:40:11AM -0500, Jon Carnes wrote:
> On Fri, 2003-10-31 at 09:26, Jay West wrote:
> > I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the port. MTA
> > is sendmail 8.12.8p1
> > 
> > Very frequently I will see the ArchRunner process using 99+ % of cpu. I have
> > searched the archives and found lots of messages about qrunners using large
> > percentages of cpu, but they all seem to talk about the fixes being related
> > to actual mail processing (sendmail), not archRunner. I am assuming that if
> > the problem was mail delivery or reception I would be seeing the large cpu
> > use on a different qrunner process. My issue is specific to the archrunner
> > process which I don't find much on in the archives/faq.
> > 
> Well you've pegged it.  That was a bug in version 2.1.2 which is fixed
> in 2.1.3.  The patch for 2.1.2 should still be available - you could
> probably patch your running system and just leave it at that (an upgrade
> will bring the patch in anyway).

I still see this problem with Mailman 2.1.3 for a high-volume list.

  PID USERNAME PRI NICE  SIZE   RES STATE  C   TIME   WCPU    CPU COMMAND
66428 mailman   64   0   168M   147M CPU1   0 376.7H 99.02% 99.02% python2.3

That's the archiver process.  There are 1318 messages in the archive
queue...

12:00:28 Fri Oct 31 # truss -p 66428
break(0x114f6000)= 0 (0x0)
break(0x1302c000)= 0 (0x0)
break(0x114f8000)= 0 (0x0)
break(0x13030000)= 0 (0x0)
break(0x114fa000)= 0 (0x0)
break(0x13034000)= 0 (0x0)
break(0x114fc000)= 0 (0x0)
break(0x13038000)= 0 (0x0)
break(0x114fe000)= 0 (0x0)
break(0x1303c000)= 0 (0x0)
break(0x11500000)= 0 (0x0)
break(0x13040000)= 0 (0x0)
break(0x11502000)= 0 (0x0)
break(0x13044000)= 0 (0x0)
break(0x11504000)= 0 (0x0)
break(0x13048000)= 0 (0x0)
break(0x11506000)= 0 (0x0)
break(0x1304c000)= 0 (0x0)

Once I kill off the mailman queue runners and clean up the several lock
files for this mailing list, it runs just fine and manages to empty the
archive queue.

Two days' worth of mailman cron jobs were still stuck in the process list.

Supposition: Maybe they were blocked by the list's lockfile?

So, it seems that the archRunner process went off the deep end somewhere
between two and three days ago.

I have the htdig patches for 2.1.3 installed.  Which might be germane...

-- 
Scott Lambert    KC5MLE    Unix SysAdmin
[EMAIL PROTECTED]  




Re: [Mailman-Users] Problem with archrunner using large %'s of cpu (read faq & archives)

2003-10-31 Thread Jon Carnes
Well you've pegged it.  That was a bug in version 2.1.2 which is fixed
in 2.1.3.  The patch for 2.1.2 should still be available - you could
probably patch your running system and just leave it at that (an upgrade
will bring the patch in anyway).

Good Luck - Jon Carnes

On Fri, 2003-10-31 at 09:26, Jay West wrote:
> I'm using Mailman 2.1.2 on FreeBSD v4.8-Release, built using the port. MTA
> is sendmail 8.12.8p1
> 
> Very frequently I will see the ArchRunner process using 99+ % of cpu. I have
> searched the archives and found lots of messages about qrunners using large
> percentages of cpu, but they all seem to talk about the fixes being related
> to actual mail processing (sendmail), not archRunner. I am assuming that if
> the problem was mail delivery or reception I would be seeing the large cpu
> use on a different qrunner process. My issue is specific to the archrunner
> process which I don't find much on in the archives/faq.
> 
> I am using a pretty default install, haven't tweaked anything. If it
> helps... here are some possibly germane things:
> 
> 1) I never seem to be able to catch anything in
> /usr/local/mailman/qfiles/archive, but that may be a timing thing, as my
> archives do appear to be getting updated.
> 2) I looked in the /usr/local/mailman/archives/private/*.mbox directories,
> and find listname.mbox at 33mb and listname.mbox.1 at 54mb. Could it be that
> these files are just so big that it takes huge amounts of cpu to add posts
> to these? I'm guessing they are the archives. This gives rise to several
> questions (someone else maintained this setup before I did). Does mailman
> split them (the .1 file), or can I just rename listname.mbox to
> listname.mbox.2 and mailman will have a smaller chunk to deal with?
> 
> Any thoughts? Thanks in advance!!!
> 
> I have another question or two but will post separately for them.
> 
> Jay West
> 

