Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-07 Thread Micah Anderson

The machine must have crashed just as a new message was being added,
so the pipermail.pck file was corrupt. The fix is to move the old
~mailman/archives/public/listname directory out of the way completely
and rerun ~mailman/bin/arch on the list, which I've done, and all is
good.
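
For anyone hitting the same thing, here is a minimal Python sketch of the
move-aside step. The helper name move_aside, the ".corrupt-" suffix, and the
example path in the comment are assumptions for illustration, not anything
shipped with Mailman. Nothing is deleted, so the old archive can still be
inspected afterwards.

```python
import os
import time


def move_aside(archive_dir, suffix=None):
    """Rename archive_dir so that arch can rebuild the archive from scratch.

    Returns the new path. The old directory is kept, only renamed.
    """
    archive_dir = os.path.normpath(archive_dir)  # drop any trailing slash
    if suffix is None:
        # Hypothetical naming scheme: a timestamp keeps repeated runs apart.
        suffix = time.strftime(".corrupt-%Y%m%d-%H%M%S")
    backup = archive_dir + suffix
    if os.path.exists(backup):
        raise FileExistsError(backup)
    os.rename(archive_dir, backup)
    return backup


# Example call (assumed path; substitute your own prefix and list name):
#   move_aside(os.path.expanduser("~mailman/archives/public/listname"))
# Then rerun ~mailman/bin/arch on the list.
```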



On Sun, 06 Jan 2002, Tom Perrine wrote:

  On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson [EMAIL PROTECTED] said:
 
 Micah The archives of a particular list went astray, so I tried to run arch
 Micah to re-pickle them, it went for some time (we have archives for a
 Micah couple years), up until December of this year, then it puked, anyone
 Micah know how to fix this??
 
 I have no idea if this is related, but my experience may be
 interesting...
 
 I have a mailing list that has been running since 1994.  It has about
 20,000 messages in the non-Mailman archives.  I have been importing
 and re-importing it into Mailman, as I was playing with some Mailman
 options and also correcting corruption in the original archives as
 part of migration into Mailman.
 
 The first few times I cat'ed all the archive files together and ran
 arch on the resulting huge mbox file.  This took overnight, most
 likely because the machine was heavily into swap, and trying to
 compute long lists of links, etc.
 
 Later, I started arch-ing the individual files separately.  This
 stayed out of swap and typically runs all the files in less than an
 hour!
 
 I guess what I'm getting at is that the arch program uses LOTS of
 memory when building some of the indexes, and that incremental
 arch-ing may be more efficient and less likely to exercise
 memory-related bugs in Mailman, Python, or the underlying system.
 
 It will also avoid the situations where you hit the per-process memory
 limits, which is what I *suspect* has happened to you.
 
 -- 
 Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center 
 http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
 The French are glad to die for love...  - Moulin Rouge

--
Mailman-Users maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users



Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-07 Thread Marc MERLIN

On Sun, Jan 06, 2002 at 01:41:49PM -0800, Tom Perrine wrote:
 I guess what I'm getting at is that the arch program uses LOTS of
 memory when building some of the indexes, and that incremental
 arch-ing may be more efficient and less likely to exercise
 memory-related bugs in Mailman, Python, or the underlying system.

I was asking about this recently (never had the time to try it out):
So you are saying that arch can be fed, let's say, weekly or monthly mbox
files and it will correctly generate the archive?

I've considered doing this on setups where I can't afford to run arch for
every single message posted (and delay qrunner as a result), but where I
should be able to run it for batches of messages once a night (strace has
shown me that arch is very slow at editing the huge HTML pages to add one
message at the bottom, but my hope is that if you add a batch, it does not
incrementally add the messages one by one).
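
Producing the monthly mbox files for such batches could look roughly like the
sketch below, which uses Python's standard mailbox module. The function name
split_mbox_by_month, the YYYY-MM.mbox naming, and the undated.mbox fallback
are assumptions for illustration. It only splits the mail; it says nothing
about how arch behaves when fed the resulting files.

```python
import email.utils
import mailbox
import os


def split_mbox_by_month(src_path, out_dir):
    """Copy each message in src_path into out_dir/YYYY-MM.mbox.

    Messages without a parseable Date header go to undated.mbox.
    Returns a dict mapping output file name to message count.
    """
    os.makedirs(out_dir, exist_ok=True)
    counts = {}
    outputs = {}
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            key = "undated"
            date_header = msg.get("Date")
            if date_header:
                try:
                    when = email.utils.parsedate_to_datetime(date_header)
                    key = when.strftime("%Y-%m")
                except (TypeError, ValueError):
                    pass  # unparseable date: keep the "undated" bucket
            name = key + ".mbox"
            if name not in outputs:
                outputs[name] = mailbox.mbox(os.path.join(out_dir, name))
            outputs[name].add(msg)
            counts[name] = counts.get(name, 0) + 1
    finally:
        for box in outputs.values():
            box.flush()
            box.close()
        src.close()
    return counts
```

The source mbox is only read, never modified, so the split can be rerun
safely into a fresh output directory.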

Marc
-- 
Microsoft is to operating systems  security 
   what McDonalds is to gourmet cooking
  
Home page: http://marc.merlins.org/   |   Finger [EMAIL PROTECTED] for PGP key




Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-06 Thread Tom Perrine

 On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson [EMAIL PROTECTED] said:

Micah The archives of a particular list went astray, so I tried to run arch
Micah to re-pickle them, it went for some time (we have archives for a
Micah couple years), up until December of this year, then it puked, anyone
Micah know how to fix this??

I have no idea if this is related, but my experience may be
interesting...

I have a mailing list that has been running since 1994.  It has about
20,000 messages in the non-Mailman archives.  I have been importing
and re-importing it into Mailman, as I was playing with some Mailman
options and also correcting corruption in the original archives as
part of migration into Mailman.

The first few times I cat'ed all the archive files together and ran
arch on the resulting huge mbox file.  This took overnight, most
likely because the machine was heavily into swap, and trying to
compute long lists of links, etc.

Later, I started arch-ing the individual files separately.  This
stayed out of swap and typically runs all the files in less than an
hour!

I guess what I'm getting at is that the arch program uses LOTS of
memory when building some of the indexes, and that incremental
arch-ing may be more efficient and less likely to exercise
memory-related bugs in Mailman, Python, or the underlying system.

It will also avoid the situations where you hit the per-process memory
limits, which is what I *suspect* has happened to you.
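
A minimal sketch of the one-file-at-a-time approach, in Python. It assumes
arch is invoked as "arch listname mboxfile" (check the usage text of bin/arch
in your Mailman version) and that the file names sort in chronological order.
The function name arch_in_batches and the runner parameter are hypothetical;
the latter exists only so the loop can be tried without Mailman installed.

```python
import subprocess


def arch_in_batches(arch_cmd, listname, mbox_paths, runner=subprocess.run):
    """Run arch once per mbox file, in sorted order, stopping at the first failure.

    Returns the list of files that were archived successfully.
    """
    done = []
    for path in sorted(mbox_paths):
        # Assumed invocation: arch <listname> <mbox>
        result = runner([arch_cmd, listname, path])
        if result.returncode != 0:
            raise RuntimeError(
                "arch failed on %s (exit %d)" % (path, result.returncode)
            )
        done.append(path)
    return done
```

Stopping at the first failure narrows any corruption or memory problem down
to a single input file instead of the whole concatenated mbox.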

-- 
Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center 
http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
The French are glad to die for love...  - Moulin Rouge
