Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-07 Thread Marc MERLIN

On Sun, Jan 06, 2002 at 01:41:49PM -0800, Tom Perrine wrote:
> I guess what I'm getting at is that the "arch" program uses LOTS of
> memory when building some of the indexes, and that incremental
> "arch"-ing may be more efficient and less likely to exercise
> memory-related bugs in Mailman, Python, or the underlying system.

I was asking about this recently (never had the time to try it out):
So you are saying that arch can be fed, say, weekly or monthly mbox
files and it will correctly generate the archive?

I've considered doing this on setups where I can't afford to run arch for
every single message posted (and delay qrunner as a result), but where I
should be able to run it for batches of messages once a night. strace has
shown me that arch is very slow at editing the huge HTML pages to add one
message at the bottom, but my hope is that if you add a batch, it does not
incrementally add the messages one by one.

Marc
-- 
Microsoft is to operating systems & security 
   what McDonalds is to gourmet cooking
  
Home page: http://marc.merlins.org/   |   Finger [EMAIL PROTECTED] for PGP key

--
Mailman-Users maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users



Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-07 Thread Micah Anderson

The machine must have crashed just as a new message was being added,
so the pipermail.pck file was corrupt. The fix is to move the old
~mailman/archives/public/listname directory out of the way completely
and rerun ~mailman/bin/arch on the list, which I've done, and all is
good.
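
Before moving the whole directory aside, it can help to see which index
files are actually damaged. The following is only a minimal sketch in
modern Python, not part of Mailman: find_unreadable_marshal_files is a
hypothetical helper, and the directory you pass in is assumed to be
wherever the index files named in the traceback live. The marshal format
depends on the Python version, so run it with the same interpreter Mailman
uses; otherwise a file may be reported as unreadable only because of a
version mismatch.

```python
import marshal
import os


def find_unreadable_marshal_files(directory):
    """Return the sorted names of files in `directory` that marshal.load rejects.

    Hypothetical helper: it only reads the files and never modifies them.
    """
    bad = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            with open(path, "rb") as fp:
                marshal.load(fp)
        except (EOFError, ValueError, TypeError):
            # marshal.load raises one of these on truncated or corrupt data
            bad.append(name)
    return bad
```

If the list comes back empty, the marshal files are probably fine and the
damage is elsewhere (for example in the pickle file), in which case the
move-and-rerun fix above is still the safe route.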



On Sun, 06 Jan 2002, Tom Perrine wrote:

> > On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson <[EMAIL PROTECTED]> said:
> 
> Micah> The archives of a particular list went astray, so I tried to run arch
> Micah> to re-pickle them, it went for some time (we have archives for a
> Micah> couple years), up until December of this year, then it puked, anyone
> Micah> know how to fix this??
> 
> I have no idea if this is related, but my experience may be
> interesting...
> 
> I have a mailing list that has been running since 1994.  It has about
> 20,000 messages in the non-Mailman archives.  I have been importing
> and re-importing it into Mailman, as I was playing with some Mailman
> options and also correcting corruption in the original archives as
> part of migration into Mailman.
> 
> The first few times I cat'ed all the archive files together and ran
> "arch" on the resulting huge mbox file.  This took overnight, most
> likely because the machine was heavily into swap, and trying to
> compute long lists of links, etc.
> 
> Later, I started "arch"-ing the individual files separately.  This
> stayed out of swap and typically runs all the files in less than an
> hour!
> 
> I guess what I'm getting at is that the "arch" program uses LOTS of
> memory when building some of the indexes, and that incremental
> "arch"-ing may be more efficient and less likely to exercise
> memory-related bugs in Mailman, Python, or the underlying system.
> 
> It will also avoid the situations where you hit the per-process memory
> limits, which is what I *suspect* has happened to you.
> 
> -- 
> Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center 
> http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
> "The French are glad to die for love..."  - Moulin Rouge




Re: [Mailman-Users] arch python errors, bad marshall data

2002-01-06 Thread Tom Perrine

> On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson <[EMAIL PROTECTED]> said:

Micah> The archives of a particular list went astray, so I tried to run arch
Micah> to re-pickle them, it went for some time (we have archives for a
Micah> couple years), up until December of this year, then it puked, anyone
Micah> know how to fix this??

I have no idea if this is related, but my experience may be
interesting...

I have a mailing list that has been running since 1994.  It has about
20,000 messages in the non-Mailman archives.  I have been importing
and re-importing it into Mailman, as I was playing with some Mailman
options and also correcting corruption in the original archives as
part of migration into Mailman.

The first few times I cat'ed all the archive files together and ran
"arch" on the resulting huge mbox file.  This took overnight, most
likely because the machine was heavily into swap, and trying to
compute long lists of links, etc.

Later, I started "arch"-ing the individual files separately.  This
stayed out of swap and typically runs all the files in less than an
hour!

I guess what I'm getting at is that the "arch" program uses LOTS of
memory when building some of the indexes, and that incremental
"arch"-ing may be more efficient and less likely to exercise
memory-related bugs in Mailman, Python, or the underlying system.

It will also avoid the situations where you hit the per-process memory
limits, which is what I *suspect* has happened to you.
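
For anyone starting from one huge mbox rather than separate files, here is
a rough sketch of how it could be cut into per-month pieces that can then
be fed to "arch" one at a time. This is an illustration in modern Python
using the standard mailbox module, not something shipped with Mailman;
split_mbox_by_month and the "YYYY-MM.mbox" output names are made up for
the example, and messages without a usable Date header go into
"undated.mbox".

```python
import mailbox
import os
from email.utils import parsedate_to_datetime


def split_mbox_by_month(src_path, out_dir):
    """Split one mbox into per-month mbox files; return {month: message count}."""
    os.makedirs(out_dir, exist_ok=True)
    counts = {}
    outputs = {}  # month label -> open mailbox.mbox for that month
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            try:
                month = parsedate_to_datetime(msg["Date"]).strftime("%Y-%m")
            except (TypeError, ValueError):
                # missing or unparseable Date header
                month = "undated"
            if month not in outputs:
                out_path = os.path.join(out_dir, month + ".mbox")
                outputs[month] = mailbox.mbox(out_path)
            outputs[month].add(msg)
            counts[month] = counts.get(month, 0) + 1
    finally:
        for box in outputs.values():
            box.flush()
            box.close()
        src.close()
    return counts
```

Grouping by the Date header is a simplification: messages with wrong
dates end up in the wrong month, so it is worth checking the counts it
returns before archiving the pieces.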

-- 
Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center 
http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
"The French are glad to die for love..."  - Moulin Rouge




[Mailman-Users] arch python errors, bad marshall data

2002-01-06 Thread Micah Anderson

The archives of a particular list went astray, so I tried to run arch
to re-pickle them, it went for some time (we have archives for a
couple years), up until December of this year, then it puked, anyone
know how to fix this??

2001-December
Traceback (innermost last):
  File "./arch", line 129, in ?
    main()
  File "./arch", line 118, in main
    archiver.processUnixMailbox(fp, Article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 528, in processUnixMailbox
    self.add_article(a)
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 928, in add_article
    self.__super_add_article(article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 567, in add_article
    article.parentID = parentID = self.get_parent_info(arch, article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 601, in get_parent_info
    if parentID and not self.database.hasArticle(archive, parentID):
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 267, in hasArticle
    self.__openIndices(archive)
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 245, in __openIndices
    t = DumbBTree(os.path.join(arcdir, archive + '-' + i))
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 68, in __init__
    self.load()
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 173, in load
    self.dict = marshal.load(fp)
ValueError: bad marshal data


Thanks!
Micah
