Re: [Mailman-Users] arch python errors, bad marshall data
On Sun, Jan 06, 2002 at 01:41:49PM -0800, Tom Perrine wrote:
> I guess what I'm getting at is that the "arch" program uses LOTS of
> memory when building some of the indexes, and that incremental
> "arch"-ing may be more efficient and less likely to exercise
> memory-related bugs in Mailman, Python, or the underlying system.

I was asking about this recently (never had the time to try it out):
So you are saying that arch can be fed, let's say, weekly or monthly mbox files and it will correctly generate the archive?

I've considered doing this on setups where I can't afford to run arch for every single message posted (and delay qrunner as a result), but where I should be able to run it for batches of messages once a night. (strace has shown me that arch is very slow at editing the huge HTML pages to add one message at the bottom, but my hope is that if you add a batch, it does not add the messages one by one.)

Marc
-- 
Microsoft is to operating systems & security what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | Finger [EMAIL PROTECTED] for PGP key

--
Mailman-Users maillist - [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users
Re: [Mailman-Users] arch python errors, bad marshall data
The machine must have crashed just as a new message was being added, so the pipermail.pck file was corrupt. The fix is to move the old ~mailman/archives/public/listname directory out of the way completely and rerun ~mailman/bin/arch on the list, which I've done, and all is good.

On Sun, 06 Jan 2002, Tom Perrine wrote:
> On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson <[EMAIL PROTECTED]> said:
>
> Micah> The archives of a particular list went astray, so I tried to run arch
> Micah> to re-pickle them, it went for some time (we have archives for a
> Micah> couple years), up until December of this year, then it puked, anyone
> Micah> know how to fix this??
>
> I have no idea if this is related, but my experience may be
> interesting...
>
> I have a mailing list that has been running since 1994. It has about
> 20,000 messages in the non-Mailman archives. I have been importing
> and re-importing it into Mailman, as I was playing with some Mailman
> options and also correcting corruption in the original archives as
> part of migration into Mailman.
>
> The first few times I cat'ed all the archive files together and ran
> "arch" on the resulting huge mbox file. This took overnight, most
> likely because the machine was heavily into swap, and trying to
> compute long lists of links, etc.
>
> Later, I started "arch"-ing the individual files separately. This
> stayed out of swap and typically runs all the files in less than an
> hour!
>
> I guess what I'm getting at is that the "arch" program uses LOTS of
> memory when building some of the indexes, and that incremental
> "arch"-ing may be more efficient and less likely to exercise
> memory-related bugs in Mailman, Python, or the underlying system.
>
> It will also avoid the situations where you hit the per-process memory
> limits, which is what I *suspect* has happened to you.
>
> --
> Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center
> http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
> "The French are glad to die for love..." - Moulin Rouge
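For anyone who wants to script the "move it out of the way" step rather than do it by hand, here is a minimal sketch. It assumes a current Python 3 interpreter, not necessarily the one Mailman itself runs under. The helper name set_aside and the ".broken-<timestamp>" suffix are made up for illustration and are not part of Mailman; rerunning ~mailman/bin/arch afterwards is still a manual step.

```python
import os
import time


def set_aside(archive_dir, suffix=None):
    """Rename archive_dir to a sibling backup name and return the new path.

    Hypothetical helper: the function name and the default
    ".broken-<timestamp>" suffix are illustrative assumptions.
    """
    # Drop any trailing slash so the suffix is appended to the directory name.
    archive_dir = os.path.normpath(archive_dir)
    if suffix is None:
        suffix = time.strftime(".broken-%Y%m%d-%H%M%S")
    backup = archive_dir + suffix
    # Refuse to overwrite an earlier backup.
    if os.path.lexists(backup):
        raise FileExistsError(backup)
    # The backup is a sibling of the original, so this is a plain rename
    # and no archive data is copied.
    os.rename(archive_dir, backup)
    return backup
```

Keeping the old directory under a new name, instead of deleting it, leaves the damaged files available for inspection if the rebuild turns up something unexpected.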
Re: [Mailman-Users] arch python errors, bad marshall data
On Sun, 6 Jan 2002 12:12:27 -0800, Micah Anderson <[EMAIL PROTECTED]> said:

Micah> The archives of a particular list went astray, so I tried to run arch
Micah> to re-pickle them, it went for some time (we have archives for a
Micah> couple years), up until December of this year, then it puked, anyone
Micah> know how to fix this??

I have no idea if this is related, but my experience may be interesting...

I have a mailing list that has been running since 1994. It has about 20,000 messages in the non-Mailman archives. I have been importing and re-importing it into Mailman, as I was playing with some Mailman options and also correcting corruption in the original archives as part of migration into Mailman.

The first few times I cat'ed all the archive files together and ran "arch" on the resulting huge mbox file. This took overnight, most likely because the machine was heavily into swap, and trying to compute long lists of links, etc.

Later, I started "arch"-ing the individual files separately. This stayed out of swap and typically runs all the files in less than an hour!

I guess what I'm getting at is that the "arch" program uses LOTS of memory when building some of the indexes, and that incremental "arch"-ing may be more efficient and less likely to exercise memory-related bugs in Mailman, Python, or the underlying system.

It will also avoid the situations where you hit the per-process memory limits, which is what I *suspect* has happened to you.

--
Tom E. Perrine ([EMAIL PROTECTED]) | San Diego Supercomputer Center
http://www.sdsc.edu/~tep/ | Voice: +1.858.534.5000
"The French are glad to die for love..." - Moulin Rouge
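If the old archive only exists as one huge mbox, the "individual files" approach described above needs the mbox split up first. The following is a rough sketch of one way to do that with the standard mailbox module, assuming a current Python 3 interpreter. The function name split_mbox_by_month, the YYYY-MM.mbox naming, and the "undated" bucket are all assumptions made for illustration; nothing here is part of Mailman, and splitting by month is just one possible batch size.

```python
import mailbox
import os
from email.utils import parsedate_to_datetime


def split_mbox_by_month(src_path, out_dir):
    """Copy each message in src_path into out_dir/YYYY-MM.mbox.

    Hypothetical helper. Messages whose Date header is missing or cannot
    be parsed are collected in out_dir/undated.mbox. Returns the sorted
    list of files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    boxes = {}  # output path -> open mailbox.mbox
    src = mailbox.mbox(src_path, create=False)
    try:
        for msg in src:
            try:
                key = parsedate_to_datetime(msg["Date"]).strftime("%Y-%m")
            except (TypeError, ValueError):
                key = "undated"
            path = os.path.join(out_dir, key + ".mbox")
            if path not in boxes:
                boxes[path] = mailbox.mbox(path)
            boxes[path].add(msg)
    finally:
        # close() flushes pending writes for each output mailbox.
        for box in boxes.values():
            box.close()
        src.close()
    return sorted(boxes)
```

Only a handful of small mailboxes are written at a time, and the source mbox is read message by message, so the split itself should stay well clear of swap. The resulting files can then be archived one at a time in date order.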
[Mailman-Users] arch python errors, bad marshall data
The archives of a particular list went astray, so I tried to run arch to re-pickle them. It ran for some time (we have archives for a couple of years), up until December 2001, then it puked. Anyone know how to fix this?

2001-December
Traceback (innermost last):
  File "./arch", line 129, in ?
    main()
  File "./arch", line 118, in main
    archiver.processUnixMailbox(fp, Article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 528, in processUnixMailbox
    self.add_article(a)
  File "/usr/local/mailman/Mailman/Archiver/HyperArch.py", line 928, in add_article
    self.__super_add_article(article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 567, in add_article
    article.parentID = parentID = self.get_parent_info(arch, article)
  File "/usr/local/mailman/Mailman/Archiver/pipermail.py", line 601, in get_parent_info
    if parentID and not self.database.hasArticle(archive, parentID):
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 267, in hasArticle
    self.__openIndices(archive)
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 245, in __openIndices
    t = DumbBTree(os.path.join(arcdir, archive + '-' + i))
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 68, in __init__
    self.load()
  File "/usr/local/mailman/Mailman/Archiver/HyperDatabase.py", line 173, in load
    self.dict = marshal.load(fp)
ValueError: bad marshal data

Thanks!
Micah
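The traceback ends in marshal.load() on an index file named archive + '-' + i inside arcdir, so one way to narrow things down is to try loading each such file and see which ones fail. Below is a minimal sketch of that check, assuming a current Python 3 interpreter. The function name find_bad_marshal_files is made up for illustration, and the only thing taken from the traceback is the "archive-" file name prefix. One caveat: the marshal format is specific to the Python version, so a failure is only meaningful when the check runs under the same Python version that wrote the files.

```python
import marshal
import os


def find_bad_marshal_files(arcdir, archive):
    """Return sorted paths of "<archive>-*" files that marshal cannot load.

    Hypothetical helper; only the "<archive>-" prefix comes from the
    traceback. Run it under the Python version that wrote the files.
    """
    bad = []
    prefix = archive + "-"
    for name in sorted(os.listdir(arcdir)):
        path = os.path.join(arcdir, name)
        if not name.startswith(prefix) or not os.path.isfile(path):
            continue
        try:
            with open(path, "rb") as fp:
                marshal.load(fp)
        except (EOFError, ValueError, TypeError):
            # These are the errors marshal.load is documented to raise
            # when it cannot read a valid value.
            bad.append(path)
    return bad
```

A call such as find_bad_marshal_files(arcdir, "2001-December") would list the candidates for that month. The check only reads the files, so it is safe to run before deciding whether to rebuild the whole archive.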