This is not a request for help but a report of experience in case someone else finds it helpful.

I recently migrated some old mailing lists into Mailman. They had previously run on different software (my own), and at first I assumed I'd need to keep two sets of archives, putting the old ones on my regular website (not the "lists." subdomain created by Mailman).

Then I saw in the FAQ that it was possible to edit list archives. The emphasis there was on deleting posts, but I reasoned that if this works for deleting posts, it should also work for adding them.

Fortunately my old archives were already in mbox format. Or rather, almost in mbox format. The old incarnation of my lists had been on a server where I had a low usage quota, so I had been downloading all archives over a year old and storing them on my home computer. In doing so, I had passed them through a word processor macro to do some minimal cleanup, which was chiefly to remove the ">" that mbox files put in front of body lines beginning with "From " ("From the historian's viewpoint," one subscriber wrote).
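For anyone repeating this, putting the ">" back can be sketched in a few lines of Python. This is only an illustration, not the method actually used above: the function name requote_from_lines is made up, and the SEPARATOR pattern assumes that genuine mbox separator lines look like "From sender Www Mmm dd hh:mm:ss yyyy". Archives written by other software may need a different pattern.

```python
import re

# Assumed shape of a genuine mbox separator line:
#   "From sender@example.org Mon Jan  5 10:00:00 2009"
SEPARATOR = re.compile(
    r"^From \S+ +\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}"
)

def requote_from_lines(text):
    """Prefix '>' to lines starting with 'From ' that are not separators."""
    out = []
    for line in text.splitlines(keepends=True):
        if line.startswith("From ") and not SEPARATOR.match(line):
            line = ">" + line
        out.append(line)
    return "".join(out)
```

A body line such as "From the historian's viewpoint," does not match the separator pattern, so it comes out as ">From the historian's viewpoint," while the real separator lines are left alone.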

Undoing that change was easy enough, but what I didn't notice was that word wrap had been imposed on some very long header lines (such as "DomainKey-Signature:"). This damaged the headers and made them appear to end sooner, with some of their data falling through into the message body.

Usually, when this happened, the "Date:" line would be in the part that fell through. Mailman seems to rely on this line when sorting posts by date (it does _not_ rely on the physical order of messages in the mbox file). In the absence of a "Date:" line in the header, Mailman seems to fall back on the current time, that is, the time at which it indexes the archive.
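One way to spot the affected posts before indexing is to ask Python's standard mailbox module which messages have no "Date:" header. This is a hedged sketch rather than anything Mailman itself does; the function name messages_missing_date is invented for illustration, and it assumes the file is readable as an mbox.

```python
import mailbox

def messages_missing_date(mbox_path):
    """Return (index, subject) for each message with no Date: header."""
    missing = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, msg in enumerate(box):
            # A header block cut short by a stray line break usually
            # leaves the Date: line stranded in the body, so the
            # parser reports it as absent.
            if msg["Date"] is None:
                missing.append((index, msg["Subject"]))
    finally:
        box.close()
    return missing
```

Any message this reports is a candidate for the header damage described above and is worth inspecting by hand.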

To fix this I had to go back through the imported mbox files and clean up the headers. Since I was doing this in vi over an SSH connection and couldn't see clearly whether there was a newline character or only a line that was too long for the screen, I decided the safest method was just to delete all those overlong headers. They shouldn't be needed in the archive anyway. (The "Received:" and "Delivered-To:" lines had long since been removed by my program, when it saved out a week's files and started a new archive.)
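The same deletion could be scripted instead of done by hand in vi. The sketch below is a heuristic under stated assumptions, not a tested tool: the name strip_overlong_headers and the UNWANTED list are hypothetical, every line starting with "From " is taken to be a message separator (so body lines must already be ">"-quoted), and the real header block is assumed to end at the first blank line. Within that block it drops each unwanted header together with everything up to the next line that looks like the start of a header, which also removes the fragments left by word wrap.

```python
import re

# A line that starts a header: a name of letters, digits and hyphens, then ":".
HEADER_START = re.compile(r"^[A-Za-z][A-Za-z0-9-]*:")

# Hypothetical list of headers not needed in the archive (lowercase).
UNWANTED = ("domainkey-signature:", "dkim-signature:")

def strip_overlong_headers(text):
    """Remove unwanted headers, plus their wrapped fragments, from mbox text."""
    out = []
    in_headers = False
    skipping = False
    for line in text.splitlines(keepends=True):
        if line.startswith("From "):
            # Assumed to be a separator: a new header block begins.
            in_headers, skipping = True, False
        elif in_headers and line.strip() == "":
            # The first blank line ends the header block.
            in_headers, skipping = False, False
        elif in_headers:
            if HEADER_START.match(line):
                skipping = line.lower().startswith(UNWANTED)
            # Continuation lines and wrap fragments share the fate of
            # the header they follow.
            if skipping:
                continue
        out.append(line)
    return "".join(out)
```

It would be prudent to run this on a copy and compare the result with the original before regenerating the archive.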

I also found some "Date:" lines that had been wrong from the start. One of my subscribers wrote that he had just switched to a Mac in order to clear a Windows-based virus out of his mailbox. Somehow his Macintosh had its system date set to August 27, 1956! Mailman made this the first post on the list, followed by a silence of over 40 years. I went back and corrected the date as well as I could and then indexed the archive all over again.
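Dates like that one can be caught before indexing by checking each "Date:" header against the years in which the list actually existed. Again this is only a sketch: suspicious_dates is an invented name, and the year bounds are whatever the list owner knows to be plausible.

```python
import mailbox
from email.utils import parsedate_to_datetime

def suspicious_dates(mbox_path, earliest_year, latest_year):
    """Return (index, raw Date header) for unparseable or out-of-range dates."""
    flagged = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, msg in enumerate(box):
            raw = msg["Date"]
            if raw is None:
                continue  # a missing Date: header is a separate problem
            try:
                year = parsedate_to_datetime(raw).year
            except (TypeError, ValueError):
                # Older Python versions raise TypeError, newer ones
                # ValueError, when the date cannot be parsed.
                flagged.append((index, raw))
                continue
            if not earliest_year <= year <= latest_year:
                flagged.append((index, raw))
    finally:
        box.close()
    return flagged
```

The flagged messages still have to be corrected by hand, since the true date can only be estimated from the neighbouring posts.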

Moral: You can import old mbox files to a Mailman archive, but be sure to clean up the headers before you generate the index.

--
Larry Kuenning
la...@qhpress.org
------------------------------------------------------
Mailman-Users mailing list Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
