Paul Tomblin wrote:
> Quoting G. Armour Van Horn ([EMAIL PROTECTED]):
>
>> After moving the list to its new home and running the script to update
>> the archive, I ended up with a raft of messages in the January 2007
>> archive that are probably ancient. They show no subject, and all of them
>> are dated this afternoon, probably at the time that I ran the script. Is
>> there any safe way to clear those out?
>
> That happened to me when I moved my archives because I had old messages
> that had an "unescaped" "From " line in the body. I guess there was a
> time when pipermail didn't put a ">" in front of the word "From " in the
> body of a message, and so when I ran "arch" on that mbox I got a lot of
> gibberish messages dated today. The user contributed program "cleanarch"
> can help fix up some (but not all) of those and I had to use sed to fix
> the rest. Another problem I ran into was some messages that came around
> 1 Jan 2000 that had a date of 1 Jan 100. I also discovered some very old
> messages that had a header line of
> Content-Type: TEXT/PLAIN; charset=".chrsc"
> which confused arch as well. It wasn't until I fixed all
> of these problems that I was able to finally run arch in a way that built
> good archives.

I spoke too soon. I got a lot of this:

#Unix-From line changed: 175609
From the wire service copy:
#######Unix-From line changed: 176324
From the MM press release:
##########################Unix-From line changed: 178901
From a designers view I think FW is the most powerful tool. I designed
######Unix-From line changed: 179571
From my web site:
Unix-From line changed: 179573
From my experience, there is no specific palette grouping that causes Pal to

(I had used the "-s 100" option to output a # every hundred lines.)

Every case cleanarch flagged was ordinary running text inside a message,
not a message boundary. Then I went and looked at the actual output, and
saw that cleanarch had prepended a ">" to the lines that were part of
running text, so I renamed the files so that the output from cleanarch
became the live mbox file and ran arch again.

I think it may have made things worse: it looks like the same messages
that were there before still ended up in the January archive. They still
have date tags based on the time I first ran arch on the new machine
yesterday afternoon. These dates are not found in the mbox file.

Looking at the messages in the January archive, there appear to be only
about 25 of them, which is not really a huge task to go back and repair
manually. The question then becomes: what do I need to do to the mbox
file so that arch will know where to actually break things, and do I need
to do anything special to make sure that the messed-up archive elements
are no longer present?

Van

-- 
----------------------------------------------------------
Sign up now for Quotes of the Day, a handful of quotations on a theme
delivered every morning. Enlightenment! Daily, for free!
mailto:[EMAIL PROTECTED]

For photography, web design, hosting, and maintenance,
visit Van's home page: http://www.domainvanhorn.com/van/
-----------------------------------------------------------
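
For anyone following the thread, here is a minimal sketch of the kind of
check being discussed: finding body lines that begin with "From " but are
not real mbox separators, and escaping them with ">". This is not how
cleanarch is actually implemented. The separator pattern and the function
name escape_body_from_lines are assumptions made for illustration, and a
real mbox may need a looser or stricter pattern.

```python
import re

# Assumed shape of a genuine mbox separator line:
#   "From address Dow Mon DD HH:MM:SS YYYY"
# This pattern is illustrative only; it is not the one cleanarch uses.
SEPARATOR = re.compile(
    r"^From \S+\s+"
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"[ \d]\d \d\d:\d\d:\d\d \d{4}\s*$"
)


def escape_body_from_lines(lines):
    """Escape "From " lines that do not look like mbox separators.

    Takes an iterable of lines (with their line endings) and returns a
    tuple: the list of possibly modified lines, and the 1-based line
    numbers that were changed.
    """
    fixed = []
    changed = []
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("From ") and not SEPARATOR.match(line):
            # Running text that happens to start with "From ": escape it
            # so it is not taken for the start of a new message.
            fixed.append(">" + line)
            changed.append(lineno)
        else:
            fixed.append(line)
    return fixed, changed
```

Run over the lines of a copy of the mbox (never the live file), the
returned line numbers can be compared against what cleanarch reports
before anything is written back.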