Re: [Gossip] tidying up mbox files

2006-08-13 Thread Jeff Breidenbach

Thanks!

Also, one of the people with slightly-broken mbox files suggested this:

perl -i -p -e '/^From / && !/\d\d:\d\d:\d\d \d\d\d\d$/ && s/(.+)/>$1/' A_*
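
For anyone who finds the one-liner opaque: as far as I can tell, the intent is that a line starting with "From " counts as a real message separator only if it ends in a time and a four-digit year, and any other "From " line is body text that gets a ">" put in front of it. Here is a rough Python sketch of the same idea. The function name and the sample addresses are made up for illustration, and the "ends in hh:mm:ss yyyy" test is an assumption about what the separators look like, not something the mbox format guarantees.

```python
import re

# Assumption: a genuine separator ends in a time and a four-digit year,
# e.g. "From jeff@example.org Sat Aug 12 10:15:00 2006".
SEPARATOR_TAIL = re.compile(r"\d\d:\d\d:\d\d \d\d\d\d$")


def escape_from_lines(lines):
    """Yield lines, quoting stray body 'From ' lines as '>From '."""
    for line in lines:
        text = line.rstrip("\r\n")  # ignore the line ending when matching
        if text.startswith("From ") and not SEPARATOR_TAIL.search(text):
            yield ">" + line  # body line: quote it
        else:
            yield line  # separator or ordinary line: leave untouched


# Hypothetical sample: one separator, one header, one unescaped body line.
sample = [
    "From jeff@example.org Sat Aug 12 10:15:00 2006\n",
    "Subject: hello\n",
    "\n",
    "From here on, the body text is unescaped.\n",
]
cleaned = list(escape_from_lines(sample))
```

Note the heuristic cuts both ways: a body line that happens to end in a time and a year would be left alone, and a separator written in some other date layout would get quoted by mistake, so it is worth spot-checking the output before feeding it to inc.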

I'm continuously amazed at both Perl, and the people whose brains are
capable of understanding it. :)

-Jeff

___
Discussion list for The Mail Archive
Gossip@jab.org
http://jab.org/cgi-bin/mailman/listinfo/gossip


[Gossip] tidying up mbox files

2006-08-12 Thread Jeff Breidenbach

Hi all,

When someone wants to import a bunch of messages into an archive,
they provide an mbox file. The mbox file format is simple, but has at
least one gotcha.

  In order to avoid misinterpretation of lines in message bodies which
  begin with the four characters "From", followed by a space character,
  the mail delivery agent must quote any occurrence of "From " at the
  start of a body line.

The majority of mbox files I've been handed do not escape From like
they should, and this causes problems on M-A's end; inc from the nmh
suite gets unhappy and starts trashing messages. Are there any
recommendations for an mbox2mbox converter that will clean up
these wayward almost-but-not-quite-mbox files?

Jeff
