Matthew, et al --
...and then Matthew D. Fuller said...
%
% On Wed, Dec 26, 2001 at 09:22:33PM -0500 I heard the voice of
% David T-G, and lo! it spake thus:
% >
% > Thus, it should be sufficient to match on any ^From_ line as long as
% > you're working with an mbox file (which you can confirm by checking the ...
%
% Note that this can (also) break.
So I hear!
%
% I was just testing some mbox-parsing code the other day, and I needed a
% quick mbox of reasonable size to test it against. Hey, how about
% ~/mail/sent?
One would think so...
%
% But it's got bare "^From " lines in mid-message where they 'naturally'
% appeared. So, either you need a bit more smarts than just "^From ", or
% mutt doesn't write 'sent' as a true mbox.
And I trust that this all works when you open it with mutt, right? [Hey,
it never hurts to check.]
%
% The 'mbox' manpage from qmail says:
% ---
% MESSAGE FORMAT
% A message encoded in mbox format begins with a From_ line,
% continues with a series of non-From_ lines, and ends with a
% blank line. A From_ line means any line that begins with
% the characters F, r, o, m, space:
%
% [...]
% ---
%
% Which seems to imply the POV that "^From " should be a sufficient pattern
% (in which case, watch out for your sent box!)
Yes, indeed.
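To make the failure concrete, here is a minimal Python sketch that counts messages by the bare "^From " rule. The sample text and the name count_naive_messages are invented for illustration; this is not how mutt or qmail does it.

```python
# A one-message mbox whose body happens to contain a line starting "From ".
# The address and text are made up for this example.
SAMPLE = (
    "From dave@example.org Wed Dec 26 21:22:33 2001\n"
    "Subject: test\n"
    "\n"
    "Hello,\n"
    "\n"
    "From here on the naive rule goes wrong.\n"
    "\n"
)


def count_naive_messages(mbox_text):
    """Count every line that begins with 'From ' as a message separator."""
    return sum(1 for line in mbox_text.splitlines() if line.startswith("From "))


print(count_naive_messages(SAMPLE))
```

There is only one message in the sample, but the naive rule reports two, which is exactly the trap an unescaped sent box sets.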
%
% Mutt seems to use a bit more smarts. See "is_from()" in from.c for
% details.
At the very least, Philip now has a more solid regexp definition:
From [ <return-path> ] <weekday> <month> <day> <time> [ <timezone> ] <year>
would probably turn into something like
^From ([^\t\s@][^\t\s@]*@[^\t\s@][^\t\s@]*\.[^\t\s@][^\t\s@]*|) \
(Sun|Mon|Tue|Wed|Thu|Fri|Sat) \
(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) \
[\s1-3][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9] \
([A-Z][A-Z][A-Z] |) [0-9][0-9][0-9][0-9]
(yes, I've faked it with line breaks just to keep things readable; note
the two spaces at the end of the first line although it may not really
matter and [\s]* should perhaps be used instead). No, I'm not going into
MIME-encoding of the header as seen in some ^From: lines. No, this
doesn't allow for leap seconds (but *probably* all one needs is to add a
6 to the seconds regexp). No, this will break at year 10000; apparently
y2k taught me nothing :-)
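For anyone who would rather test this than eyeball it, here is a rough Python sketch of the same pattern. The names FROM_LINE and looks_like_from_line are made up for illustration, the optional groups are written with "?" rather than an empty alternative, and the hour field is written as [0-2][0-9] so that evening timestamps match. It approximates the format above; it is not a copy of what mutt's is_from() does.

```python
import re

# Illustrative translation of the From_ line pattern; not mutt's own logic.
FROM_LINE = re.compile(
    r"^From "
    r"(?:[^\s@]+@[^\s@]+\.[^\s@]+)? "    # optional return-path, then a space
    r"(?:Sun|Mon|Tue|Wed|Thu|Fri|Sat) "  # weekday
    r"(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "  # month
    r"[ 1-3][0-9] "                      # day, space-padded or two digits
    r"[0-2][0-9]:[0-5][0-9]:[0-5][0-9] " # hh:mm:ss (no leap seconds)
    r"(?:[A-Z]{3} )?"                    # optional three-letter timezone
    r"[0-9]{4}$"                         # four-digit year
)


def looks_like_from_line(line):
    """Return True if the line matches the stricter From_ separator pattern."""
    return FROM_LINE.match(line.rstrip("\n")) is not None
```

With this, a separator such as "From dave@example.org Wed Dec 26 21:22:33 2001" is accepted, while a body line such as "From the start, I disliked it." is rejected.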
%
% --
% Matthew Fuller (MF4839) | [EMAIL PROTECTED]
% Unix Systems Administrator | [EMAIL PROTECTED]
% Specializing in FreeBSD | http://www.over-yonder.net/
%
% "The only reason I'm burning my candle at both ends, is because I
% haven't figured out how to light the middle yet"
HTH & HAND & Happy Holidays to all
:-D
--
David T-G * It's easier to fight for one's principles
(play) [EMAIL PROTECTED] * than to live up to them. -- fortune cookie
(work) [EMAIL PROTECTED]
http://www.justpickone.org/davidtg/ Shpx gur Pbzzhavpngvbaf Qrprapl Npg!
