Package: mhonarc
Version: 2.6.19-2
Severity: normal
Tags: patch upstream

Dear Maintainer,

Consider the attached UUCP-style mbox, where each message indicates the
byte length of its body with a Content-Length: header. The ‘-conlen’ [0]
flag is meant to make MHonArc read the correct body length and remove the
need for unescaping lines starting with “From ”. This usually works
well [1], but the Content-Length: headers are ignored for excluded
messages, so bogus archives are generated when the body of an excluded
message contains a line matching the mbox separator.

I'd expect `mhonarc -conlen -expireage 86400 -checknoarchive` to create an
archive containing only the first and last messages: the 2nd has a
Message-Id collision with the 1st, the 3rd has ‘X-No-Archive’ set, and the
4th is too old. However:

    $ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
    This is MHonArc v2.6.19+, Perl 5.030003 linux
    Converting messages to /tmp/out
    Reading - ...
    Warning: Could not parse date for message
             Message-Id: <b28150dd2dc7bdd1b2bac3ff09a4a...@no-id-found.mhonarc.org>
             Date:
    .....
    Writing mail ...
    Writing /tmp/out/maillist.html ...
    Writing /tmp/out/threads.html ...
    Writing database ...
    3 new messages
    3 total messages

    $ grep X-Subject: /tmp/out/msg*.html
    /tmp/out/msg00000.html:<!--X-Subject: foo -->
    /tmp/out/msg00001.html:<!--X-Subject: -->
    /tmp/out/msg00002.html:<!--X-Subject: baz -->

AFAIK this is because read_mail_header() doesn't return the headers for
excluded messages, so read_mail_body() has no Content-Length value telling
it how many bytes to skip, and thus chokes on the “From ” in the message
body.

After applying the attached patch the bogus message is no longer generated
and the archive is what one would expect:

    $ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
    This is MHonArc v2.6.19+, Perl 5.030003 linux
    Converting messages to /tmp/out
    Reading - .....
    Writing mail ..
    Writing /tmp/out/maillist.html ...
    Writing /tmp/out/threads.html ...
    Writing database ...
    2 new messages
    2 total messages

    $ grep X-Subject: /tmp/out/msg*.html
    /tmp/out/msg00000.html:<!--X-Subject: foo -->
    /tmp/out/msg00001.html:<!--X-Subject: baz -->

(‘-reconvert’ is a workaround for the Message-Id collision, however it's
not always ideal on open lists as it allows an attacker to DoS previous
messages to a list.)

Cheers,
-- 
Guilhem.

[0] https://www.mhonarc.org/MHonArc/doc/resources/conlen.html
[1] But see also https://bugs.debian.org/970209

test.mbox
Description: application/mbox
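
The attachment itself is not reproduced in this archive. As a rough idea of the kind of input that triggers the problem, here is a minimal sketch that builds a comparable mbox. Everything in it is hypothetical (addresses, Message-Ids, subjects, and the append_msg helper); it is not the original test.mbox. It writes two messages with computed Content-Length: headers, where the first message is marked X-No-Archive and its body contains a line starting with "From ". It then shows that counting separator-like lines gives one more "message" than actually exists.

```shell
#!/bin/sh
# Hypothetical stand-in for test.mbox; NOT the file attached to the report.
set -eu

mbox=$(mktemp)
trap 'rm -f "$mbox"' EXIT

# append_msg ID SUBJECT EXTRA_HEADER BODY
# Appends one message whose Content-Length covers the body plus its final newline.
append_msg() {
    id=$1 subject=$2 extra=$3 body=$4
    # ${#body} counts characters; that equals bytes for these ASCII-only bodies.
    len=$(( ${#body} + 1 ))
    {
        printf 'From sender@example.org Thu Jan  1 00:00:00 1970\n'
        printf 'Message-Id: <%s@example.org>\n' "$id"
        printf 'Subject: %s\n' "$subject"
        [ -z "$extra" ] || printf '%s\n' "$extra"
        printf 'Content-Length: %d\n\n' "$len"
        printf '%s\n\n' "$body"
    } >>"$mbox"
}

# Excluded message: its body contains an unescaped line that looks like a separator.
append_msg id1 excluded 'X-No-Archive: yes' 'first line
From nobody Thu Jan  1 00:00:00 1970
last line'

# Ordinary message that should end up in the archive.
append_msg id2 kept '' 'plain body'

naive=$(grep -c '^From ' "$mbox")              # what separator matching alone sees
actual=$(grep -c '^Content-Length: ' "$mbox")  # one header per real message

echo "separator-like lines: $naive"   # prints 3
echo "real messages:        $actual"  # prints 2
```

Feeding such a file to the `mhonarc -conlen -checknoarchive` invocation above should exercise the same code path, though that has not been verified against this sketch.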
--- mhonarc-2.6.19/lib/mhamain.pl
+++ mhonarc-2.6.19/lib/mhamain.pl
@@ -788,14 +788,14 @@
                grep { /no-external-archive/i } @{$fields->{'restrict'}}) ||
               (defined($fields->{'x-no-archive'}) &&
                grep { /yes/i } @{$fields->{'x-no-archive'}})) ) {
-        return undef;
+        return (undef, $fields);
     }
 
     ##----------------------------------##
     ## Check for user-defined exclusion ##
     ##----------------------------------##
     if ($MsgExcFilter) {
-        return undef if mhonarc::message_exclude($header);
+        return (undef, $fields) if mhonarc::message_exclude($header);
     }
 
     ##------------##
@@ -833,7 +833,7 @@
             delmsg($index);
             $index = undef;
         } else {
-            return undef;
+            return (undef, $fields);
         }
     }
 
@@ -879,7 +879,7 @@
 
     ## Return if message too old to add (note, $index just contains time).
     if (&expired_time($index)) {
-        return undef;
+        return (undef, $fields);
     }
 
     ##-------------##
@@ -950,7 +950,7 @@
 
     ## Invoke callback if defined
     if (defined($CBMessageHeadRead) && defined(&$CBMessageHeadRead)) {
-        return undef unless &$CBMessageHeadRead($fields, $header);
+        return (undef, $fields) unless &$CBMessageHeadRead($fields, $header);
     }
 
     $Time{$index} = $t;