Package: mhonarc
Version: 2.6.19-2
Severity: normal
Tags: patch upstream

Dear Maintainer,

Consider the attached UUCP-style mbox, where for each message the
byte-length of its body is indicated with a Content-Length: header.  The
‘-conlen’ [0] flag is meant to make MHonArc read the correct body length
and remove the need for unescaping lines starting with “From ”.  This
usually works well [1], but the Content-Length: header is ignored for
excluded messages, so a bogus archive is generated when the body of an
excluded message contains a line matching the mbox separator.
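
To make the difference concrete, here is a minimal Python sketch (not
MHonArc's code; the function names and the two-message sample mbox are
made up for illustration).  It compares splitting on every “From ” line
with splitting by the declared Content-Length:

```python
import re

def split_naive(data: bytes) -> list[bytes]:
    """Treat every line starting with 'From ' as a message separator."""
    starts = [m.start() for m in re.finditer(rb"(?m)^From ", data)]
    return [data[a:b] for a, b in zip(starts, starts[1:] + [len(data)])]

def split_conlen(data: bytes) -> list[bytes]:
    """Use Content-Length to find the end of each body when it is present."""
    msgs, pos = [], 0
    while pos < len(data):
        hdr_end = data.index(b"\n\n", pos) + 2      # header ends at first blank line
        header = data[pos:hdr_end]
        m = re.search(rb"(?im)^Content-Length:\s*(\d+)", header)
        if m:
            end = hdr_end + int(m.group(1))         # trust the declared byte length
        else:
            nxt = re.compile(rb"(?m)^From ").search(data, hdr_end)
            end = nxt.start() if nxt else len(data)  # fall back to the separator
        msgs.append(data[pos:end])
        pos = end
        while data[pos:pos + 1] == b"\n":           # skip blank lines between messages
            pos += 1
    return msgs

# Hypothetical sample: the first body contains an unescaped "From " line.
body = b"hello\nFrom the start, this line looks like a separator\n"
mbox = (
    b"From alice@example.org Thu Jan  1 00:00:00 2020\n"
    b"Subject: foo\n"
    b"Content-Length: " + str(len(body)).encode() + b"\n"
    b"\n" + body + b"\n"
    b"From bob@example.org Thu Jan  1 00:00:01 2020\n"
    b"Subject: baz\n"
    b"Content-Length: 4\n"
    b"\n"
    b"bye\n"
)

print(len(split_naive(mbox)))   # 3: the body line is mistaken for a separator
print(len(split_conlen(mbox)))  # 2
```

The fallback branch in split_conlen() is the interesting part: whenever
the length is unavailable, the reader is back to guessing from “From ”
lines.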

I'd expect `mhonarc -conlen -expireage 86400 -checknoarchive` to create
an archive containing only the first and last messages: the 2nd has a
Message-Id collision with the 1st; the 3rd has ‘X-No-Archive’ set, and
the 4th is too old.  However:

        $ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
        This is MHonArc v2.6.19+, Perl 5.030003 linux
        Converting messages to /tmp/out
        Reading - ...
        Warning: Could not parse date for message
                 Message-Id: <b28150dd2dc7bdd1b2bac3ff09a4a...@no-id-found.mhonarc.org>
                 Date: 
        .....

        Writing mail ...
        Writing /tmp/out/maillist.html ...
        Writing /tmp/out/threads.html ...
        Writing database ...
        3 new messages
        3 total messages

        $ grep X-Subject: /tmp/out/msg*.html
        /tmp/out/msg00000.html:<!--X-Subject: foo -->
        /tmp/out/msg00001.html:<!--X-Subject:  -->
        /tmp/out/msg00002.html:<!--X-Subject: baz -->

AFAIK this is because read_mail_header() doesn't return the headers for
excluded messages, so read_mail_body() has no Content-Length value
telling it how many bytes to skip, and thus chokes on the “From ” line
in the message body.  After applying the attached patch, the bogus
message is no longer generated and the archive is what one would expect:

        $ mhonarc -conlen -expireage 86400 -checknoarchive -outdir /tmp/out - </tmp/test.mbox
        This is MHonArc v2.6.19+, Perl 5.030003 linux
        Converting messages to /tmp/out
        Reading - .....

        Writing mail ..
        Writing /tmp/out/maillist.html ...
        Writing /tmp/out/threads.html ...
        Writing database ...
        2 new messages
        2 total messages

        $ grep X-Subject: /tmp/out/msg*.html
        /tmp/out/msg00000.html:<!--X-Subject: foo -->
        /tmp/out/msg00001.html:<!--X-Subject: baz -->
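
The failure mode and the effect of the fix can be modelled with a small
Python sketch.  This is a simplified stand-in, not MHonArc's code:
read_header(), read_body() and archive() are hypothetical names, the
three-message mbox is made up, and only the X-No-Archive exclusion is
modelled.  The flag keep_fields_when_excluded plays the role of the
patch, i.e. whether the body reader still sees the parsed header fields
of an excluded message:

```python
import io

def read_header(fh):
    """Consume header lines up to the blank line; return lower-cased fields."""
    fields = {}
    for raw in iter(fh.readline, b""):
        line = raw.rstrip(b"\n")
        if not line:
            break
        name, sep, value = line.partition(b":")
        if sep:
            fields[name.strip().lower().decode()] = value.strip().decode()
    return fields

def read_body(fh, fields):
    """Skip the body by Content-Length if known, else up to the next 'From ' line."""
    if fields and "content-length" in fields:
        return fh.read(int(fields["content-length"]))
    chunks = []
    while True:
        pos = fh.tell()
        line = fh.readline()
        if not line:
            break
        if line.startswith(b"From "):
            fh.seek(pos)            # leave the separator for the caller
            break
        chunks.append(line)
    return b"".join(chunks)

def archive(data, keep_fields_when_excluded):
    """Return the subjects of the messages that end up in the archive."""
    fh, subjects = io.BytesIO(data), []
    for line in iter(fh.readline, b""):
        if not line.startswith(b"From "):
            continue                # blank line between messages
        fields = read_header(fh)
        if fields.get("x-no-archive", "").lower() == "yes":
            # Excluded message: the body still has to be skipped correctly.
            read_body(fh, fields if keep_fields_when_excluded else None)
            continue
        read_body(fh, fields)
        subjects.append(fields.get("subject", ""))
    return subjects

def message(sender, headers, body):
    """Build one mbox entry with an accurate Content-Length (illustrative helper)."""
    head = "".join(f"{k}: {v}\n" for k, v in headers.items())
    return (f"From {sender} Thu Jan  1 00:00:00 2020\n{head}"
            f"Content-Length: {len(body)}\n\n").encode() + body + b"\n"

MBOX = (
    message("a@example.org", {"Subject": "foo"}, b"first\n")
    + message("b@example.org", {"Subject": "bar", "X-No-Archive": "yes"},
              b"hidden\nFrom here on this looks like a separator\nmore text\n")
    + message("c@example.org", {"Subject": "baz"}, b"last\n")
)

print(archive(MBOX, keep_fields_when_excluded=False))  # ['foo', '', 'baz']
print(archive(MBOX, keep_fields_when_excluded=True))   # ['foo', 'baz']
```

Without the fields, the remainder of the excluded body is picked up as a
separate message with an empty subject, which mirrors the bogus
msg00001.html in the first transcript.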

(‘-reconvert’ is a workaround for the Message-Id collision; however, it's
not always ideal on open lists, as it allows an attacker to DoS previous
messages to the list.)

Cheers,
-- 
Guilhem.

[0] https://www.mhonarc.org/MHonArc/doc/resources/conlen.html
[1] But see also https://bugs.debian.org/970209

Attachment: test.mbox
Description: application/mbox

--- mhonarc-2.6.19/lib/mhamain.pl
+++ mhonarc-2.6.19/lib/mhamain.pl
@@ -788,14 +788,14 @@
 	  grep { /no-external-archive/i } @{$fields->{'restrict'}}) ||
 	  (defined($fields->{'x-no-archive'}) &&
 	   grep { /yes/i } @{$fields->{'x-no-archive'}})) ) {
-	return undef;
+	return (undef, $fields);
     }
 
     ##----------------------------------##
     ## Check for user-defined exclusion ##
     ##----------------------------------##
     if ($MsgExcFilter) {
-	return undef  if mhonarc::message_exclude($header);
+	return (undef, $fields) if mhonarc::message_exclude($header);
     }
 
     ##------------##
@@ -833,7 +833,7 @@
 	    delmsg($index);
 	    $index = undef;
 	} else {
-	    return undef;
+	    return (undef, $fields);
 	}
     }
 
@@ -879,7 +879,7 @@
 
     ## Return if message too old to add (note, $index just contains time).
     if (&expired_time($index)) {
-	return undef;
+	return (undef, $fields);
     }
 
     ##-------------##
@@ -950,7 +950,7 @@
 
     ## Invoke callback if defined
     if (defined($CBMessageHeadRead) && defined(&$CBMessageHeadRead)) {
-	return undef  unless &$CBMessageHeadRead($fields, $header);
+	return (undef, $fields) unless &$CBMessageHeadRead($fields, $header);
     }
 
     $Time{$index} = $t;
