Joe Quinn wrote:
> We semi-frequently get notified of spam in the form of AOL's notorious
> abuse reports. The actual spam is an attachment of mime type
> message/rfc822, which we have to extract by hand to make them easier to
> organize. We would like to have a tool that operates on all of these
> messages in one keystroke.
> 
> Is there a good way to automatically do this, pulling out every rfc822
> attachment from an mbox and writing it back into the same mbox, or will
> we have to resort to writing some perl?
> 
> This macro from the mutt wiki appears to do what we want, albeit on just
> an individual attachment - http://dev.mutt.org/trac/wiki/MuttFaq/Attachment
>     macro attach E
> <save-entry><kill-line>/home/insert_username_here/aol_mbox<enter>
> 
> I have not been able to find any documentation on how to map over
> attachments by mime type, nor how to map over just the sc...@aol.com
> messages.
> 
> Any ideas?

I set up a mail delivery handler for the junkm...@vianet.ca role account
here that does just this.  Our webmail systems have "Report as spam"
buttons that forward a message as an attachment, so automatically
extracting those attached messages and filing them in a separate folder
as they come in really helps.  We've also managed to train a few users
to Do The Right Thing from their desktop mail clients, although
Outlook's output is sometimes *still* missing most of the headers.

I can't post the entire thing (there's a *lot* of extra noise left over
from the primary handler I adapted), but here's the key part:

====

## waaaay up near the top of the script:
  use MIME::Parser;
  use MIME::Base64;

## begin new MIME::Parser deconstruction
    my $parser = new MIME::Parser;
# apparently not in MIME::Parser 5.420, but it's in 5.427
#    $parser->tmp_dir("$tmpdir/$time.$$.working");
# so we do this instead (eww)
    umask 0077;
    $parser->output_under($tmpdir);

# set some useful options
    $parser->extract_nested_messages(0);
    $parser->decode_bodies(0);

    # sigh.  be nice if the filenames could be arbitrarily *set* for
    # the temp working copies...  since not, we have to manually fiddle
    # the data back and forth.
    my $time = time;
    my $partfile = "$tmpdir/$time.$$.part";

    my $entity = $parser->parse_open($tmp_file);
    foreach my $subent ($entity->parts()) {
      my $type = $subent->mime_type;
      if ($type =~ m|message/rfc822|i) {
        # hack pthui.  some idiot MUAs (or antispam plugins, more
likely) have started to
        # base-64 the attached message, in its entirety.  Which results
in the extracted
        # message being just a blob of base64, unreadable by the mail
server or client.
        # This is a major nuisance.
        my $b64 = (

$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'} ?

${$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'}[0]}
=~ /: base64/
                : 0);
        if (my $fh = $subent->open("r")) {
          open(FH, ">$partfile");
          while (defined(my $line = $fh->getline)) {
            if ($b64) {
              print FH decode_base64($line);
            } else {
              print FH $line;
            }
          }
          close(FH);
          $fh->close;
          mail_deliver("$settings{HD}/Maildir", 0660,
$settings{QT}{UID}, $settings{QT}{UID}, ".reported", $partfile);
        }
      }
    }
# don't need to do this, mail_deliver does it for us...
#    unlink $tmp;

    mail_deliver("$settings{HD}/Maildir", '0600', $settings{QT}{UID},
$settings{QT}{UID}, 'INBOX', $tmp_file);

    # clean up after the MIME::Parser object
    $parser->filer->purge;

## done new MIME::Parser deconstruction

====

$tmp_file is a temporary working file that stores the message, it
shouldn't take much to tweak this into working with STDIN/STDOUT
(perhaps as called by a :0fc procmail recipe?).

mail_deliver is a sub that calls a custom local binary that writes a
message to a specified folder in a particular user's Maildir;  replace
as suitable for your system.

This also handles the case of users who forward a group of messages from
their mail client.

-kgd

Reply via email to