Re: Automatically extracting AOL scomp attachments

2013-12-31 Thread Kris Deugau
Joe Quinn wrote:
> Magnificent! Thanks for the quick reply.
> 
> I will try this out when I get a chance. Do I have permission to copy
> your code below, with attribution of course?

Sure.  Consider this fragment public domain.

-kgd


> On 12/31/2013 10:57 AM, Kris Deugau wrote:
>> Joe Quinn wrote:
>>> We semi-frequently get notified of spam in the form of AOL's notorious
>>> abuse reports. The actual spam is an attachment of mime type
>>> message/rfc822, which we have to extract by hand to make them easier to
>>> organize. We would like to have a tool that operates on all of these
>>> messages in one keystroke.
>>>
>>> Is there a good way to automatically do this, pulling out every rfc822
>>> attachment from an mbox and writing it back into the same mbox, or will
>>> we have to resort to writing some perl?
>>>
>>> This macro from the mutt wiki appears to do what we want, albeit on just
>>> an individual attachment -
>>> http://dev.mutt.org/trac/wiki/MuttFaq/Attachment
>>>  macro attach E
>>> /home/insert_username_here/aol_mbox
>>>
>>> I have not been able to find any documentation on how to map over
>>> attachments by mime type, nor how to map over just the sc...@aol.com
>>> messages.
>>>
>>> Any ideas?
>> I set up a mail delivery handler for the junkm...@vianet.ca role account
>> here that does just this.  Our webmail systems have "Report as spam"
>> buttons that forward a message as an attachment, so automatically
>> extracting those attached messages and filing them in a separate folder
>> as they come in really helps.  We've also managed to train a few users
>> to Do The Right Thing from their desktop mail clients, although
>> Outlook's output is sometimes *still* missing most of the headers.
>>
>> I can't post the entire thing (there's a *lot* of extra noise left over
>> from the primary handler I adapted), but here's the key part:
>>
>> 
>>
>> ## wy up near the top of the script:
>>use MIME::Parser;
>>use MIME::Base64;
>>
>> ## begin new MIME::Parser deconstruction
>>  my $parser = new MIME::Parser;
>> # apparently not in MIME::Parser 5.420, but it's in 5.427
>> #$parser->tmp_dir("$tmpdir/$time.$$.working");
>> # so we do this instead (eww)
>>  umask 0077;
>>  $parser->output_under($tmpdir);
>>
>> # set some useful options
>>  $parser->extract_nested_messages(0);
>>  $parser->decode_bodies(0);
>>
>>  # sigh.  be nice if the filenames could be arbitrarily *set* for
>>  # the temp working copies...  since not, we have to manually fiddle
>>  # the data back and forth.
>>  my $time = time;
>>  my $partfile = "$tmpdir/$time.$$.part";
>>
>>  my $entity = $parser->parse_open($tmp_file);
>>  foreach my $subent ($entity->parts()) {
>>my $type = $subent->mime_type;
>>if ($type =~ m|message/rfc822|i) {
>>  # hack pthui.  some idiot MUAs (or antispam plugins, more
>> likely) have started to
>>  # base-64 the attached message, in its entirety.  Which results
>> in the extracted
>>  # message being just a blob of base64, unreadable by the mail
>> server or client.
>>  # This is a major nuisance.
>>  my $b64 = (
>>
>> $subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'}
>> ?
>>
>> ${$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'}[0]}
>>
>> =~ /: base64/
>>  : 0);
>>  if (my $fh = $subent->open("r")) {
>>open(FH, ">$partfile");
>>while (defined(my $line = $fh->getline)) {
>>  if ($b64) {
>>print FH decode_base64($line);
>>  } else {
>>print FH $line;
>>  }
>>}
>>close(FH);
>>$fh->close;
>>mail_deliver("$settings{HD}/Maildir", 0660,
>> $settings{QT}{UID}, $settings{QT}{UID}, ".reported", $partfile);
>>  }
>>}
>>  }
>> # don't need to do this, mail_deliver does it for us...
>> #unlink $tmp;
>>
>>  mail_deliver("$settings{HD}/Maildir", '0600', $settings{QT}{UID},
>> $settings{QT}{UID}, 'INBOX', $tmp_file);
>>
>>  # clean up after the MIME::Parser object
>>  $parser->filer->purge;
>>
>> ## done new MIME::Parser deconstruction
>>
>> 
>>
>> $tmp_file is a temporary working file that stores the message, it
>> shouldn't take much to tweak this into working with STDIN/STDOUT
>> (perhaps as called by a :0fc procmail recipe?).
>>
>> mail_deliver is a sub that calls a custom local binary that writes a
>> message to a specified folder in a particular user's Maildir;  replace
>> as suitable for your system.
>>
>> This also handles the case of users who forward a group of messages from
>> their mail client.
>>
>> -kgd
> 



Re: Automatically extracting AOL scomp attachments

2013-12-31 Thread Joe Quinn

Magnificent! Thanks for the quick reply.

I will try this out when I get a chance. Do I have permission to copy 
your code below, with attribution of course?


On 12/31/2013 10:57 AM, Kris Deugau wrote:

Joe Quinn wrote:

We semi-frequently get notified of spam in the form of AOL's notorious
abuse reports. The actual spam is an attachment of mime type
message/rfc822, which we have to extract by hand to make them easier to
organize. We would like to have a tool that operates on all of these
messages in one keystroke.

Is there a good way to automatically do this, pulling out every rfc822
attachment from an mbox and writing it back into the same mbox, or will
we have to resort to writing some perl?

This macro from the mutt wiki appears to do what we want, albeit on just
an individual attachment - http://dev.mutt.org/trac/wiki/MuttFaq/Attachment
 macro attach E
/home/insert_username_here/aol_mbox

I have not been able to find any documentation on how to map over
attachments by mime type, nor how to map over just the sc...@aol.com
messages.

Any ideas?

I set up a mail delivery handler for the junkm...@vianet.ca role account
here that does just this.  Our webmail systems have "Report as spam"
buttons that forward a message as an attachment, so automatically
extracting those attached messages and filing them in a separate folder
as they come in really helps.  We've also managed to train a few users
to Do The Right Thing from their desktop mail clients, although
Outlook's output is sometimes *still* missing most of the headers.

I can't post the entire thing (there's a *lot* of extra noise left over
from the primary handler I adapted), but here's the key part:



## wy up near the top of the script:
   use MIME::Parser;
   use MIME::Base64;

## begin new MIME::Parser deconstruction
 my $parser = new MIME::Parser;
# apparently not in MIME::Parser 5.420, but it's in 5.427
#$parser->tmp_dir("$tmpdir/$time.$$.working");
# so we do this instead (eww)
 umask 0077;
 $parser->output_under($tmpdir);

# set some useful options
 $parser->extract_nested_messages(0);
 $parser->decode_bodies(0);

 # sigh.  be nice if the filenames could be arbitrarily *set* for
 # the temp working copies...  since not, we have to manually fiddle
 # the data back and forth.
 my $time = time;
 my $partfile = "$tmpdir/$time.$$.part";

 my $entity = $parser->parse_open($tmp_file);
 foreach my $subent ($entity->parts()) {
   my $type = $subent->mime_type;
   if ($type =~ m|message/rfc822|i) {
 # hack pthui.  some idiot MUAs (or antispam plugins, more
likely) have started to
 # base-64 the attached message, in its entirety.  Which results
in the extracted
 # message being just a blob of base64, unreadable by the mail
server or client.
 # This is a major nuisance.
 my $b64 = (

$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'} ?

${$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'}[0]}
=~ /: base64/
 : 0);
 if (my $fh = $subent->open("r")) {
   open(FH, ">$partfile");
   while (defined(my $line = $fh->getline)) {
 if ($b64) {
   print FH decode_base64($line);
 } else {
   print FH $line;
 }
   }
   close(FH);
   $fh->close;
   mail_deliver("$settings{HD}/Maildir", 0660,
$settings{QT}{UID}, $settings{QT}{UID}, ".reported", $partfile);
 }
   }
 }
# don't need to do this, mail_deliver does it for us...
#unlink $tmp;

 mail_deliver("$settings{HD}/Maildir", '0600', $settings{QT}{UID},
$settings{QT}{UID}, 'INBOX', $tmp_file);

 # clean up after the MIME::Parser object
 $parser->filer->purge;

## done new MIME::Parser deconstruction



$tmp_file is a temporary working file that stores the message, it
shouldn't take much to tweak this into working with STDIN/STDOUT
(perhaps as called by a :0fc procmail recipe?).

mail_deliver is a sub that calls a custom local binary that writes a
message to a specified folder in a particular user's Maildir;  replace
as suitable for your system.

This also handles the case of users who forward a group of messages from
their mail client.

-kgd




Re: Automatically extracting AOL scomp attachments

2013-12-31 Thread Kris Deugau
Joe Quinn wrote:
> We semi-frequently get notified of spam in the form of AOL's notorious
> abuse reports. The actual spam is an attachment of mime type
> message/rfc822, which we have to extract by hand to make them easier to
> organize. We would like to have a tool that operates on all of these
> messages in one keystroke.
> 
> Is there a good way to automatically do this, pulling out every rfc822
> attachment from an mbox and writing it back into the same mbox, or will
> we have to resort to writing some perl?
> 
> This macro from the mutt wiki appears to do what we want, albeit on just
> an individual attachment - http://dev.mutt.org/trac/wiki/MuttFaq/Attachment
> macro attach E
> /home/insert_username_here/aol_mbox
> 
> I have not been able to find any documentation on how to map over
> attachments by mime type, nor how to map over just the sc...@aol.com
> messages.
> 
> Any ideas?

I set up a mail delivery handler for the junkm...@vianet.ca role account
here that does just this.  Our webmail systems have "Report as spam"
buttons that forward a message as an attachment, so automatically
extracting those attached messages and filing them in a separate folder
as they come in really helps.  We've also managed to train a few users
to Do The Right Thing from their desktop mail clients, although
Outlook's output is sometimes *still* missing most of the headers.

I can't post the entire thing (there's a *lot* of extra noise left over
from the primary handler I adapted), but here's the key part:



## wy up near the top of the script:
  use MIME::Parser;
  use MIME::Base64;

## begin new MIME::Parser deconstruction
my $parser = new MIME::Parser;
# apparently not in MIME::Parser 5.420, but it's in 5.427
#$parser->tmp_dir("$tmpdir/$time.$$.working");
# so we do this instead (eww)
umask 0077;
$parser->output_under($tmpdir);

# set some useful options
$parser->extract_nested_messages(0);
$parser->decode_bodies(0);

# sigh.  be nice if the filenames could be arbitrarily *set* for
# the temp working copies...  since not, we have to manually fiddle
# the data back and forth.
my $time = time;
my $partfile = "$tmpdir/$time.$$.part";

my $entity = $parser->parse_open($tmp_file);
foreach my $subent ($entity->parts()) {
  my $type = $subent->mime_type;
  if ($type =~ m|message/rfc822|i) {
# hack pthui.  some idiot MUAs (or antispam plugins, more
likely) have started to
# base-64 the attached message, in its entirety.  Which results
in the extracted
# message being just a blob of base64, unreadable by the mail
server or client.
# This is a major nuisance.
my $b64 = (

$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'} ?

${$subent->{mail_inet_head}->{mail_hdr_hash}->{'Content-Transfer-Encoding'}[0]}
=~ /: base64/
: 0);
if (my $fh = $subent->open("r")) {
  open(FH, ">$partfile");
  while (defined(my $line = $fh->getline)) {
if ($b64) {
  print FH decode_base64($line);
} else {
  print FH $line;
}
  }
  close(FH);
  $fh->close;
  mail_deliver("$settings{HD}/Maildir", 0660,
$settings{QT}{UID}, $settings{QT}{UID}, ".reported", $partfile);
}
  }
}
# don't need to do this, mail_deliver does it for us...
#unlink $tmp;

mail_deliver("$settings{HD}/Maildir", '0600', $settings{QT}{UID},
$settings{QT}{UID}, 'INBOX', $tmp_file);

# clean up after the MIME::Parser object
$parser->filer->purge;

## done new MIME::Parser deconstruction



$tmp_file is a temporary working file that stores the message, it
shouldn't take much to tweak this into working with STDIN/STDOUT
(perhaps as called by a :0fc procmail recipe?).

mail_deliver is a sub that calls a custom local binary that writes a
message to a specified folder in a particular user's Maildir;  replace
as suitable for your system.

This also handles the case of users who forward a group of messages from
their mail client.

-kgd


Automatically extracting AOL scomp attachments

2013-12-31 Thread Joe Quinn
We semi-frequently get notified of spam in the form of AOL's notorious 
abuse reports. The actual spam is an attachment of mime type 
message/rfc822, which we have to extract by hand to make them easier to 
organize. We would like to have a tool that operates on all of these 
messages in one keystroke.


Is there a good way to automatically do this, pulling out every rfc822 
attachment from an mbox and writing it back into the same mbox, or will 
we have to resort to writing some perl?


This macro from the mutt wiki appears to do what we want, albeit on just 
an individual attachment - http://dev.mutt.org/trac/wiki/MuttFaq/Attachment
macro attach E 
/home/insert_username_here/aol_mbox


I have not been able to find any documentation on how to map over 
attachments by mime type, nor how to map over just the sc...@aol.com 
messages.


Any ideas?

Regards,
JMQ