[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-07 Thread Mark Sapiro
On 3/7/21 2:15 PM, Mark Dale via Mailman-Users wrote:
> 
> Hi Mark. Your patch to Scrubber.py has solved the problem. Thank you.


And this issue is reported at
 and the broken fix
committed at

and the fix for that at
.


-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
--
Mailman-Users mailing list -- mailman-users@python.org
To unsubscribe send an email to mailman-users-le...@python.org
https://mail.python.org/mailman3/lists/mailman-users.python.org/
Mailman FAQ: http://wiki.list.org/x/AgA3
Security Policy: http://wiki.list.org/x/QIA9
Searchable Archives: https://www.mail-archive.com/mailman-users@python.org/
https://mail.python.org/archives/list/mailman-users@python.org/


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-07 Thread Mark Dale via Mailman-Users
 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Saturday, March 6, 2021, 23:27 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> The patch I posted previously was bad. This one is correct. If you patch
> Scrubber.py with this patch, you won't need to edit the mbox and rebuild
> the archive as that message won't get scrubbed.

Hi Mark. Your patch to Scrubber.py has solved the problem. Thank you.

A summary below: 



PROBLEM: Plain text emails posted automatically to a list by a Perl script 
displayed correctly in subscribers' email clients, but in the archive the 
body of each message was scrubbed as "attachment.ksh".

CAUSE: The headers of the posts to the list contained the lines: 

Content-Disposition: inline
Content-Type: text/plain 

Mark wrote: "If the script that sends this mail can be altered to either 
include the charset= on the Content-Type: text/plain header or not include the
Content-Disposition: inline header or both, that would solve this." He then 
wrote a patch for Mailman/Handlers/Scrubber.py so that such messages are no 
longer scrubbed (in the event that the script could not be modified).

=== modified file 'Mailman/Handlers/Scrubber.py'
--- Mailman/Handlers/Scrubber.py	2020-06-21 18:45:30 +0000
+++ Mailman/Handlers/Scrubber.py	2021-03-06 19:10:28 +0000
@@ -90,6 +90,9 @@
     if ctype.lower == 'application/octet-stream':
         # For this type, all[0] is '.obj'. '.bin' is better.
         return '.bin'
+    if ctype.lower == 'text/plain':
+        # For this type, all[0] is '.ksh'. '.txt' is better.
+        return '.txt'
     return all and all[0]
 
 
@@ -196,8 +199,11 @@
             format = part.get_param('format')
             delsp = part.get_param('delsp')
             # TK: if part is attached then check charset and scrub if none
-            if part.get('content-disposition') and \
-               not part.get_content_charset():
+            # MAS: Content-Disposition is not a good test for 'attached'.
+            # RFC 2183 sec. 2.10 allows Content-Disposition on the main body.
+            # Make it specifically 'attachment'.
+            if (part.get('content-disposition', '').lower() == 'attachment'
+                    and not part.get_content_charset()):
                 omask = os.umask(002)
                 try:
                     url = save_attachment(mlist, part, dir)
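The idea behind the first hunk can be tried outside Mailman. What follows is a minimal Python 3 sketch (Mailman 2.1 itself runs on Python 2) that uses only the standard mimetypes module. The function borrows the name guess_extension from the diff, but the body is written for illustration and is not Mailman's code. Note that the sketch calls ctype.lower(); a comparison against the bound method ctype.lower, without parentheses, can never be equal to a string.

```python
import mimetypes


def guess_extension(ctype, ext):
    """Pick a file extension for a MIME type, preferring readable defaults."""
    # mimetypes maps several extensions to one type, and the order of the
    # list depends on the platform's MIME tables, so entry 0 can be odd.
    all_exts = mimetypes.guess_all_extensions(ctype, strict=False)
    if ext in all_exts:
        return ext
    ctype = ctype.lower()
    if ctype == 'application/octet-stream':
        return '.bin'
    if ctype == 'text/plain':
        # On some systems the first entry is '.ksh'; '.txt' is friendlier.
        return '.txt'
    return all_exts[0] if all_exts else ''


print(guess_extension('text/plain', ''))
print(guess_extension('application/octet-stream', ''))
```

Hard-coding the two special cases keeps the result independent of whatever order the local MIME tables happen to produce.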


IN ADDITION: The following script was run to tidy up the list's .mbox file.

#!/bin/bash
PATH=/usr/sbin:/usr/bin:/sbin:/bin
LISTNAME=your_list_name_goes_here
MBOX="/var/lib/mailman/archives/private/$LISTNAME.mbox/$LISTNAME.mbox"
echo "$LISTNAME archive rebuild started at $(date +%H:%M:%S)"
# Remove the header that makes Scrubber treat the body as an attachment.
sed -i '/^Content-Disposition: inline[[:space:]]*$/d' "$MBOX"
# Add a charset only where none is given, so repeated runs are harmless.
sed -i 's/^Content-Type: text\/plain[[:space:]]*$/Content-Type: text\/plain; charset="us-ascii"/' "$MBOX"
/usr/lib/mailman/bin/arch --wipe "$LISTNAME"
echo "$LISTNAME archive rebuild completed at $(date +%H:%M:%S)"
exit 0
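
The same header clean-up can be sketched in Python 3 for anyone who prefers to test the substitutions on a string before touching an mbox. This is an illustration only: the function name tidy_headers and the sample text are made up here, and, like the sed commands, it matches lines anywhere in the text rather than parsing message headers. The patterns are anchored to whole lines so that a second run leaves already-fixed headers alone.

```python
import re

# A whole line consisting of the inline disposition header.
DISPOSITION_RE = re.compile(
    r'^Content-Disposition: inline[ \t]*\r?\n', re.MULTILINE)
# A text/plain Content-Type line that carries no parameters at all.
BARE_TEXT_PLAIN_RE = re.compile(
    r'^(Content-Type: text/plain)[ \t]*$', re.MULTILINE)


def tidy_headers(text):
    """Drop 'Content-Disposition: inline'; add a charset where none is given."""
    text = DISPOSITION_RE.sub('', text)
    return BARE_TEXT_PLAIN_RE.sub(r'\1; charset="us-ascii"', text)


sample = (
    'Content-Disposition: inline\n'
    'Content-Type: text/plain\n'
    'MIME-Version: 1.0\n'
    '\n'
    'body\n'
)
once = tidy_headers(sample)
print(once)
```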


 


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-06 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Saturday, March 6, 2021, 23:27 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> On 3/6/21 2:55 PM, Mark Dale via Mailman-Users wrote:
>>
>> If it turns out that the owner is unable to modify their Perl script, 
>> setting this shell script to run once a day may not be a problem as their 
>> list is auto-posted to at a scheduled time every day. It's a kludge of a 
>> "fix" I know, but it will limp it home.

> 
> The patch I posted previously was bad. This one is correct. If you patch
> Scrubber.py with this patch, you won't need to edit the mbox and rebuild
> the archive as that message won't get scrubbed.
> 

Ah -- when I looked (blindly) at the script earlier, I thought that it 
would only write the attachment as a .txt rather than a .ksh file. 

I've added your most recent patch. Now standing by to watch the next post get 
archived in glorious plain text.

Thank you once again Mark. 






[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-06 Thread Mark Sapiro
On 3/6/21 2:55 PM, Mark Dale via Mailman-Users wrote:
> 
> If it turns out that the owner is unable to modify their Perl script, 
> setting this shell script to run once a day may not be a problem as their 
> list is auto-posted to at a scheduled time every day. It's a kludge of a 
> "fix" I know, but it will limp it home.


The patch I posted previously was bad. This one is correct. If you patch
Scrubber.py with this patch, you won't need to edit the mbox and rebuild
the archive as that message won't get scrubbed.

-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
=== modified file 'Mailman/Handlers/Scrubber.py'
--- Mailman/Handlers/Scrubber.py	2020-06-21 18:45:30 +0000
+++ Mailman/Handlers/Scrubber.py	2021-03-06 19:10:28 +0000
@@ -90,6 +90,9 @@
     if ctype.lower == 'application/octet-stream':
         # For this type, all[0] is '.obj'. '.bin' is better.
         return '.bin'
+    if ctype.lower == 'text/plain':
+        # For this type, all[0] is '.ksh'. '.txt' is better.
+        return '.txt'
     return all and all[0]
 
 
@@ -196,8 +199,11 @@
             format = part.get_param('format')
             delsp = part.get_param('delsp')
             # TK: if part is attached then check charset and scrub if none
-            if part.get('content-disposition') and \
-               not part.get_content_charset():
+            # MAS: Content-Disposition is not a good test for 'attached'.
+            # RFC 2183 sec. 2.10 allows Content-Disposition on the main body.
+            # Make it specifically 'attachment'.
+            if (part.get('content-disposition', '').lower() == 'attachment'
+                    and not part.get_content_charset()):
                 omask = os.umask(002)
                 try:
                     url = save_attachment(mlist, part, dir)
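
The thread does not spell out what was wrong with the earlier patch. One visible difference is that the earlier version calls .lower() on the result of part.get('content-disposition') with no default, while this version supplies '' as the default. The Python 3 sketch below shows what that difference means for a message with no Content-Disposition header at all; treating this as the reason the first patch was called bad is an assumption, and the two helper functions are named here for illustration only.

```python
from email import message_from_string

with_inline = message_from_string(
    'Content-Disposition: inline\nContent-Type: text/plain\n\nbody\n')
without_header = message_from_string('Content-Type: text/plain\n\nbody\n')


def is_attachment_earlier(part):
    # No default: a missing header yields None, and None has no .lower().
    return part.get('content-disposition').lower() == 'attachment'


def is_attachment_later(part):
    # Default of '': a missing header simply compares unequal.
    return part.get('content-disposition', '').lower() == 'attachment'


print(is_attachment_later(with_inline))
print(is_attachment_later(without_header))
try:
    is_attachment_earlier(without_header)
except AttributeError as exc:
    print('earlier form fails:', exc)
```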



[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-06 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Saturday, March 6, 2021, 01:19 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> On 3/5/21 2:56 PM, Mark Dale via Mailman-Users wrote:
>>
>>
>> I'll pass it on to the list owner to modify their script and see how we
>> get on.


> 
> 
> In case the script doesn't get modified, here's a patch I plan to commit
> to Scrubber.py which should help.
> 
> 
> --


Thanks again Mark. The list owner has indicated they can modify their Perl 
script (not done as yet), so I'll put your patch in place regardless.

I ran a script (below) against the mbox file to remove "Content-Disposition: 
inline" and change "Content-Type: text/plain" to include the charset and that 
has the archived messages now displaying nicely as plain text.

If it turns out that the owner is unable to modify their Perl script, 
setting this shell script to run once a day may not be a problem as their list 
is auto-posted to at a scheduled time every day. It's a kludge of a "fix" I 
know, but it will limp it home.


#!/bin/bash
PATH=/usr/sbin:/usr/bin:/sbin:/bin
LISTNAME=redacted
MBOX="/var/lib/mailman/archives/private/$LISTNAME.mbox/$LISTNAME.mbox"
echo "$LISTNAME archive rebuild started at $(date +%H:%M:%S)"
# Remove the header that makes Scrubber treat the body as an attachment.
sed -i '/^Content-Disposition: inline[[:space:]]*$/d' "$MBOX"
# Add a charset only where none is given, so repeated runs are harmless.
sed -i 's/^Content-Type: text\/plain[[:space:]]*$/Content-Type: text\/plain; charset="us-ascii"/' "$MBOX"
/usr/lib/mailman/bin/arch --wipe "$LISTNAME"
echo "$LISTNAME archive rebuild completed at $(date +%H:%M:%S)"
exit 0


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-05 Thread Mark Sapiro
On 3/5/21 2:56 PM, Mark Dale via Mailman-Users wrote:
> 
> 
> I'll pass it on to the list owner to modify their script and see how we
> get on.
> 
> Content-Disposition: inline
> Content-Type: text/plain
> MIME-Version: 1.0
> X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)


In case the script doesn't get modified, here's a patch I plan to commit
to Scrubber.py which should help.

-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
=== modified file 'Mailman/Handlers/Scrubber.py'
--- Mailman/Handlers/Scrubber.py	2020-06-21 18:45:30 +0000
+++ Mailman/Handlers/Scrubber.py	2021-03-06 01:10:21 +0000
@@ -90,6 +90,9 @@
     if ctype.lower == 'application/octet-stream':
         # For this type, all[0] is '.obj'. '.bin' is better.
         return '.bin'
+    if ctype.lower == 'text/plain':
+        # For this type, all[0] is '.ksh'. '.txt' is better.
+        return '.txt'
     return all and all[0]
 
 
@@ -196,7 +199,10 @@
             format = part.get_param('format')
             delsp = part.get_param('delsp')
             # TK: if part is attached then check charset and scrub if none
-            if part.get('content-disposition') and \
+            # MAS: Content-Disposition is not a good test for 'attached'.
+            # RFC 2183 sec. 2.10 allows Content-Disposition on the main body.
+            # Make it specifically 'attachment'.
+            if part.get('content-disposition').lower() == 'attachment' and \
                not part.get_content_charset():
                 omask = os.umask(002)
                 try:



[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-05 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Friday, March 5, 2021, 05:08 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> On 3/4/21 8:36 PM, Mark Dale via Mailman-Users wrote:
>>
>> Reading Steve's reply just now makes me look suspiciously at the Perl
>> X-Mailer: MIME::Lite that is sending the email to the list. My
>> understanding is the list owner has scheduled a Perl script to export
>> from a database and post the resulting export.
> 
> 
> I've looked at the code more carefully, and I see there are two
> conditions for the text/plain part to be scrubbed. One is the lack of a
> charset= parameter, but the other is that the part is not the only body
> part or maybe the first part of a multipart body.
> 
> However, the way the code determines if the part is the body vs. being
> an attachment is the presence of a Content-Disposition: header. Your
> message has a Content-Disposition: inline header and while this is
> explicitly allowed by RFC 2183, it is unusual for a single part
> text/plain message.
> 
> If the Perl script that generates this message can omit that
> header, I don't think the part will be scrubbed.
> 

Ahhh -- that's great intel Mark: thank you very much!!! Your diligence
and patience are mind blowing.

I'll pass it on to the list owner to modify their script and see how we
get on.

Content-Disposition: inline
Content-Type: text/plain
MIME-Version: 1.0
X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)





[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-05 Thread Mark Sapiro
On 3/5/21 2:42 PM, Mark Dale via Mailman-Users wrote:
> 
> 
> Here is what the edited scripts/post wrote to file last night.
> 
...
> Content-Disposition: inline
> Content-Type: text/plain
...


So what's triggering the issue is the lack of a charset= on the
Content-Type: text/plain header together with the Content-Disposition:
inline header.

If the script that sends this mail can be altered to either include the
charset= on the Content-Type: text/plain header or not include the
Content-Disposition: inline header or both, that would solve this.
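
The sending script in this thread is Perl (MIME::Lite), and nothing here describes how to change it. Purely as an analogy, the Python 3 sketch below builds a single-part message that carries an explicit charset and no Content-Disposition header, which is the shape of message suggested above. The addresses and body are placeholders.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg['From'] = 'noreply@example.com'      # placeholder address
msg['To'] = 'list@example.com'           # placeholder address
msg['Subject'] = 'Signatures Published daily'
# set_content() records the charset on the Content-Type header and does
# not add a Content-Disposition header unless one is requested.
msg.set_content('New Sigs: 139\n', charset='us-ascii')

print(msg['Content-Type'])
print(msg['Content-Disposition'])
```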

I'll also work on a patch to Scrubber.py and post that when it's done.


-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-05 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Friday, March 5, 2021, 04:50 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> On 3/4/21 8:36 PM, Mark Dale via Mailman-Users wrote:
>>
>> Thanks. I've implemented your script patch (LISTNAME needed to be in
>> quotes otherwise the server spat the dummy).
> 
> Sorry about that ...
> 
>> The next scheduled post to
>> the list will be in about 10 hours so I'll have a result to look at then.
> 
> 
> Cool.
> 
> 
>> Reading Steve's reply just now makes me look suspiciously at the Perl
>> X-Mailer: MIME::Lite that is sending the email to the list. My
>> understanding is the list owner has scheduled a Perl script to export
>> from a database and post the resulting export.
>>
>> Anyways, I'll see what gets written by scripts/post in the
>> morning.
> 
> OK.
> 
> I'm just curious, but if the body is scrubbed as an attachment with a
> .txt extension instead of .ksh would that help?
> 


There is no problem opening the .ksh file but the owner of the list
would like to see the archive message display in the message body. The
Namazu text search engine that's incorporated into the list's
archive will then be of use. So having the attachment use .txt won't
really help.

Here is what the edited scripts/post wrote to file last night.

>From noreply@REDACTED  Fri Mar  5 12:03:30 2021
Return-Path: 
X-Original-To: REDACTED@lists.REDACTED
Delivered-To: REDACTED@lists.REDACTED
Received: from alln-iport-3.REDACTED (alln-iport-3.REDACTED [173.37.142.90])
    by mailmanlists.network (Postfix) with ESMTPS id EC62F2029E
    for ; Fri,  5 Mar 2021 12:03:29 +0000 (UTC)
X-IPAS-Result: =?us-ascii?q?A0DxBADeHEJg/5tdJa1iHAEBAQEBAQcBARIBAQQEAQGCD?=
 =?us-ascii?q?wKCKYIGjXalKAsBAQEPNAQBAYUEgUUCJTkFDQIDAQEBAwIDAQEBAQUBAQECA?=
 =?us-ascii?q?QYEcYVuiTOFWq07AQEBgiaJNoEugTkBix6CIyYcgguBR407GgSTKwGRGJwCh?=
 =?us-ascii?q?EqGc41OhkWDdJ94C4YurF2EHIFsIoFXcIM6TxkNVZwwIwECZwIGCgEBAwmPJ?=
 =?us-ascii?q?gEB?=
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-AV: E=Sophos;i="5.81,224,1610409600";
   d="scan'208";a="656493238"
Received: from rcdn-core-4.REDACTED ([173.37.93.155])
  by alln-iport-3.REDACTED with ESMTP/TLS/DHE-RSA-SEED-SHA; 05 Mar 2021 12:03:28 +0000
Received: from mail.vrt.REDACTED ([10.83.44.69])
    by rcdn-core-4.REDACTED (8.15.2/8.15.2) with SMTP id 125C3S2G009788
    for ; Fri, 5 Mar 2021 12:03:28 GMT
Message-Id: <202103051203.125C3S2G009788@rcdn-core-4.REDACTED>
Received: from localhost.localdomain (sigmanager.vrt.REDACTED [10.7.89.25])
    by mail.vrt.REDACTED (Postfix) with ESMTP id 44C1E463D6
    for ; Fri,  5 Mar 2021 12:03:28 +0000 (UTC)
Content-Disposition: inline
Content-Type: text/plain
MIME-Version: 1.0
X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)
Date: Fri, 5 Mar 2021 07:03:28 -0500
From: noreply@REDACTED
To: REDACTED@lists.REDACTED
Subject: Signatures Published daily - 26099
Content-Transfer-Encoding: quoted-printable
X-Outbound-SMTP-Client: 10.83.44.69, [10.83.44.69]
X-Outbound-Node: rcdn-core-4.REDACTED


REDACTED Publishing Notice

Datefile:       daily
Version:        26099
Publisher:      REDACTED
New Sigs:       139
Dropped Sigs:   0
Ignored Sigs:   75


New Detection Signatures:


* Win.Malware.Injects-9838834-0

* Win.Trojan.Generic-9838835-0

* Win.Packed.Razy-9838836-0

* Win.Packed.Razy-9838837-0




[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Stephen J. Turnbull
Executive summary: -0.8 for changing Scrubber (don't think there's a
benefit either way => don't bother), +0.5 for .txt (not sure it's
worth the effort), plus an amusing (YMMV) story.

Mark Sapiro writes:
 > On 3/4/21 7:05 PM, Stephen J. Turnbull wrote:
 > > Mark Sapiro writes:
 > 
 > > Does Scrubber really do that?  Per RFC, the two Content-Type fields
 > > have exactly the same semantics: "it is plain text, encoded as ASCII."
 > 
 > Yes, scrubber really does this. This dates back to Tokio and various
 > Japanese language emails which presumably weren't all ascii. Perhaps we
 > should revisit this, but you know much more about Japanese language
 > emails than I do.

I don't think there's a need to change nowadays.  The problems I see
now are (a) spam where the rules never did apply anyway and (b) in
application/* attachments such as zipfiles.  Common MUAs all do the
right things with text/*, and there are very few folks who bother to
write their own (or use netcat to talk to port 587 ;-) anymore.

 > Apparently, Tokio thought it was better to scrub such a part than to
 > treat it as ascii when it wasn't.

It certainly was back then.  A little earlier than that, I was at Ohio
State and they used a Prime minicomputer for email and secretary-
supported wordprocessing.  The professor who was head of the Econ Dept
computer committee got full of himself and sent me a huge warlording
.sig full of VT-220 (maybe even VT-320) control sequences for flashing
and reverse video and the like, containing the message "Beware the
Wizard!"

It was indeed very bright! and shiny!, so I sent it back to him with
compliments (something like "A sufficiently advanced technology is
indistinguishable from magic, but I have the VT-220 programming
manual").  Silly Rabbit, Trix are for kids -- he had ripped off the
fancy large screen console terminal for use in his personal office.
(I can't blame him, it was *very* nice and it's not like you need more
than a VT-52 or so for the console.)  It was *not* VT-compatible, and
apparently that warlord sig not only reprogrammed its function keys,
but proceeded to invoke them, sending a message to the host which
promptly crashed hard.  (I didn't notice the crash as I was using a PC
for real work and email is asynchronous so nothing out of the ordinary
there either.  I heard about it the next day at the faculty meeting
where "The Wizard" got a dressing-down for misappropriating department
property. :-)

I imagine that terminal lockups etc, if not full-blown system crashes,
were reasonably common with Shift JIS too.

 > Perhaps a compromise is to give scrubbed text/plain attachments a
 > .txt extension rather than taking the first item returned by
 > mimetypes.guess_all_extensions.

I'm not sure.  Nowadays I'm not so worried about harming the machine
as the possibility of malware (eg, URLs with Greek or Cyrillic
cognates of ASCII characters directing you to a phishing site when you
think you're linking to Amazon).

On the other hand, I guess if you're going to fall for that with .txt,
you'll probably think "computers are weird" and fall for it with
Content-Type: application/octet-stream; name="inline-text.ksh"
too. :-(

So, yeah, if you want to go to the trouble, I think defaulting to .txt
for text/plain with no filename specified is more user-friendly.


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Sapiro
On 3/4/21 8:36 PM, Mark Dale via Mailman-Users wrote:
> 
> Reading Steve's reply just now makes me look suspiciously at the Perl
> X-Mailer: MIME::Lite that is sending the email to the list. My
> understanding is the list owner has scheduled a Perl script to export
> from a database and post the resulting export.


I've looked at the code more carefully, and I see there are two
conditions for the text/plain part to be scrubbed. One is the lack of a
charset= parameter, but the other is that the part is not the only body
part or maybe the first part of a multipart body.

However, the way the code determines if the part is the body vs. being
an attachment is the presence of a Content-Disposition: header. Your
message has a Content-Disposition: inline header and while this is
explicitly allowed by RFC 2183, it is unusual for a single part
text/plain message.

If the Perl script that generates this message can omit that
header, I don't think the part will be scrubbed.
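
The two conditions described above can be checked against a captured message with Python 3's standard email package. This is a sketch of the test as described in this message, not a copy of Scrubber.py; the function name would_scrub and the sample text are made up for illustration.

```python
from email import message_from_string

RAW = (
    'Content-Disposition: inline\n'
    'Content-Type: text/plain\n'
    'MIME-Version: 1.0\n'
    '\n'
    'Publishing Notice\n'
)


def would_scrub(part):
    """True when a text/plain part has a disposition header but no charset."""
    has_disposition = bool(part.get('content-disposition'))
    has_charset = part.get_content_charset() is not None
    return has_disposition and not has_charset


print(would_scrub(message_from_string(RAW)))
```

Removing either trigger, the Content-Disposition header or the missing charset, makes the function return False for the same body.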

-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Sapiro
On 3/4/21 8:36 PM, Mark Dale via Mailman-Users wrote:
> 
> Thanks. I've implemented your script patch (LISTNAME needed to be in
> quotes otherwise the server spat the dummy).

Sorry about that ...

> The next scheduled post to
> the list will be in about 10 hours so I'll have a result to look at then.


Cool.


> Reading Steve's reply just now makes me look suspiciously at the Perl
> X-Mailer: MIME::Lite that is sending the email to the list. My
> understanding is the list owner has scheduled a Perl script to export
> from a database and post the resulting export.
> 
> Anyways, I'll see what gets written by scripts/post in the
> morning.

OK.

I'm just curious, but if the body is scrubbed as an attachment with a
.txt extension instead of .ksh would that help?


-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Friday, March 5, 2021, 01:29 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment


> Here's something you can try.
> 
> Create a file somewhere and give the Mailman's group write permission on
> it. Then apply the following patch to Mailman's scripts/post
> 
>> === modified file 'scripts/post'
>> --- scripts/post	2018-06-17 23:47:34 +0000
>> +++ scripts/post	2021-03-05 01:16:56 +0000
>> @@ -58,8 +58,12 @@
>>      # some MTAs have a hard limit to the time a filter prog can run.  Postfix
>>      # is a good example; if the limit is hit, the proc is SIGKILL'd giving us
>>      # no chance to save the message.
>> +    content = sys.stdin.read()
>> +    if listname == LISTNAME:
>> +        with open('PATH_TO_FILE', 'a') as fp:
>> +            fp.write(content)
>>      inq = get_switchboard(mm_cfg.INQUEUE_DIR)
>> -    inq.enqueue(sys.stdin.read(),
>> +    inq.enqueue(content,
>>                  listname=listname,
>>                  tolist=1, _plaintext=1)
>>  
>>
> 
> 
> where LISTNAME is the problem list's name and PATH_TO_FILE is the path
> to the file you created. This file will accumulate all the incoming
> messages posted to the list, exactly as they are received by Mailman.
> 
> Then we can at least know for sure what that message looks like.
> 
> Of course, once you have captured one such message, you can revert the
> patch.
> 
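
The capture step in the quoted patch boils down to reading standard input once, keeping the text in a variable, and using that variable both for the capture file and for the queue. A stand-alone Python 3 sketch of that pattern follows; capture_and_pass, the list names, and the temporary file are illustrative stand-ins, and nothing here touches Mailman's queue.

```python
import io
import os
import tempfile


def capture_and_pass(stream, listname, watched_list, capture_path):
    """Read the message once, append it to a file for one list, return it."""
    # A stream such as stdin can only be read once, so keep the content.
    content = stream.read()
    if listname == watched_list:
        with open(capture_path, 'a') as fp:
            fp.write(content)
    return content


capture_path = os.path.join(tempfile.mkdtemp(), 'captured.txt')
message = 'From: noreply@example.com\nContent-Type: text/plain\n\nbody\n'
queued = capture_and_pass(io.StringIO(message), 'mylist', 'mylist', capture_path)
ignored = capture_and_pass(io.StringIO(message), 'otherlist', 'mylist', capture_path)
with open(capture_path) as fp:
    captured = fp.read()
print(captured)
```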

Thanks. I've implemented your script patch (LISTNAME needed to be in
quotes otherwise the server spat the dummy). The next scheduled post to
the list will be in about 10 hours so I'll have a result to look at then.

Reading Steve's reply just now makes me look suspiciously at the Perl
X-Mailer: MIME::Lite that is sending the email to the list. My
understanding is the list owner has scheduled a Perl script to export
from a database and post the resulting export.

Anyways, I'll see what gets written by scripts/post in the
morning.

Cheers,
Mark


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Sapiro
On 3/4/21 7:05 PM, Stephen J. Turnbull wrote:
> Mark Sapiro writes:

> 
> Does Scrubber really do that?  Per RFC, the two Content-Type fields
> have exactly the same semantics: "it is plain text, encoded as ASCII."


Yes, scrubber really does this. This dates back to Tokio and various
Japanese language emails which presumably weren't all ascii. Perhaps we
should revisit this, but you know much more about Japanese language
emails than I do.

Apparently, Tokio thought it was better to scrub such a part than to
treat it as ascii when it wasn't.

Perhaps a compromise is to give scrubbed text/plain attachments a .txt
extension rather than taking the first item returned by
mimetypes.guess_all_extensions.

Note that this is a Mailman 2.1 issue only. In MM 3, scrubber is only
invoked for plain text digests; it removes, with a notice, all
non-text/plain elemental parts and, for text/plain, treats missing or
unknown charsets as ascii.
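
The fallback described for text/plain parts can be sketched in a few lines of Python 3. This is not Mailman 3's code; the function name decode_text and the choice of errors='replace' for undecodable bytes are assumptions made for the example.

```python
import codecs


def decode_text(payload, charset):
    """Decode bytes, treating a missing or unknown charset as ASCII."""
    try:
        codecs.lookup(charset or 'ascii')
    except LookupError:
        # Unknown charset name: fall back to ASCII.
        charset = None
    return payload.decode(charset or 'ascii', errors='replace')


print(decode_text(b'New Sigs: 139', None))
print(decode_text(b'New Sigs: 139', 'x-no-such-charset'))
```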


-- 
Mark Sapiro                           The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Stephen J. Turnbull
Mark Sapiro writes:

 > >> Content-Disposition: inline
 > >> Content-Type: text/plain; charset="us-ascii"
 > >> Content-Transfer-Encoding: 7bit
 > >>
 > >> And in the mbox file below (of the same message), I see:
 > >>
 > >> Content-Disposition: inline
 > >> Content-Type: text/plain
 > >> Content-Transfer-Encoding: quoted-printable
 > 
 > OK, that's the issue. The message in the .mbox is after various list
 > manipulations, but before scrubbing for the pipermail archive, and
 > somehow the '; charset="us-ascii"' has been lost from the Content-Type:
 > header, which is why Scrubber  scrubs it.

Does Scrubber really do that?  Per RFC, the two Content-Type fields
have exactly the same semantics: "it is plain text, encoded as ASCII."

I would hope instead that it's the non-ASCII content that triggered
something (are we sure it's Mailman? could be an MTA somewhere along
the line) to qp-encode.

For example, the original mail may have included directed quotes or
similar hard-to-distinguish "fancy punctuation", but the composing MUA
didn't notice them and just randomly set charset=us-ascii.  Is there
quoted-printable (easily recognized by the "=" + 2 hex digits syntax)
in that MIME body in the mbox?  Another possibility was that it was a
very long line and it was qp-encoded ("=" CRLF inserted after a space)
to conform to RFC 822.
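
Both quoted-printable markers mentioned above, "=" followed by two hex digits and "=" at the end of a line, are easy to spot with Python 3's standard library. The sketch below is a rough heuristic (plain text that happens to contain something like "=AB" would also match), and the function name and sample bytes are made up for illustration.

```python
import quopri
import re

# '=' followed by two uppercase hex digits, e.g. '=C3'.
QP_ESCAPE = re.compile(rb'=[0-9A-F]{2}')


def looks_quoted_printable(body):
    """True if the body has '=XX' escapes or '=' soft line breaks."""
    return (bool(QP_ESCAPE.search(body))
            or b'=\r\n' in body
            or b'=\n' in body)


encoded = b'caf=C3=A9 and a very long =\r\nline'
print(looks_quoted_printable(encoded))
print(quopri.decodestring(encoded))
```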

 > > FWIW, using Thunderbird I posted the contents of the original email to a
 > > test list (on the same server, with the same lists configs) and as
 > > expected the archived message displays correctly as a plaintext
 > > email.

How did the contents get into the message in Thunderbird?  Copy-paste?
Yank from mailbox?  Forward?  If it's anything but the last,
Thunderbird almost surely massaged it on the way in.

 > > The headers this time show as:
 > > 
 > > === FROM EMAIL HEADER
 > > MIME-Version: 1.0
 > > Content-Type: text/plain; charset="us-ascii"
 > > Content-Transfer-Encoding: 7bit
 > > 
 > > === FROM MBOX
 > > MIME-Version: 1.0
 > > Content-Type: text/plain; charset=utf-8
 > > Content-Transfer-Encoding: 7bit
 > 
 > This is what should happen. I'm not sure which handler changed the
 > charset to utf-8, but I don't think that's significant.

This is RFC-ly bizarre.  Why would anything change the charset, unless
there were non-ASCII octets?  If something does make that change
despite the body being pure ASCII, I would argue that's a bug (very
old MUAs might refuse to display the message), although it's probably
irrelevant nowadays.

But if there *are* non-ASCII octets, the Content-Transfer-Encoding is
a lie.  I think the original message was probably broken as sent.
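
One way to test that claim against a captured body is a check like the
following Python 3 sketch. The function name is_7bit_clean is hypothetical,
and the rules are simplified from the usual definition of 7bit data: no NUL,
no octet above 127, and no line longer than 998 octets.

```python
def is_7bit_clean(body: bytes, max_line: int = 998) -> bool:
    """Return True if body could honestly be labelled 7bit (simplified check)."""
    # Any NUL or any octet with the high bit set rules out 7bit.
    if any(octet == 0 or octet > 127 for octet in body):
        return False
    # Every line, not counting its line ending, must fit the length limit.
    return all(len(line.rstrip(b"\r")) <= max_line
               for line in body.split(b"\n"))


print(is_7bit_clean(b"plain text\r\n"))
print(is_7bit_clean("caf\u00e9\r\n".encode("utf-8")))
print(is_7bit_clean(b"x" * 1000 + b"\r\n"))
```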

Steve



[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Sapiro
On 3/4/21 4:20 PM, Mark Dale via Mailman-Users wrote:
>>
>> Thanks Mark. I've copied the message from mbox file and pasted below.
>> (the client's info is marked as redacted).
>>
>> Something I did notice: in the original received email headers I see:


Original as received by the list or from the list?


>> Content-Disposition: inline
>> Content-Type: text/plain; charset="us-ascii"
>> Content-Transfer-Encoding: 7bit
>>
>> And in the mbox file below (of the same message), I see:
>>
>> Content-Disposition: inline
>> Content-Type: text/plain
>> Content-Transfer-Encoding: quoted-printable

OK, that's the issue. The message in the .mbox is after various list
manipulations, but before scrubbing for the pipermail archive, and
somehow the '; charset="us-ascii"' has been lost from the Content-Type:
header, which is why Scrubber scrubs it.


...
> 
> FWIW, using Thunderbird I posted the contents of the original email to a
> test list (on the same server, with the same lists configs) and as
> expected the archived message displays correctly as a plaintext email.
> 
> The headers this time show as:
> 
> === FROM EMAIL HEADER
> MIME-Version: 1.0
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: 7bit
> 
> === FROM MBOX
> MIME-Version: 1.0
> Content-Type: text/plain; charset=utf-8
> Content-Transfer-Encoding: 7bit


This is what should happen. I'm not sure which handler changed the
charset to utf-8, but I don't think that's significant.

Clearly there is something different in either the two lists or the
message that arrived at the lists, but my only guess is that the message that
arrived at the original list was already missing the charset="us-ascii".

Here's something you can try.

Create a file somewhere and give Mailman's group write permission on
it. Then apply the following patch to Mailman's scripts/post:

> === modified file 'scripts/post'
> --- scripts/post  2018-06-17 23:47:34 +
> +++ scripts/post  2021-03-05 01:16:56 +
> @@ -58,8 +58,12 @@
>      # some MTAs have a hard limit to the time a filter prog can run.  Postfix
>      # is a good example; if the limit is hit, the proc is SIGKILL'd giving us
>      # no chance to save the message.
> +    content = sys.stdin.read()
> +    if listname == LISTNAME:
> +        with open('PATH_TO_FILE', 'a') as fp:
> +            fp.write(content)
>      inq = get_switchboard(mm_cfg.INQUEUE_DIR)
> -    inq.enqueue(sys.stdin.read(),
> +    inq.enqueue(content,
>                  listname=listname,
>                  tolist=1, _plaintext=1)
>  
> 


where LISTNAME is the problem list's name (written as a quoted string) and
PATH_TO_FILE is the path to the file you created. This file will accumulate
all the incoming messages posted to the list, exactly as they are received
by Mailman.

Then we can at least know for sure what that message looks like.

Of course, once you have captured one such message, you can revert the
patch.
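
The idea behind the patch can be sketched outside Mailman as follows. This is
an illustrative Python 3 stand-in, not the patched script: the function
capture_then_enqueue, its parameters, and the list names are hypothetical, and
a plain Python list takes the place of Mailman's queue. The point is that the
input stream can be read only once, so the content is held in a variable and
then both appended to the file and passed on.

```python
import io
import os
import tempfile


def capture_then_enqueue(stream, listname, target_list, capture_path, enqueue):
    """Read the message once, append it to capture_path for one list, enqueue it."""
    content = stream.read()  # a pipe such as stdin can only be consumed once
    if listname == target_list:
        with open(capture_path, "a") as fp:
            fp.write(content)
    enqueue(content)
    return content


# Demonstration with in-memory streams and a throwaway capture file.
queued = []
fd, path = tempfile.mkstemp()
os.close(fd)
raw = "Content-Type: text/plain\n\nbody\n"
capture_then_enqueue(io.StringIO(raw), "mylist", "mylist", path, queued.append)
capture_then_enqueue(io.StringIO(raw), "otherlist", "mylist", path, queued.append)
with open(path) as fp:
    captured = fp.read()
os.remove(path)

print(len(queued), "messages queued;", len(captured), "characters captured")
```

Only the post to the target list lands in the file, while every post is still
queued unchanged.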

-- 
Mark Sapiro The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Dale via Mailman-Users



 Original Message 
From: Mark Dale via Mailman-Users [mailto:mailman-users@python.org]
Sent: Friday, March 5, 2021, 00:01 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> 
>  Original Message 
> From: Mark Sapiro [mailto:m...@msapiro.net]
> Sent: Thursday, March 4, 2021, 06:04 UTC
> To: mailman-users@python.org
> Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment
> 
>> On 3/3/21 3:36 PM, Mark Dale via Mailman-Users wrote:
>>> Hi Listers,
>>>
>>> I've got a client's list to which a plain text email notice is sent
>>> everyday.
>>>
>>> Mailman-Version: 2.1.34
>>> Postfix
>>> Debian 10
>>>
>>> The message contents are fairly similar each day and the text message
>>> renders in email clients just fine and the .mbox file reads fine also.
>>>
>>> However, Pipermail renders each archived message as a ".ksh" attachment.
>>>
>>>
>>> *
>>>
>>> >From noreply at XXX.com  Wed Mar  3 12:06:05 2021
>>> From: noreply at XXX.com (noreply at XXX.com)
>>> Date: Wed, 3 Mar 2021 07:06:05 -0500
>>> Subject: [XXX] XXX Published daily - 26097
>>> Message-ID: <202103031206.123C662K015222@XXX>
>>>
>>> An embedded and charset-unspecified text was scrubbed...
>>> Name: not available
>>> URL:
>>> <https://XXX/pipermail/XXX/attachments/20210303/06a36d4b/attachment.ksh>
>>>
>>> *
>>>
>>> The message header contains:
>>>
>>> Content-Disposition: inline
>>> MIME-Version: 1.0
>>> X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)
>>> ...
>>> Content-Type: text/plain; charset="us-ascii"
>>> Content-Transfer-Encoding: 7bit
>>>
>>>
>>> Could anyone point me in the right direction to get the archived
>>> messages to display their content as text in the message body instead of
>>> as an attachment?
>>
>> In order to help with this, I need to see the complete MIME structure of
>> the message. I.e. all the Content-Type: headers including the top level
>> and all the boundaries. This information for a message received from the
>> list is what I want to see as that's what Scrubber sees. Or the message
>> from the archives/private/listname.mbox/listname.mbox would do too.
>>
>> Scrubber should not be scrubbing the 'Content-Type: text/plain;
>> charset="us-ascii"' part with the message 'An embedded and
>> charset-unspecified text was scrubbed...'. Something else is going on here.
>>
>> The .ksh extension comes from the Python library function
>> mimetypes.guess_all_extensions, which returns the list
>>
>> ['.ksh', '.bat', '.h', '.txt', '.pl', '.c', '.asc', '.text', '.pot',
>> '.brf', '.srt']
>>
>> for text/plain and we arbitrarily pick the first one which is .ksh, but
>> we shouldn't be doing that with a text/plain part with a declared charset.
>>
> 
> ==
> 
> Thanks Mark. I've copied the message from mbox file and pasted below.
> (the client's info is marked as redacted).
> 
> Something I did notice: in the original received email headers I see:
> 
> Content-Disposition: inline
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: 7bit
> 
> And in the mbox file below (of the same message), I see:
> 
> Content-Disposition: inline
> Content-Type: text/plain
> Content-Transfer-Encoding: quoted-printable
> 
> 
> 
> 
>>From noreply@REDACTED  Thu Mar  4 12:04:42 2021
> Return-Path: 
> X-Original-To: REDACTED@lists.REDACTED
> Delivered-To: REDACTED@lists.REDACTED
> Received: from alln-iport-1.REDACTED (alln-iport-1.REDACTED [173.37.142.88])
>  by mailmanlists.network (Postfix) with ESMTPS id EDCEB1FFCA
>  for ; Thu,  4 Mar 2021 12:04:41 + (UTC)
> X-IPAS-Result: =?us-ascii?q?A0CcAgAZzEBgmJpdJa1iHgEBCxIMghGEL412pSQLAQEBD?=
>  =?us-ascii?q?zQEAQGFBIFFAiU5BQ0CAwEBAQMCAwEBAQEFAQEBAgEGBBQBAQEBAQEBAYZDi?=
>  =?us-ascii?q?TOCUoMIrWIBAQGCJok2gS2BOYsegiQmHIILgUQDgSiMExoEhVWNUAGREpt7h?=
>  =?us-ascii?q?EiGco1JhkWDcp9kC4YsrEqEGoFsIIFZcIM6TxkNVY1jjk0jAQJnAgYKAQEDC?=
>  =?us-ascii?q?YwTAQE?=
> X-IronPort-Anti-Spam-Filtered: true
> X-IronPort-AV: E=Sophos;i="5.81,222,1610409600"; d="scan'208";a="656725958"
> Received: from rcdn-core-3.REDACTED ([173.37.93.154])
>  by alln-iport-1.REDACTED with ESMTP/TLS/DHE-RSA-SEED-SHA;
>  04 Mar 2021 12:04:41 +
> Received:

[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-04 Thread Mark Dale via Mailman-Users


 Original Message 
From: Mark Sapiro [mailto:m...@msapiro.net]
Sent: Thursday, March 4, 2021, 06:04 UTC
To: mailman-users@python.org
Subject: [Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

> On 3/3/21 3:36 PM, Mark Dale via Mailman-Users wrote:
>> Hi Listers,
>>
>> I've got a client's list to which a plain text email notice is sent
>> everyday.
>>
>> Mailman-Version: 2.1.34
>> Postfix
>> Debian 10
>>
>> The message contents are fairly similar each day and the text message
>> renders in email clients just fine and the .mbox file reads fine also.
>>
>> However, Pipermail renders each archived message as a ".ksh" attachment.
>>
>>
>> *
>>
>> >From noreply at XXX.com  Wed Mar  3 12:06:05 2021
>> From: noreply at XXX.com (noreply at XXX.com)
>> Date: Wed, 3 Mar 2021 07:06:05 -0500
>> Subject: [XXX] XXX Published daily - 26097
>> Message-ID: <202103031206.123C662K015222@XXX>
>>
>> An embedded and charset-unspecified text was scrubbed...
>> Name: not available
>> URL:
>> <https://XXX/pipermail/XXX/attachments/20210303/06a36d4b/attachment.ksh>
>>
>> *
>>
>> The message header contains:
>>
>> Content-Disposition: inline
>> MIME-Version: 1.0
>> X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)
>> ...
>> Content-Type: text/plain; charset="us-ascii"
>> Content-Transfer-Encoding: 7bit
>>
>>
>> Could anyone point me in the right direction to get the archived
>> messages to display their content as text in the message body instead of
>> as an attachment?
> 
> In order to help with this, I need to see the complete MIME structure of
> the message. I.e. all the Content-Type: headers including the top level
> and all the boundaries. This information for a message received from the
> list is what I want to see as that's what Scrubber sees. Or the message
> from the archives/private/listname.mbox/listname.mbox would do too.
> 
> Scrubber should not be scrubbing the 'Content-Type: text/plain;
> charset="us-ascii"' part with the message 'An embedded and
> charset-unspecified text was scrubbed...'. Something else is going on here.
> 
> The .ksh extension comes from the Python library function
> mimetypes.guess_all_extensions, which returns the list
> 
> ['.ksh', '.bat', '.h', '.txt', '.pl', '.c', '.asc', '.text', '.pot',
> '.brf', '.srt']
> 
> for text/plain and we arbitrarily pick the first one which is .ksh, but
> we shouldn't be doing that with a text/plain part with a declared charset.
> 

==

Thanks Mark. I've copied the message from mbox file and pasted below.
(the client's info is marked as redacted).

Something I did notice: in the original received email headers I see:

Content-Disposition: inline
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

And in the mbox file below (of the same message), I see:

Content-Disposition: inline
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable




>From noreply@REDACTED  Thu Mar  4 12:04:42 2021
Return-Path: 
X-Original-To: REDACTED@lists.REDACTED
Delivered-To: REDACTED@lists.REDACTED
Received: from alln-iport-1.REDACTED (alln-iport-1.REDACTED [173.37.142.88])
 by mailmanlists.network (Postfix) with ESMTPS id EDCEB1FFCA
 for ; Thu,  4 Mar 2021 12:04:41 + (UTC)
X-IPAS-Result: =?us-ascii?q?A0CcAgAZzEBgmJpdJa1iHgEBCxIMghGEL412pSQLAQEBD?=
 =?us-ascii?q?zQEAQGFBIFFAiU5BQ0CAwEBAQMCAwEBAQEFAQEBAgEGBBQBAQEBAQEBAYZDi?=
 =?us-ascii?q?TOCUoMIrWIBAQGCJok2gS2BOYsegiQmHIILgUQDgSiMExoEhVWNUAGREpt7h?=
 =?us-ascii?q?EiGco1JhkWDcp9kC4YsrEqEGoFsIIFZcIM6TxkNVY1jjk0jAQJnAgYKAQEDC?=
 =?us-ascii?q?YwTAQE?=
X-IronPort-Anti-Spam-Filtered: true
X-IronPort-AV: E=Sophos;i="5.81,222,1610409600"; d="scan'208";a="656725958"
Received: from rcdn-core-3.REDACTED ([173.37.93.154])
 by alln-iport-1.REDACTED with ESMTP/TLS/DHE-RSA-SEED-SHA;
 04 Mar 2021 12:04:41 +
Received: from mail.vrt.REDACTED ([10.83.44.69])
 by rcdn-core-3.REDACTED (8.15.2/8.15.2) with SMTP id 124C4e83021704
 for ; Thu, 4 Mar 2021 12:04:40 GMT
Message-Id: <202103041204.124C4e83021704@rcdn-core-3.REDACTED>
Received: from localhost.localdomain (sigmanager.vrt.REDACTED
 [10.7.89.25])
 by mail.vrt.REDACTED (Postfix) with ESMTP id 7E95E424D5
 for ; Thu,  4 Mar 2021 12:04:40 + (UTC)
Content-Disposition: inline
Content-Type: text/plain
MIME-Version: 1.0
X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)
Date: Thu, 4 Mar 2021 07:04:40 -0500
From: noreply@REDACTED
To: REDACTED@lists.REDACTED
Content-Transfer-Encoding: quoted-printable
X-Outbound-SMTP-Client: 10.83.44.69, [10.83.44.69]
X-Outbound-Node: rcdn-c

[Mailman-Users] Re: Pipermail scrubbing ascii txt to ksh attachment

2021-03-03 Thread Mark Sapiro
On 3/3/21 3:36 PM, Mark Dale via Mailman-Users wrote:
> Hi Listers,
> 
> I've got a client's list to which a plain text email notice is sent
> everyday.
> 
> Mailman-Version: 2.1.34
> Postfix
> Debian 10
> 
> The message contents are fairly similar each day and the text message
> renders in email clients just fine and the .mbox file reads fine also.
> 
> However, Pipermail renders each archived message as a ".ksh" attachment.
> 
> 
> *
> 
>>From noreply at XXX.com  Wed Mar  3 12:06:05 2021
> From: noreply at XXX.com (noreply at XXX.com)
> Date: Wed, 3 Mar 2021 07:06:05 -0500
> Subject: [XXX] XXX Published daily - 26097
> Message-ID: <202103031206.123C662K015222@XXX>
> 
> An embedded and charset-unspecified text was scrubbed...
> Name: not available
> URL:
> <https://XXX/pipermail/XXX/attachments/20210303/06a36d4b/attachment.ksh>
> 
> *
> 
> The message header contains:
> 
> Content-Disposition: inline
> MIME-Version: 1.0
> X-Mailer: MIME::Lite 3.031 (F2.85; T2.17; A2.21; B3.15; Q3.13)
> ...
> Content-Type: text/plain; charset="us-ascii"
> Content-Transfer-Encoding: 7bit
> 
> 
> Could anyone point me in the right direction to get the archived
> messages to display their content as text in the message body instead of
> as an attachment?

In order to help with this, I need to see the complete MIME structure of
the message. I.e. all the Content-Type: headers including the top level
and all the boundaries. This information for a message received from the
list is what I want to see as that's what Scrubber sees. Or the message
from the archives/private/listname.mbox/listname.mbox would do too.
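
For anyone wanting to pull that information out of a saved message, a minimal
Python 3 sketch along these lines should do. The function name mime_structure
and the sample message are hypothetical; only standard email package calls are
used.

```python
from email import message_from_bytes


def mime_structure(raw: bytes):
    """List (type, charset, disposition, transfer encoding, boundary) per part."""
    rows = []
    # walk() yields the top-level message first, then every nested part.
    for part in message_from_bytes(raw).walk():
        rows.append((
            part.get_content_type(),
            part.get_content_charset(),          # None if no charset is declared
            part.get("Content-Disposition"),
            part.get("Content-Transfer-Encoding"),
            part.get_boundary(),                 # None for non-multipart parts
        ))
    return rows


# Made-up single-part message using the headers seen in the mbox copy.
sample = (
    b"Content-Disposition: inline\r\n"
    b"Content-Type: text/plain\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"body\r\n"
)
rows = mime_structure(sample)
for row in rows:
    print(row)
```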

Scrubber should not be scrubbing the 'Content-Type: text/plain;
charset="us-ascii"' part with the message 'An embedded and
charset-unspecified text was scrubbed...'. Something else is going on here.

The .ksh extension comes from the Python library function
mimetypes.guess_all_extensions, which returns the list

['.ksh', '.bat', '.h', '.txt', '.pl', '.c', '.asc', '.text', '.pot',
'.brf', '.srt']

for text/plain and we arbitrarily pick the first one which is .ksh, but
we shouldn't be doing that with a text/plain part with a declared charset.
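
The lookup is easy to reproduce with the standard mimetypes module. The
following is a Python 3 sketch rather than the Scrubber code, and the
contents and order of the list depend on the Python version and on the
system's mime.types files, so the first entry may not be .ksh everywhere.

```python
import mimetypes

# All extensions the local type map associates with text/plain.
extensions = mimetypes.guess_all_extensions("text/plain")

# Taking the first entry mirrors the arbitrary pick described above.
first = extensions[0] if extensions else None

print(extensions)
print("first:", first)
print("guess_extension:", mimetypes.guess_extension("text/plain"))
```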

-- 
Mark Sapiro The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan