Roger Lynn wrote:
>
>I'm running Mailman 2.1.7, packaged for Debian (although I don't think
>that's relevant to this question). A list that I administer has non-digest
>scrubbing enabled. An email was recently sent to it with the following headers:
>
>Content-Type: text/plain
>Content-Disposition: inline
>MIME-Version: 1.0
>X-Mailer: MIME-tools 5.411 (Entity 5.404)
>Date: Mon, 01 May 2006 18:47:30 +0100
>Subject: [...]
>To: [...]
>From: [...]
>X-Mailer: SINA Webmail 6.00.
>Reply-To: [...]
>X-Sina-Mail-Agent: sinadeliver-6.00-1.97
>Message-Id: [...]
>X-Virus-Scanned: by myinternet myAV on ngflrtr1
>Content-Transfer-Encoding: quoted-printable

This looks like a malformed message. The issue is the
Content-Disposition: inline header, which should only appear in sub-part
headers, not in the message headers.

>This resulted in the contents of the email being replaced with:
>
>An embedded and charset-unspecified text was scrubbed...
>Name: not available
>Url: http://[...]/attachments/20060501/aad799ed/attachment.ksh
>
>Why is it necessary to scrub plain text in this instance, when no character
>set is specified? Couldn't it just be assumed that it is us-ascii?

It is a bug, or at least insufficiently robust code. We shouldn't be
relying on the Content-Disposition: header to identify a sub-part.

>If I were to comment out the following code from process() in Scrubber.py,
>would there be any consequences other than allowing messages like the above
>through to the list?

Yes. The consequence is that you could get a message containing an
actual "charset-unspecified text" attachment whose real character set
differs from that of the first text/plain part. These two parts, with
perhaps incompatible character sets, would then be 'flattened' together
into one part.

Here is a suggested change to the code you quoted. Replace

            if part.get('content-disposition') and \
                   not part.get_content_charset():
                omask = os.umask(002)

with

            if part.get('content-disposition') and \
                   msg.is_multipart() and \
                   not part.get_content_charset():
                omask = os.umask(002)

This is not really a proper fix, but I think it will avoid the problem
in your case.

>
>Incidentally, why does the attachment have the suffix ".ksh"? It seems
>rather unusual. I'm using the following settings:
>
>SCRUBBER_DONT_USE_ATTACHMENT_FILENAME = False
>SCRUBBER_USE_ATTACHMENT_FILENAME_EXTENSION = True

There is no 'filename' in what we mistakenly think is an attachment, so
we guess the extension based on the Content-Type:, which is text/plain.
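
To illustrate the effect of the suggested msg.is_multipart() test, here is
a minimal sketch using the standard email package in current Python (not
the Python 2 code in Scrubber.py). The helper names scrubbed_by_original
and scrubbed_by_patched are hypothetical and only mirror the two
conditions quoted in this message; the sketch also assumes that, for a
non-multipart message, the only "part" examined is the message itself.

```python
from email import message_from_string
from email.message import Message
from email.mime.multipart import MIMEMultipart

# A single-part message with the problematic top-level headers.
RAW = """\
Content-Type: text/plain
Content-Disposition: inline
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable

Hello list
"""


def scrubbed_by_original(msg, part):
    # Mirrors the original condition: disposition present, no charset.
    return bool(part.get('content-disposition')) and \
        not part.get_content_charset()


def scrubbed_by_patched(msg, part):
    # Mirrors the suggested condition: additionally require a multipart message.
    return bool(part.get('content-disposition')) and \
        msg.is_multipart() and \
        not part.get_content_charset()


# Case 1: the malformed single-part message (assumed to be its own only part).
single = message_from_string(RAW)
single_original = scrubbed_by_original(single, single)
single_patched = scrubbed_by_patched(single, single)

# Case 2: a real sub-part with a disposition and no declared charset.
attachment = Message()
attachment['Content-Type'] = 'text/plain'
attachment['Content-Disposition'] = 'attachment'
attachment.set_payload('some text')
outer = MIMEMultipart()
outer.attach(attachment)
multi_patched = scrubbed_by_patched(outer, attachment)

print(single_original, single_patched, multi_patched)
```

Under these assumptions the single-part message matches the original
condition but not the patched one, while a genuine charset-unspecified
sub-part of a multipart message still matches the patched condition.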

We effectively use the Python library call

    mimetypes.guess_all_extensions('text/plain', strict=False)

which returns the list

    ['.ksh', '.asc', '.h', '.c', '.txt']

and we pick the first one.

-- 
Mark Sapiro <[EMAIL PROTECTED]>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan

------------------------------------------------------
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
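
For anyone who wants to see what the extension guess described above does
on their own system, here is a small sketch using the standard mimetypes
module. The contents and order of the returned list depend on the Python
version and on any system mime.types files, so no particular output is
claimed here. The preferred_guess helper is hypothetical; it is one
possible way to avoid a surprising suffix and is not what Mailman does.

```python
import mimetypes


def first_guess(ctype):
    # Take whatever extension happens to be listed first, as described above.
    exts = mimetypes.guess_all_extensions(ctype, strict=False)
    return exts[0] if exts else None


def preferred_guess(ctype, preferred='.txt'):
    # Hypothetical alternative: use a preferred extension when it is among
    # the candidates, otherwise fall back to the first candidate.
    exts = mimetypes.guess_all_extensions(ctype, strict=False)
    if preferred in exts:
        return preferred
    return exts[0] if exts else None


all_exts = mimetypes.guess_all_extensions('text/plain', strict=False)
print(all_exts)                       # order varies between systems
print(first_guess('text/plain'))      # whichever extension is listed first
print(preferred_guess('text/plain'))  # '.txt' when it is a candidate
```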