I have an application that processes MIME messages. It reads a message from a file, looks for the text/html and text/plain parts in it, performs some processing on these parts, and outputs the new message.
Since I upgraded Python to 2.4.3, the output messages have been coming out garbled, as a block of junk characters. I traced the problem back to a few lines that were removed from the email package: the new Python no longer encodes the payload when converting the MIME message to a string.

Since my program must work on several computers, each with a different version of Python, I had to find a way to make it work correctly whether or not msg.as_string() encodes the payload. Here is a piece of code that demonstrates how to work around this problem:

.................. code start ................
import email
import email.MIMEText
import email.Charset

def do_some_processing(s):
    """Return the input text or HTML string after processing it in some way."""
    # For the sake of this example, we only do some trivial processing.
    return s.replace('foo','bar')

msg = email.message_from_string(file('input_mime_msg','r').read())
utf8 = email.Charset.Charset('UTF-8')
for part in msg.walk():
    if part.is_multipart():
        continue
    if part.get_content_type() in ('text/plain','text/html'):
        # True means decode the payload, which is normally base64-encoded.
        s = part.get_payload(None, True)
        # s is now a string containing just the text or html of the part,
        # not encoded in any way.
        s = do_some_processing(s)
        # Starting with Python 2.4.3 or so, msg.as_string() no longer encodes
        # the payload according to the charset, so we have to do it ourselves here.
        # The trick is to create a message-part with 'x' as payload and see
        # if it got encoded or not.
        should_encode = (email.MIMEText.MIMEText('x', 'html', 'UTF-8').get_payload() != 'x')
        if should_encode:
            s = utf8.body_encode(s)
        part.set_payload(s, utf8)
        # The next two lines may be necessary if the original input message
        # uses a different encoding than the one used in the email package.
        # In that case we have to replace the Content-Transfer-Encoding header
        # to indicate the new encoding.
        del part['Content-Transfer-Encoding']
        part['Content-Transfer-Encoding'] = utf8.get_body_encoding()

file('output_mime_msg','w').write(msg.as_string())
.................. code end ................

Hope this helps someone out there.
(Permission is hereby granted for anybody to use this piece of code for any purpose whatsoever)
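A note for anyone reading this on Python 3: the same read, process and re-encode loop can probably be written without the version check. The following is only a rough sketch under a few assumptions that are not part of the code above: it targets the Python 3 email package (where get_payload(decode=True) returns bytes), it always re-encodes the part as UTF-8, it falls back to us-ascii when a part declares no charset, and the function name rewrite_text_parts is made up for illustration.

```python
import email

def do_some_processing(s):
    """Trivial stand-in for the real processing, as in the post."""
    return s.replace('foo', 'bar')

def rewrite_text_parts(raw):
    """Hypothetical helper: process every text/plain and text/html part
    of the message in raw (a str) and return the new message as a str."""
    msg = email.message_from_string(raw)
    for part in msg.walk():
        if part.is_multipart():
            continue
        if part.get_content_type() not in ('text/plain', 'text/html'):
            continue
        # decode=True undoes base64 / quoted-printable and returns bytes.
        data = part.get_payload(decode=True)
        # Assumption: treat a part without a declared charset as us-ascii.
        in_charset = part.get_content_charset() or 'us-ascii'
        text = do_some_processing(data.decode(in_charset, 'replace'))
        # Remove the old header first; with no Content-Transfer-Encoding
        # present, set_payload() with a charset encodes the body and adds
        # a matching header itself.
        del part['Content-Transfer-Encoding']
        part.set_payload(text, 'utf-8')
    return msg.as_string()
```

Deleting the old Content-Transfer-Encoding header before calling set_payload() matters here: if the header is still present, the payload is stored as given and the stale header would describe the wrong encoding.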