R. David Murray <rdmur...@bitdance.com> added the comment:

This is one of the infelicities of translating the old API to Python 3:
get_payload(decode=True) actually means "give me the bytes version of this
payload", which in this case is the utf-8 bytes, and that is what you got.
get_payload() means "give me the payload as a string without doing CTE
decoding".  By a sort of accident of translation, this turns out to mean
"give me the unicode" in this particular case.  If the payload had been
base64 encoded, you'd have gotten a unicode string containing the base64
characters.

Which I grant you is all very confusing.
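
To make the two meanings concrete, here is a minimal sketch using the old
(compat32) API.  The sample messages are built inline for illustration and
are not the reporter's actual attachment; the variable names are hypothetical.

```python
import base64
import email

text = "Beyo\u011flu-\u0130st"  # 'Beyoğlu-İst'
headers = b"MIME-Version: 1.0\nContent-Type: text/plain; charset=utf-8\n"

# The same text under two different Content-Transfer-Encodings
# (hypothetical sample messages, assumed for this sketch).
raw_8bit = headers + b"Content-Transfer-Encoding: 8bit\n\n" + text.encode("utf-8")
raw_b64 = (headers + b"Content-Transfer-Encoding: base64\n\n"
           + base64.b64encode(text.encode("utf-8")) + b"\n")

# No policy argument, so these use the old compat32 behaviour.
m_8bit = email.message_from_bytes(raw_8bit)
m_b64 = email.message_from_bytes(raw_b64)

# 8bit: get_payload() happens to give the unicode text,
# get_payload(decode=True) gives the utf-8 bytes.
legacy_8bit_str = m_8bit.get_payload()
legacy_8bit_bytes = m_8bit.get_payload(decode=True)

# base64: get_payload() gives a str of base64 characters (CTE not undone),
# get_payload(decode=True) undoes the CTE and again gives utf-8 bytes.
legacy_b64_str = m_b64.get_payload()
legacy_b64_bytes = m_b64.get_payload(decode=True)

for value in (legacy_8bit_str, legacy_8bit_bytes,
              legacy_b64_str, legacy_b64_bytes):
    print(repr(value))
```

In both cases decode=True ends at bytes, never at unicode; only the plain
get_payload() result changes shape depending on the CTE.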

For a more consistent API, use the new one:

>>> import email.policy
>>> m = email.message_from_bytes(msg_bytes, policy=email.policy.default)
>>> bytes(m)
b'MIME-Version: 1.0\nContent-Type: text/plain;\n charset=utf-8\nContent-Transfer-Encoding: 8bit\nContent-Disposition: attachment;\n filename="camper_store.csv"\n\nBeyo\xc4\x9flu-\xc4\xb0st'

>>> m.get_content()
'Beyoğlu-İst'

Here we don't even pretend that you have any use for the encoded version, 
either CTE encoding or binary encoding: get_content gives you the "fully 
decoded" payload (decoded from CTE *and* decoded to unicode).
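
As a rough check that get_content() does not depend on the CTE, the sketch
below parses the same text as 8bit and as base64 under
email.policy.default.  The sample messages and variable names are again
assumptions made for illustration, not taken from the original report.

```python
import base64
import email
import email.policy

text = "Beyo\u011flu-\u0130st"  # 'Beyoğlu-İst'
headers = b"MIME-Version: 1.0\nContent-Type: text/plain; charset=utf-8\n"

# Hypothetical sample messages: same text, two transfer encodings.
raw_8bit = headers + b"Content-Transfer-Encoding: 8bit\n\n" + text.encode("utf-8")
raw_b64 = (headers + b"Content-Transfer-Encoding: base64\n\n"
           + base64.b64encode(text.encode("utf-8")) + b"\n")

# With policy=email.policy.default the parser returns EmailMessage objects,
# which provide get_content().
new_8bit = email.message_from_bytes(raw_8bit, policy=email.policy.default)
new_b64 = email.message_from_bytes(raw_b64, policy=email.policy.default)

# Both calls undo the CTE and then decode to str using the declared charset.
content_8bit = new_8bit.get_content()
content_b64 = new_b64.get_content()

print(repr(content_8bit))
print(repr(content_b64))
```

Both results should be the same str, whichever transfer encoding the sender
happened to choose.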

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue25545>
_______________________________________