New submission from R. David Murray <rdmur...@bitdance.com>:

In Python2, this works:
>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'From nobody Tue Mar 13 15:44:59 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
'From nobody Tue Mar 13 15:48:11 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: =?utf-8?q?=C3=89_test?=\n\nabc'

That is, unicode strings automatically get turned into encoded words.

In Python3 this no longer works:

>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'Content-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 154, in __str__
    return self.as_string()
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 168, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 99, in flatten
    self._write(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 152, in _write
    self._write_headers(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 186, in _write_headers
    header_name=h)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 205, in __init__
    self.append(s, charset, errors)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 286, in append
    s.encode(output_charset, errors)
UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 0: ordinal not in range(128)

Presumably the problem is that the Python2 code tests for 'string' and, if the value isn't a string, handles it by CTE encoding it. In Python3 everything is a string. Probably what should happen is that the encoding error should be caught and the CTE encoding done at that point, based on the model of how Python2 handled unicode strings.
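The fallback proposed above could look roughly like the following. This is only a sketch of the idea, not the patch for this issue: the helper name encode_header_value is hypothetical, and choosing utf-8 as the fallback charset is an assumption modelled on the Python2 output shown earlier. It relies only on the existing email.header.Header class.

```python
from email.header import Header


def encode_header_value(value, name=None):
    """Hypothetical helper: return a header value that is safe to emit.

    Pure-ASCII values are passed through unchanged. Anything else is
    turned into an RFC 2047 encoded word, assuming utf-8 as the charset.
    """
    try:
        # Same check that currently fails inside Header.append().
        value.encode('ascii')
        return value
    except UnicodeEncodeError:
        # Catch the error and do the encoding at this point instead
        # of letting the exception propagate out of str(msg).
        return Header(value, charset='utf-8', header_name=name).encode()


if __name__ == '__main__':
    print(encode_header_value('plain subject'))
    print(encode_header_value('É test', name='Subject'))
```

Until the generator does this itself, a caller can get the same effect by assigning Header('É test', 'utf-8') to m['Subject'] rather than a bare string.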
----------
assignee: r.david.murray
components: Library (Lib)
keywords: easy
messages: 155656
nosy: aikinci, r.david.murray
priority: high
severity: normal
stage: needs patch
status: open
title: Regression in Python3 of email handling of unicode strings in headers
type: behavior
versions: Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14291>
_______________________________________