New submission from R. David Murray <[email protected]>:
In Python2, this works:
>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'From nobody Tue Mar 13 15:44:59 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
'From nobody Tue Mar 13 15:48:11 2012\nContent-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\nSubject: =?utf-8?q?=C3=89_test?=\n\nabc'
That is, unicode strings automatically get turned into encoded words.
In Python3 this no longer works:
>>> from email.mime.text import MIMEText
>>> m = MIMEText('abc')
>>> str(m)
'Content-Type: text/plain; charset="us-ascii"\nMIME-Version: 1.0\nContent-Transfer-Encoding: 7bit\n\nabc'
>>> m['Subject'] = u'É test'
>>> str(m)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 154, in __str__
    return self.as_string()
  File "/home/rdmurray/python/p33/Lib/email/message.py", line 168, in as_string
    g.flatten(self, unixfrom=unixfrom)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 99, in flatten
    self._write(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 152, in _write
    self._write_headers(msg)
  File "/home/rdmurray/python/p33/Lib/email/generator.py", line 186, in _write_headers
    header_name=h)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 205, in __init__
    self.append(s, charset, errors)
  File "/home/rdmurray/python/p33/Lib/email/header.py", line 286, in append
    s.encode(output_charset, errors)
UnicodeEncodeError: 'ascii' codec can't encode character '\xc9' in position 0: ordinal not in range(128)
Presumably the problem is that the Python2 code tests whether the value is
a 'string' and, if it isn't, handles it by CTE encoding it. In Python3
everything is a string, so that test never triggers. Probably what should
happen is that the encoding error is caught and the CTE encoding is done at
that point, modeled on how Python2 handled unicode strings.
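
As an illustration of that idea (not the actual patch), here is a minimal
sketch of a user-level workaround: try ASCII first and, only if that fails,
fall back to an encoded word via email.header.Header. The helper name
encode_header_value and its fallback_charset parameter are hypothetical and
are not part of the email package.

```python
from email.header import Header, decode_header
from email.mime.text import MIMEText


def encode_header_value(value, fallback_charset="utf-8"):
    """Return `value` unchanged if it is pure ASCII; otherwise return it
    as an RFC 2047 encoded word. Hypothetical helper, for illustration."""
    try:
        value.encode("ascii")
    except UnicodeEncodeError:
        # Mirror the Python2 behaviour: non-ASCII text becomes an encoded word.
        return Header(value, fallback_charset).encode()
    return value


msg = MIMEText("abc")
msg["Subject"] = encode_header_value("É test")
flattened = msg.as_string()

# Decoding the stored header gives back the original text.
raw, charset = decode_header(msg["Subject"])[0]
roundtrip = raw.decode(charset)
```

With this in place, msg.as_string() no longer depends on the generator being
able to encode the raw header as ASCII, because the header value stored on
the message is already ASCII-only.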
----------
assignee: r.david.murray
components: Library (Lib)
keywords: easy
messages: 155656
nosy: aikinci, r.david.murray
priority: high
severity: normal
stage: needs patch
status: open
title: Regression in Python3 of email handling of unicode strings in headers
type: behavior
versions: Python 3.2, Python 3.3
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue14291>
_______________________________________