On Tue, Feb 04 2014, "W. Trevor King" <wking at tremily.us> wrote:
>
> >>> from __future__ import unicode_literals
> >>> import codecs
> >>> import locale
> >>> import sys
> >>> print(locale.getpreferredencoding())  # same as yours
> UTF-8
> >>> print(sys.getdefaultencoding())  # same as yours
> ascii
> >>> _ENCODING = locale.getpreferredencoding() or sys.getdefaultencoding()
> >>> print(_ENCODING)  # double-check default encodings
> UTF-8
> >>> byte_stream = sys.stdout  # copied from Page.write
> >>> stream = codecs.getwriter(encoding=_ENCODING)(stream=byte_stream)
> >>> data = {'from': '\u017b'}  # fake the troublesome data
> >>> print(type(data['from']))  # double-check unicode_literals
> <type 'unicode'>
> >>> string = ' <td>{from}</td>\n'.format(**data)
> >>> stream.write(string)
> <td>?</td>
>
> It looks like you'll have the same _ENCODING as I do (UTF-8).  That
> means your stream should be wrapped in a UTF-8 StreamWriter, so I
> don't understand why it's converting to ASCII.  Can you run through
> the above on your troublesome machine and confirm that stream.write()
> is still raising the exception?  If it doesn't work, can you just
> paste that whole run in your next email?

I don't know what to paste, so I paste this:

$ python
Python 2.6.6 (r266:84292, Nov 21 2013, 12:39:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = {'from': '\u017b'}
>>> print(type(data['from']))
<type 'str'>
>>> string = ' <td>{from}</td>\n'.format(**data)
>>> print string
<td>\u017b</td>

and then:

>>> data = {'from': u'\u017b'}
>>> print(type(data['from']))
<type 'unicode'>
>>> string = ' <td>{from}</td>\n'.format(**data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017b' in position 0: ordinal not in range(128)

... and ...

>>> import os
>>> print os.environ['LANG']
en_US.UTF-8

> Thanks,
> Trevor

Tomi
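
For comparison, the two runs above differ from the quoted session in one respect: neither starts with `from __future__ import unicode_literals`, so the template ' <td>{from}</td>\n' is a byte string. On Python 2, formatting a byte-string template with a unicode value goes through an implicit ASCII encode, which would explain the traceback in the second run. Below is a minimal sketch of the quoted steps with the import in place. It is not necessarily the fix the project ends up with, and the io.BytesIO buffer and the hard-coded 'utf-8' are assumptions standing in for sys.stdout and _ENCODING so that the written bytes can be inspected.

```python
from __future__ import unicode_literals  # plain '...' literals become unicode on Python 2

import codecs
import io

# Fake the troublesome data, as in the quoted session.
data = {'from': '\u017b'}

# The template is unicode too, so .format() never needs an implicit ASCII encode.
template = ' <td>{from}</td>\n'
string = template.format(**data)

# Assumption: an in-memory buffer stands in for sys.stdout, and the
# encoding is fixed to UTF-8 instead of being read from the locale.
byte_stream = io.BytesIO()
stream = codecs.getwriter('utf-8')(byte_stream)
stream.write(string)

encoded = byte_stream.getvalue()  # the bytes a UTF-8 terminal would receive
```

If this runs cleanly on the troublesome machine, the StreamWriter is probably fine and the exception comes from the .format() call on a byte-string template, before stream.write() is ever reached.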