Marc 'BlackJack' Rintsch wrote:
On Mon, 01 Sep 2008 02:27:54 -0400, Terry Reedy wrote:
I doubt the OP 'chose' cp437. Why is Python using cp437 even when the
default encoding is utf-8?
On WinXP
>>> sys.getdefaultencoding()
'utf-8'
>>> s='\u012b'
>>> s
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Python30\lib\io.py", line 1428, in write
    b = encoder.encode(s)
  File "C:\Program Files\Python30\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u012b' in position 1: character maps to <undefined>
Most likely because Python figured out that the terminal expects cp437.
What does `sys.stdout.encoding` say?
The interpreter in the command prompt window says CP437.
The IDLE Window says 'cp1252', and it handles the character fine.
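
For reference, a minimal sketch of how the two values compared above can be printed side by side. sys.getdefaultencoding() is the default used for str/bytes conversions, while sys.stdout.encoding is what is used when text is written to that stream, so the two can differ. The helper name describe_encodings is hypothetical, chosen only for this illustration.

```python
import sys

def describe_encodings(stream=None):
    """Return the interpreter-wide default and the encoding of one stream."""
    if stream is None:
        stream = sys.stdout
    return {
        "default": sys.getdefaultencoding(),
        # Falls back to None if the object has no 'encoding' attribute.
        "stream": getattr(stream, "encoding", None),
    }

# The values depend on where the interpreter runs (console, IDLE, pipe).
print(describe_encodings())
```

Running this in the command prompt and again in IDLE should show the same "default" but a different "stream" value, which is the mismatch described above.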
Given that Windows OS can handle the character, why is Python/Command
Prompt limiting output?
Characters the IDLE window cannot display (like surrogate pairs) it
displays as boxes. But if I cut '[][]' (4 chars) and paste into
Firefox, I get 3 chars, shown as '[]' where the box has some digits
instead of being empty. It is really confusing when every window on
'unicode-based' Windows handles a different subset. Is this the fault of
Windows or of Python and IDLE (those two being more limited than Firefox)?
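
As a side note on why character counts can disagree between programs: a character outside the BMP takes two UTF-16 code units (a surrogate pair), so software that counts code units and software that counts code points report different lengths. A small sketch, assuming a current Python 3 where len() counts code points; the helper name utf16_code_units and the sample character are only illustrative and are not taken from the session above.

```python
def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    # utf-16-le writes no byte-order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2

bmp_char = "\u012b"         # inside the BMP: one code unit
astral_char = "\U0001D11E"  # outside the BMP: stored as a surrogate pair

for ch in (bmp_char, astral_char):
    print(ascii(ch), "code points:", len(ch), "UTF-16 units:", utf16_code_units(ch))
```

This does not explain the exact 4-versus-3 count seen above, only why such counts need not agree.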
To put it another way, how can one 'choose' utf-8 for display to screen?
If the terminal expects cp437 then displaying utf-8 might give some
problems.
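
A rough sketch of the kind of problem meant here, and of one possible workaround. The first function simulates a terminal that decodes cp437 being handed UTF-8 bytes; the second escapes whatever the target encoding cannot represent instead of raising. Both function names are hypothetical, and "backslashreplace" is only one of several error handlers that could be used.

```python
def as_seen_by_terminal(text, sent_as="utf-8", terminal="cp437"):
    """Show what a terminal decoding `terminal` makes of bytes sent as `sent_as`."""
    return text.encode(sent_as).decode(terminal, errors="replace")

def printable_for(text, encoding):
    """Swap characters the target encoding lacks for backslash escapes."""
    return text.encode(encoding, errors="backslashreplace").decode(encoding)

s = "\u012b"
# Two unrelated cp437 characters appear instead of the one intended.
print(ascii(as_seen_by_terminal(s)))
# The escaped form survives any cp437 terminal without an exception.
print(printable_for(s, "cp437"))
```

The escaped output is not pretty, but it avoids the UnicodeEncodeError from the traceback above while still showing which character was meant.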
My screen displays whatever Windows tells the graphics card to tell the
screen to display. In OpenOffice, I can select a Unicode font that
displays at least everything in the Basic Multilingual Plane (BMP).
Terry Jan Reedy