[issue13395] Python ISO-8859-1 encoding problem
New submission from Hugo Silva hugo...@gmail.com:

Hi all,

I'm facing a huge encoding problem in Python when dealing with the ISO-8859-1 / Latin-1 character set. When I use os.listdir to get the contents of a folder, the strings come back encoded in ISO-8859-1 (e.g. 'Ol\xe1 Mundo'). However, in the Python interpreter the same string is encoded in a different charset:

    In : 'Olá Mundo'.decode('latin-1')
    Out: u'Ol\xa0 Mundo'

How can I force Python to decode the string to the same format? I've seen that os.listdir returns the strings correctly encoded, but the interpreter does not (the 'á' character corresponds to '\xe1' in ISO-8859-1, not to '\xa0'):

http://en.wikipedia.org/wiki/ISO/IEC_8859-1

This is happening

Any thoughts on how to overcome this?

Regards,

--
components: Unicode
messages: 147552
nosy: Hugo.Silva, ezio.melotti
priority: normal
severity: normal
status: open
title: Python ISO-8859-1 encoding problem
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13395
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13395] Python ISO-8859-1 encoding problem
Ezio Melotti ezio.melo...@gmail.com added the comment:

This doesn't seem to be a bug to me, so you should ask for help somewhere else. You can try passing a unicode argument to listdir to get unicode back, and double-check what the input actually is.

--
resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed
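
The suggestion to pass a unicode argument to listdir could look roughly like the sketch below. The helper name listdir_text is hypothetical (it is not part of the os module), and the "ascii" fallback is an assumption for the case where sys.getfilesystemencoding() returns None. On Python 2.7, os.listdir returns unicode names when given a unicode path; on Python 3 a str path already yields str names.

```python
import os
import sys


def listdir_text(path):
    """Return directory entries as text (unicode) strings.

    Hypothetical helper: it only makes sure the path handed to
    os.listdir is a text string, so the names come back as text.
    """
    if isinstance(path, bytes):
        # Byte path (a plain 'str' on Python 2): decode it first.
        # The "ascii" fallback is an assumption for platforms where
        # no filesystem encoding is reported.
        encoding = sys.getfilesystemencoding() or "ascii"
        path = path.decode(encoding)
    return os.listdir(path)
```

With text names in hand, there is no need to guess a byte encoding when comparing them with other unicode strings, for example u'Ol\xe1 Mundo'.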
[issue13395] Python ISO-8859-1 encoding problem
Martin v. Löwis mar...@v.loewis.de added the comment:

Apparently, you are using the interactive shell on Microsoft Windows. The shell uses the OEM code page; which one that is depends on the exact Windows regional version you are using. You shouldn't decode the string with 'latin-1', but with sys.stdin.encoding. Alternatively, you should use Unicode string literals in Python in the first place.

In any case, Ezio is right: this is not a help forum, but a bug tracker.

--
nosy: +loewis
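
A small sketch of the mismatch described above. It assumes the console's OEM code page is cp850 (the thread does not say which one was in use); in that code page 'á' is the byte 0xA0, which is why decoding console input as latin-1 produces u'\xa0' (a no-break space). The helper decode_console_input is hypothetical and only illustrates decoding with sys.stdin.encoding.

```python
import sys

text = u"Ol\xe1 Mundo"  # 'Olá Mundo' as a Unicode string

# Bytes a console using an OEM code page would hand to Python
# (cp850 is an assumption, not stated in the report).
console_bytes = text.encode("cp850")

# Bytes as they appear in the os.listdir output from the report.
latin1_bytes = text.encode("latin-1")

# Decoding the console bytes with the wrong codec turns 0xA0 into
# U+00A0 (no-break space); the matching codec recovers the text.
wrong = console_bytes.decode("latin-1")
right = console_bytes.decode("cp850")


def decode_console_input(raw, encoding=None):
    """Decode bytes typed at the console (hypothetical helper).

    Uses sys.stdin.encoding unless an encoding is given; the cp850
    fallback is an assumption for when stdin reports no encoding.
    """
    if encoding is None:
        encoding = getattr(sys.stdin, "encoding", None) or "cp850"
    return raw.decode(encoding)
```

Using a Unicode literal such as u'Ol\xe1 Mundo' sidesteps the question, since no console bytes have to be decoded at all.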