[issue13395] Python ISO-8859-1 encoding problem

2011-11-13 Thread Hugo Silva

New submission from Hugo Silva hugo...@gmail.com:

Hi all,

I'm facing a huge encoding problem in Python when dealing with ISO-8859-1 / 
Latin-1 character set.

When I use os.listdir to get the contents of a folder, the strings come back 
encoded in ISO-8859-1 (e.g. 'Ol\xe1 Mundo'). In the Python interpreter, however, 
the same string is encoded in a different charset:

In : 'Olá Mundo'.decode('latin-1')
Out: u'Ol\xa0 Mundo'

How can I force Python to decode the string to the same format? As far as I can 
see, os.listdir returns the strings correctly encoded but the interpreter does 
not ('á' corresponds to '\xe1' in ISO-8859-1, not to '\xa0'):

http://en.wikipedia.org/wiki/ISO/IEC_8859-1

This is happening 

Any thoughts on how to overcome this?

Regards,

--
components: Unicode
messages: 147552
nosy: Hugo.Silva, ezio.melotti
priority: normal
severity: normal
status: open
title: Python ISO-8859-1 encoding problem
versions: Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13395
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue13395] Python ISO-8859-1 encoding problem

2011-11-13 Thread Ezio Melotti

Ezio Melotti ezio.melo...@gmail.com added the comment:

This doesn't look like a bug to me, so you should ask for help somewhere else.
You can try passing a unicode argument to listdir to get unicode back, and 
double-check what the input actually is.
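
A minimal sketch of the listdir suggestion, written for Python 3 as an assumption (the report concerns Python 2.7, where passing a unicode path makes os.listdir return unicode names). In Python 3 the analogous rule is that a str path yields str names and a bytes path yields bytes names. The temporary directory and the file name below are made up for illustration.

```python
import os
import tempfile

# Hypothetical file name, chosen only because it contains a non-ASCII character.
FILE_NAME = "Ol\u00e1 Mundo.txt"

with tempfile.TemporaryDirectory() as folder:
    # Create an empty file so the listing has something to return.
    open(os.path.join(folder, FILE_NAME), "w").close()

    text_names = os.listdir(folder)               # str path -> list of str
    byte_names = os.listdir(os.fsencode(folder))  # bytes path -> list of bytes

# Printing the repr shows exactly what each name contains.
print(repr(text_names[0]), repr(byte_names[0]))
```

Working with the text form avoids guessing which encoding the raw bytes are in; the exact bytes in the second listing depend on the filesystem encoding of the machine.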

--
resolution:  -> invalid
stage:  -> committed/rejected
status: open -> closed




[issue13395] Python ISO-8859-1 encoding problem

2011-11-13 Thread Martin v. Löwis

Martin v. Löwis mar...@v.loewis.de added the comment:

Apparently, you are using the interactive shell on Microsoft Windows. The 
console uses the OEM code page; which one that is depends on the exact Windows 
regional version you are using.

You shouldn't decode the string with 'latin-1', but with sys.stdin.encoding. 
Alternatively, you should use Unicode string literals in Python in the first 
place.
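
A small sketch of why the reported output contains '\xa0', under the assumption that the console's OEM code page is cp850 (the thread does not say which one is in use; cp850 is merely consistent with the output shown). It is written for Python 3, where bytes objects stand in for Python 2.7 byte strings.

```python
# Assumption: the console hands Python the typed text encoded in cp850.
typed = "Olá Mundo".encode("cp850")
print(typed)  # the byte for 'á' is 0xA0 in cp850

# Decoding those bytes as latin-1 maps 0xA0 to U+00A0 (no-break space).
wrong = typed.decode("latin-1")

# Decoding with the encoding the console actually used recovers the text.
right = typed.decode("cp850")

# The name from os.listdir in the report was latin-1 bytes: 0xE1 is 'á' there.
from_listdir = b"Ol\xe1 Mundo".decode("latin-1")

print(repr(wrong), repr(right), right == from_listdir)
```

In a real session the hard-coded "cp850" would be replaced by sys.stdin.encoding, as suggested above.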

In any case, Ezio is right: this is not a help forum, but a bug tracker.

--
nosy: +loewis
