On Fri, 14 Nov 2008 14:57:42 +0100, Gilles Ganault wrote:
> On Fri, 14 Nov 2008 11:01:27 +0100, "Martin v. Löwis" wrote:
>>Add
>> print type(output)
>>here. If it says "unicode", reconsider the next line
>>
>>> print output.decode('utf-8')
>
> In case the string fetched from a web page turns out not to be Unicode
> and Python isn't happy, what is the right way to handle this, know what
> codepage is being used?

How do you fetch the data? If you simply download it with `urllib` or
`urllib2` you never get `unicode` but ordinary `str`\s. Then you have to
figure out the encoding by looking at the headers sent by the server
and/or by looking for hints in the fetched data itself.
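
A rough sketch of that lookup, in case it helps. The function names are made up for illustration, the regular expressions are a simplification rather than a full header or HTML parser, and the UTF-8 fallback is only an assumption. It needs Python 2.6 or later because of the ``br''`` literal:

```python
import codecs
import re

# Simplified patterns: they only look for "charset=<name>".
_HEADER_RE = re.compile(r'charset\s*=\s*["\']?([\w.:-]+)', re.I)
_BODY_RE = re.compile(br'charset\s*=\s*["\']?([\w.:-]+)', re.I)


def charset_from_content_type(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    match = _HEADER_RE.search(content_type or '')
    return match.group(1).lower() if match else None


def charset_from_body(raw, limit=2048):
    """Look for a charset hint (e.g. a meta tag) near the start of the bytes."""
    match = _BODY_RE.search(raw[:limit])
    return match.group(1).decode('ascii').lower() if match else None


def decode_page(raw, content_type=None, fallback='utf-8'):
    """Decode fetched bytes: header first, then hints in the data, then fallback."""
    encoding = (charset_from_content_type(content_type)
                or charset_from_body(raw)
                or fallback)
    try:
        codecs.lookup(encoding)      # the server may name an unknown codec
    except LookupError:
        encoding = fallback
    return raw.decode(encoding, 'replace')
```

You would pass in the bytes you read from the response together with the value of its ``Content-Type`` header. The header wins over hints in the data here, which is a design choice, not a rule.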

And when ``print``\ing you should explicitly *encode* the data again
because sooner or later you will come across a `stdout` where Python
can't determine what encoding the process at the other end expects, for
example if output is redirected to a file.
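
Something along these lines, as a minimal sketch. The helper name is made up, and falling back to UTF-8 is just an assumption; pick whatever the consumer of your output really expects:

```python
import sys


def encode_for_stream(text, stream=None, fallback='utf-8'):
    """Encode unicode text for a stream, even if the stream has no known encoding."""
    if stream is None:
        stream = sys.stdout
    # In Python 2, `encoding` is None when output goes to a pipe or file.
    encoding = getattr(stream, 'encoding', None) or fallback
    return text.encode(encoding, 'replace')

# Python 2 usage, with `output` being a `unicode` object:
#     print encode_for_stream(output)
```

The ``'replace'`` error handler turns characters the target encoding can't represent into ``?`` instead of raising an exception.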

Ciao,
Marc 'BlackJack' Rintsch