Re: [2.5.1] ShiftJIS to Unicode?

MRAB Wed, 26 Nov 2008 17:01:58 -0800

Gilles Ganault wrote:

Hello


I'm trying to read pages from Amazon JP, whose web pages are
supposed to be encoded in ShiftJIS, and decode contents into Unicode
to keep Python happy:

www.amazon.co.jp
<meta http-equiv="content-type" content="text/html; charset=Shift_JIS" />

But this doesn't work:

======
m = try.search(the_page)


How can you have a name "try"? It's a reserved word!

if m:
        # UnicodeEncodeError: 'charmap' codec can't encode characters in position 49-55: character maps to <undefined>
        title = m.group(1).decode('shift_jis').strip()
======

Has someone successfully accessed Shift-JIS-encoded Japanese contents
with Python?

No problem here:

>>> import urllib
>>> data = urllib.urlopen("http://www.amazon.co.jp/";).read()
>>> decoded_data = data.decode("shift-jis")
>>>
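
For comparison with the snippet quoted above, here is a minimal sketch of the same idea applied to the title search: decode the whole page once, then run the regular expression on the resulting Unicode text. The sample page bytes, the pattern, and the name "title_re" are assumptions made up for illustration (the original pattern was not posted), and the sketch deliberately avoids printing, on the guess that a 'charmap' encode error comes from writing to a console that cannot represent Japanese rather than from the decode step itself.

```python
import re

# Hypothetical stand-in for the bytes returned by urlopen(...).read();
# a real page would be fetched from the network instead.
sample_text = u"<html><head><title> \u65e5\u672c\u8a9e </title></head></html>"
the_page = sample_text.encode("shift_jis")

# Decode the raw bytes once, up front, so everything after this is Unicode.
decoded_page = the_page.decode("shift_jis")

# "title_re" is an illustrative name; "try" cannot be used because it is
# a reserved word. The pattern is an assumption, not the original poster's.
title_re = re.compile(u"<title>(.*?)</title>", re.S)

m = title_re.search(decoded_page)
if m:
    # m.group(1) is already Unicode here, so no further decode is needed.
    title = m.group(1).strip()
else:
    title = None
```

Searching the decoded text means the match groups never need a second decode call, which keeps byte strings and Unicode strings from being mixed later on.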
--
http://mail.python.org/mailman/listinfo/python-list
