Gilles> ======
Gilles> m = try.search(the_page)
Gilles> if m:
Gilles>     #UnicodeEncodeError: 'charmap' codec can't encode characters in
Gilles>     position 49-55: character maps to <undefined>
Gilles>     title = m.group(1).decode('shift_jis').strip()
Gilles> ======
Gilles> Has someone successfully accessed Shift-JIS-encoded Japanese
Gilles> contents with Python?

Have you verified that the characters in position 49-55 are actually
Shift-JIS characters?  In my experience, problems decoding a source string in
any given character set are because of errors in the source, not errors in
Python.

OTOH, the characters in position 49-55 look like plain old ASCII to me.  Does
Shift-JIS have ASCII as a proper subset?

-- 
Skip Montanaro - [EMAIL PROTECTED] - http://smontanaro.dyndns.org/
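
One way to run the check suggested above is sketched here. It assumes Python 3 and that the page is available as raw bytes; check_shift_jis and the sample byte strings are hypothetical names and data made up for illustration, not anything from the original script. It tries to decode the bytes as Shift-JIS and hands back the exception, which records the offending position, when the decode fails. It also shows that Python's shift_jis codec decodes plain ASCII letters to themselves.

```python
def check_shift_jis(raw):
    """Try to decode raw bytes as Shift-JIS.

    Returns (text, None) on success, or (None, error) when the bytes
    are not valid Shift-JIS. The error carries .start and .end offsets
    that point at the bytes the codec rejected.
    """
    try:
        return raw.decode("shift_jis"), None
    except UnicodeDecodeError as exc:
        return None, exc


# Hypothetical sample: Japanese text wrapped in ASCII markup,
# encoded as Shift-JIS the way a server might send it.
sample = "<title>日本語</title>".encode("shift_jis")
text, err = check_shift_jis(sample)

# Hypothetical damaged input: 0xFF is rejected by Python's shift_jis codec.
bad_text, bad_err = check_shift_jis(b"abc\xff")
if bad_err is not None:
    # ascii() keeps the report printable on consoles with a narrow codepage.
    print("bad bytes at", bad_err.start, "to", bad_err.end,
          ascii(bad_err.object[bad_err.start:bad_err.end]))

# Plain ASCII letters decode to themselves under Python's shift_jis codec.
ascii_ok = b"plain old ASCII".decode("shift_jis") == "plain old ASCII"
```

If the decode succeeds and an error still shows up later, the failure is more likely in a later encode step (for example, printing to a console whose codepage cannot represent the characters) than in the Shift-JIS data itself.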