On Fri, 13 Mar 2009 14:24:52 +0100, Peter Otten <__pete...@web.de> wrote:
>It seems the database gives you the strings as unicode. When a unicode
>string is printed python tries to encode it using sys.stdout.encoding
>before writing it to stdout. As you run your script on the windows command
>line that encoding seems to be cp437. Unfortunately your database contains
>characters that cannot be expressed in that encoding.
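
To make the point above concrete, here is a minimal sketch of what happens when a unicode string containing a character that cp437 lacks is encoded for the console. The sample string and the helper name safe_for_console are made up for illustration; only the cp437 code page comes from the discussion. Passing "replace" as the error handler is one possible workaround for display purposes, not necessarily the best one.

```python
def safe_for_console(text, encoding="cp437"):
    """Encode text for display, substituting '?' for unencodable characters."""
    return text.encode(encoding, "replace")

# Hypothetical sample: cp437 has an e-acute but no euro sign.
sample = u"Caf\u00e9 10\u20ac"

try:
    # Strict encoding fails on the euro sign; printing to a cp437
    # console runs into the same error.
    sample.encode("cp437")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Lossy but safe: the euro sign becomes '?'.
fallback = safe_for_console(sample)
```

The data itself is untouched; only the bytes sent to the console lose the characters the code page cannot represent.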
Many thanks for the help :) I hadn't thought about the code page used to
display data in the DOS box in XP.

It turns out that the HTML page from which I was trying to extract data
using regexes was encoded in ISO-8859-1 instead of UTF-8. The SQLite wrapper
expects Unicode only, so it had a problem with some characters.

For those interested, here's how I solved it, although there's likely a
smarter way to do it:

============
data = re_data.search(response)
if data:
    name = data.group(1).strip()
    address = data.group(2).strip()

    # The page declares content="text/html; charset=iso-8859-1">,
    # so decode the byte strings to unicode before handing them to SQLite
    name = name.decode('iso8859-1')
    address = address.decode('iso8859-1')

    sql = 'BEGIN;'
    sql = sql + 'UPDATE companies SET name=?,address=? WHERE id=?;'
    sql = sql + "COMMIT"
    try:
        cursor.execute(sql, (name,address,id) )
    except:
        print "Failed UPDATING"
        raise
else:
    print "Pattern not found"
============

Thanks again.
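
For anyone who wants to try the decode-then-bind idea in isolation, here is a minimal self-contained sketch. It is not the exact code above. It assumes the standard-library sqlite3 module rather than the wrapper I used, an in-memory database, and a made-up companies schema and sample values. Since sqlite3's execute() accepts only one statement per call, the sketch lets the connection handle the commit instead of sending BEGIN and COMMIT in the same string.

```python
import sqlite3

def update_company(conn, company_id, raw_name, raw_address,
                   charset="iso-8859-1"):
    """Decode scraped byte strings, then update one row via bound parameters."""
    name = raw_name.decode(charset).strip()
    address = raw_address.decode(charset).strip()
    # "with conn" commits on success and rolls back if execute() raises.
    with conn:
        conn.execute(
            "UPDATE companies SET name=?, address=? WHERE id=?",
            (name, address, company_id),
        )

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute("INSERT INTO companies (id) VALUES (1)")

# Byte strings as they would come out of an ISO-8859-1 encoded page.
update_company(conn, 1, b"Caf\xe9 M\xfcller ", b" 1 rue de l'\xc9glise")

row = conn.execute(
    "SELECT name, address FROM companies WHERE id=1"
).fetchone()
```

Decoding at the point where the bytes enter the program means everything after that deals with unicode only, which is what the wrapper wanted in the first place.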