[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?
On Jun 2, 2007, at 8:22 AM, Matt Culbreth wrote:

> Howdy All,
>
> I've got some existing code that I'm trying on a new server. The code
> was formerly running with Python 2.4 and SA 0.36, but this new server
> is running Python 2.5 and SA 0.37.
>
> Anyway, I've got a small program which is loading a PostgreSQL 8.2 db
> from a CSV file, and I'm getting this exception:
>
> sqlalchemy.exceptions.SQLError: (ProgrammingError) invalid byte
> sequence for encoding UTF8: 0xe16e69
> HINT: This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> client_encoding.
>
> The particular (fake, generated) set of data doing this is shown here.
> It looks like that first element (city) is encoded as something other
> than latin1:
>
> {'city': 'Gu\xe1nica', 'first_name': 'Patricia',
>  'last_name': 'Wagner', 'zip': '25756', 'phone': '490.749.6157',
>  'state': 'KS', 'annual_salary': '72333', 'broker_id': 452L,
>  'date_hired': datetime.date(2004, 1, 1),
>  'address': 'P.O. Box 815, 6723 Eget, Ave',
>  'commission_percentage': 0.080120064101811897}
>
> Has anybody seen this? Do I need to do a convert_unicode or anything
> like that?

If you're parsing from the CSV file, your best bet is to parse the data into Python unicode objects using the expected encoding of the file; that will detect any text in the file that's not in the expected encoding. Then just use the DB with convert_unicode=True, either on create_engine() or within the individual String() types (String(convert_unicode=True) is now equivalent to Unicode()).

You received this message because you are subscribed to the Google Groups sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en
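As a rough illustration of the advice above (not code from the thread), here is a minimal Python 3 sketch of decoding CSV input with an explicit encoding, so that bytes that don't match fail at parse time rather than at the database. The sample row, the helper name `read_rows`, and the choice of Latin-1 as the file's real encoding are assumptions made for the example.

```python
import csv
import io

# Sample CSV bytes; 0xE1 is "á" in Latin-1 but is not valid UTF-8 here.
# (Assumed sample data, modelled on the 'Gu\xe1nica' value in the thread.)
RAW = b"city,state\nGu\xe1nica,KS\n"


def read_rows(raw, encoding):
    """Decode raw CSV bytes with the given encoding and return dict rows.

    Decoding is strict, so bytes that are invalid in `encoding` raise
    UnicodeDecodeError instead of being passed on silently.
    """
    text = raw.decode(encoding)  # errors="strict" is the default
    return list(csv.DictReader(io.StringIO(text)))


# Decoding as UTF-8 fails up front, before any database is involved.
try:
    read_rows(RAW, "utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the encoding the file was (assumed to be) written in works.
rows = read_rows(RAW, "latin-1")
print(utf8_ok, rows[0]["city"])
```

Note that Latin-1 accepts every byte value, so a strict decode only catches mismatches when the expected encoding is a stricter one such as UTF-8.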
[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?
Thanks Michael, I'll do this.

When I change the model's column types to Unicode() I still get the same type in the DB--character varying(100). I'm assuming that's correct? The DB is using a UTF8 encoding.

On Jun 2, 9:53 am, Michael Bayer wrote:

> If you're parsing from the CSV file, your best bet is to parse the
> data into Python unicode objects using the expected encoding of the
> file; that will detect any text in the file that's not in the expected
> encoding. Then just use the DB with convert_unicode=True, either on
> create_engine() or within the individual String() types
> (String(convert_unicode=True) is now equivalent to Unicode()).
[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?
On Jun 2, 2007, at 10:58 AM, Matt Culbreth wrote:

> Thanks Michael, I'll do this.
>
> When I change the model's column types to Unicode() I still get the
> same type in the DB--character varying(100). I'm assuming that's
> correct? The DB is using a UTF8 encoding.

Yes, convert_unicode means that an incoming u'' object, on the bind parameter side, will be encoded to a UTF-8 string before it's sent to the database, which is what your DB is expecting.
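To make the bind-side step concrete, here is a small Python 3 sketch of the encoding that the reply describes. It does not call SQLAlchemy; `encode_bind_param` is a hypothetical helper written for this illustration, and the UTF-8 target is taken from the thread's statement that the database uses UTF8.

```python
def encode_bind_param(value, encoding="utf-8"):
    """Encode a text value to bytes, leaving other values untouched.

    Hypothetical helper: it mimics the idea of encoding text on the
    bind-parameter side, not SQLAlchemy's actual implementation.
    """
    if isinstance(value, str):
        return value.encode(encoding)
    return value


city = "Gu\u00e1nica"               # text object, "Guánica"
encoded = encode_bind_param(city)

# "á" becomes the two-byte UTF-8 sequence 0xC3 0xA1, not the single
# Latin-1 byte 0xE1 that the server rejected in the original error.
print(encoded)                      # b'Gu\xc3\xa1nica'
print(encode_bind_param(72333))     # non-text values pass through: 72333
```

The column type in the database is unaffected by this step, since only the bytes sent over the connection change.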