[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?

2007-06-02 Thread Michael Bayer


On Jun 2, 2007, at 8:22 AM, Matt Culbreth wrote:


 Howdy All,

 I've got some existing code that I'm trying on a new server.  The code
 was formerly running with Python 2.4 and SA 0.3.6, but this new server
 is running Python 2.5 and SA 0.3.7.

 Anyway, I've got a small program which is loading a PostgreSQL 8.2 db
 from a CSV file, and I'm getting this exception:

 sqlalchemy.exceptions.SQLError: (ProgrammingError) invalid byte
 sequence for encoding UTF8: 0xe16e69
 HINT:  This error can also happen if the byte sequence does not match
 the encoding expected by the server, which is controlled by
 client_encoding.

 The particular (fake, generated) set of data doing this is shown
 here.  It looks like that first element (city) is encoded as something
 other than latin1:

 {'city': 'Gu\xe1nica', 'first_name': 'Patricia', 'last_name':
 'Wagner', 'zip': '25756', 'phone': '490.749.6157', 'state': 'KS',
 'annual_salary': '72333', 'broker_id': 452L, 'date_hired':
 datetime.date(2004, 1, 1), 'address': 'P.O. Box 815, 6723 Eget, Ave',
 'commission_percentage': 0.080120064101811897}

 Has anybody seen this?  Do I need to do a convert_unicode or anything
 like that?

If you're parsing from the CSV file, your best bet is to parse the
data into Python unicode objects using the expected encoding of the
file; that will detect any text in the file that's not in the
expected encoding. Then just use the DB with convert_unicode=True,
either on create_engine() or within the individual String() types
(String(convert_unicode=True) is now equivalent to Unicode()).
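
A minimal sketch of the decoding step described above, written for a current Python 3 interpreter rather than the Python 2.5 used in this thread. The sample bytes, the assumed Latin-1 file encoding, and the helper name read_rows are illustrative assumptions, not taken from the original program. The byte 0xe1 is "á" in Latin-1, which would explain why the server rejects the sequence 0xe16e69 as UTF8.

```python
import csv
import io

# Hypothetical sample data: one CSV row encoded as Latin-1 (an assumption
# suggested by the 0xe1 byte in the error message).
RAW_CSV = "city,state\nGu\u00e1nica,KS\n".encode("latin-1")


def read_rows(raw_bytes, encoding):
    """Decode raw bytes with the expected file encoding, then parse as CSV.

    Raises UnicodeDecodeError if the bytes are not valid in that encoding,
    which is how badly encoded text gets detected before it reaches the DB.
    """
    text = raw_bytes.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


# Decoding with the file's real encoding yields proper unicode text.
rows = read_rows(RAW_CSV, "latin-1")

# Re-encoding that text as UTF-8 gives the bytes a UTF8 database expects.
utf8_city = rows[0]["city"].encode("utf-8")
```

Calling read_rows(RAW_CSV, "utf-8") on the same bytes raises UnicodeDecodeError, so a wrong guess about the file encoding fails at parse time instead of at INSERT time.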
  

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---



[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?

2007-06-02 Thread Matt Culbreth

Thanks Michael, I'll do this.

When I change the model's column types to Unicode() I still get the
same type in the DB--character varying(100).  I'm assuming that's
correct?  The DB is using a UTF8 encoding.




[sqlalchemy] Re: Anybody seen--Exception: invalid byte sequence for encoding UTF8?

2007-06-02 Thread Michael Bayer


On Jun 2, 2007, at 10:58 AM, Matt Culbreth wrote:


 Thanks Michael, I'll do this.

 When I change the model's column types to Unicode() I still get the
 same type in the DB--character varying(100).  I'm assuming that's
 correct?  The DB is using a UTF8 encoding.


Yes. convert_unicode means an incoming u'' object, on the bind
parameter side, will be encoded to a UTF-8 string before it's sent to
the database, which is what your DB is expecting.
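
A conceptual illustration of that bind-parameter behaviour, in Python 3 terms (where str plays the role of the u'' object). The function encode_bind_param is a hypothetical stand-in and is not SQLAlchemy's actual implementation; it only shows the idea of encoding text values on the way to the database and leaving everything else alone.

```python
def encode_bind_param(value, encoding="utf-8"):
    """Hypothetical stand-in for what convert_unicode does on the bind side.

    Text values are encoded to bytes in the target encoding; any other
    value (numbers, dates, already-encoded bytes) passes through unchanged.
    """
    if isinstance(value, str):
        return value.encode(encoding)
    return value


# The city from the failing row, once held as unicode text, becomes
# valid UTF-8 bytes; the integer broker_id is left as it is.
encoded_city = encode_bind_param("Gu\u00e1nica")
broker_id = encode_bind_param(452)
```

The column type in the database does not change because of this; only the bytes sent for each parameter do.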


