
[sqlalchemy] Re: Multiple encodings in my database

2008-06-27 Thread Hermann Himmelbauer


On Friday, 27 June 2008 01:20, Michael Bayer wrote:
 first of all, the stack trace suggests you have not set the encoding
 parameter on create_engine() as it's still using UTF-8.

 If you mean that a single database column may have different encodings
 in different rows, you want to do your own encoding/decoding with
 encoding errors set to something liberal like "ignore".  You also
 need to use your own custom type, as below:

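 from sqlalchemy import types

 class MyEncodedType(types.TypeDecorator):
     # Underlying column type; String lives in the imported types module.
     impl = types.String

     def process_bind_param(self, value, dialect):
         # Encode unicode values before they are sent to the database.
         assert isinstance(value, unicode)
         return value.encode('latin-1')

     def process_result_value(self, value, dialect):
         # errors='ignore' drops undecodable bytes instead of raising.
         return value.decode('latin-1', 'ignore')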

 then use MyEncodedType() as the type for all your columns which
 contain random encoding.   No convert_unicode setting should be used
 on your engine as this type replaces that usage.

Perfect, that works, thanks!

Best Regards,
Hermann

-- 
[EMAIL PROTECTED]
GPG key ID: 299893C7 (on keyservers)
FP: 0124 2584 8809 EF2A DBF9  4902 64B4 D16B 2998 93C7

--~--~-~--~~~---~--~~
You received this message because you are subscribed to the Google Groups 
sqlalchemy group.
To post to this group, send email to sqlalchemy@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/sqlalchemy?hl=en
-~--~~~~--~~--~--~---

[sqlalchemy] Re: Multiple encodings in my database

2008-06-27 Thread Bobby Impollonia


If I am using the MySQL-specific Column constructs with the charset
option, will things be automatically encoded/decoded by SA using that
charset? Or is the charset option only used for CREATE TABLE?



[sqlalchemy] Re: Multiple encodings in my database

2008-06-27 Thread jason kirtland


my understanding is that mysql works a little differently here.  the
column-level character sets are storage encodings only.  all data to and
from the database is encoded in the database connection's configured
encoding.  that can either be left as-is or converted to Unicode for you.
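
To illustrate why the connection encoding is the one that matters on the client side, here is a small database-free sketch in Python 3. It is not MySQL-specific and the variable names are made up for the example: it only shows what happens when the bytes arriving over a connection are decoded with an encoding other than the one the connection actually used.

```python
# Python 3 sketch; no database involved, names are illustrative only.
text = "Gr\u00fc\u00dfe"                  # "Gruesse" written with u-umlaut and sharp s

# Pretend the connection is configured for UTF-8: these bytes go over the wire.
wire_bytes = text.encode("utf-8")

right = wire_bytes.decode("utf-8")        # client agrees with the connection
wrong = wire_bytes.decode("latin-1")      # client assumes latin-1 instead

print(right == text)                      # the round trip is lossless
print(wrong == text)                      # mojibake: two characters per non-ASCII letter
print(len(text), len(wrong))

# The damage is reversible as long as nothing was dropped along the way.
print(wrong.encode("latin-1").decode("utf-8") == text)
```

The column's storage character set never appears in this picture; only the encoding agreed on for the connection does.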

Bobby Impollonia wrote:
 If I am using the MySQL-specific Column constructs with the charset
 option, will things be automatically encoded/decoded by SA using that
 charset? Or is the charset option only used for CREATE TABLE?
 

[sqlalchemy] Re: Multiple encodings in my database

2008-06-26 Thread Michael Bayer


first of all, the stack trace suggests you have not set the encoding  
parameter on create_engine() as it's still using UTF-8.

If you mean that a single database column may have different encodings  
in different rows, you want to do your own encoding/decoding with  
encoding errors set to something liberal like "ignore".  You also
need to use your own custom type, as below:

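from sqlalchemy import types

class MyEncodedType(types.TypeDecorator):
    # Underlying column type; String lives in the imported types module.
    impl = types.String

    def process_bind_param(self, value, dialect):
        # Encode unicode values before they are sent to the database.
        assert isinstance(value, unicode)
        return value.encode('latin-1')

    def process_result_value(self, value, dialect):
        # errors='ignore' drops undecodable bytes instead of raising.
        return value.decode('latin-1', 'ignore')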

then use MyEncodedType() as the type for all your columns which  
contain random encoding.   No convert_unicode setting should be used  
on your engine as this type replaces that usage.
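
The decoding half of this advice can be tried without a database. The following is a minimal Python 3 sketch, not code from SQLAlchemy: the helper name decode_leniently and the candidate list are assumptions made for illustration. It is also a variation on the type above rather than the same thing: it first tries a few strict decodings and only then falls back to latin-1, which assigns a code point to every byte value and therefore cannot raise.

```python
# Python 3 sketch; helper name and candidate encodings are illustrative assumptions.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252")


def decode_leniently(raw, candidates=CANDIDATE_ENCODINGS):
    """Return text for raw bytes without ever raising UnicodeDecodeError."""
    if raw is None:
        # NULL column values pass through unchanged.
        return None
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte 0x00-0xFF to a code point, so this cannot fail.
    return raw.decode("latin-1")


samples = [
    b"plain ascii",
    "Stra\u00dfe".encode("utf-8"),   # valid UTF-8
    b"caf\xe9",                      # not valid UTF-8, valid cp1252
    b"\x81\xff",                     # 0x81 is undefined in cp1252
]
for raw in samples:
    print(repr(decode_leniently(raw)))
```

A function like this could be called from process_result_value() in place of the single latin-1 decode. Whether the guessed characters are the right ones is a separate question; the sketch only guarantees that reading a row never crashes.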



On Jun 26, 2008, at 6:55 PM, Hermann Himmelbauer wrote:


 Hi,
 I'm trying to access a database via SA, which contains varchars with
 different, arbitrary encodings. Most of them are ASCII or ISO-8859-2
 encoded; however, many are windows-1252 encoded and there are also some
 other weird ones.

 In my engine setup, I set the encoding to latin1 and set convert_unicode
 to True, as my application requires the database values in unicode
 format.

 If SA now tries to retrieve such a key, the following traceback occurs:

 --
   File "/home/dusty/prog/python_modules/sqlalchemy/engine/base.py", line 1605, in _get_col
     return processor(row[index])
   File "/home/dusty/prog/python_modules/sqlalchemy/databases/maxdb.py", line 112, in process
     return value.decode(dialect.encoding)
   File "/local/home/dusty/python/Python-2.4.4/lib/python2.4/encodings/utf_8.py", line 16, in decode
     return codecs.utf_8_decode(input, errors, True)
 UnicodeDecodeError: 'utf8' codec can't decode bytes in position 3-6: invalid data
 -

 What can I do? It's not so important that all characters are correctly
 displayed, but it's vital that such improper encodings do not crash my
 application. Perhaps there's some universal encoding that is able to
 deal with such problems?

 Best Regards,
 Hermann

 -- 
 [EMAIL PROTECTED]
 GPG key ID: 299893C7 (on keyservers)
 FP: 0124 2584 8809 EF2A DBF9  4902 64B4 D16B 2998 93C7

 


