a couple of things:

- where's the patch? I thought I had put it in, but it seems not. Let's put in a Trac ticket for it.

- may I suggest that, since this issue is decided completely within the source code for types.String, various implementations of String, corresponding to different user preferences with regard to Unicode treatment, be provided as mods which simply patch themselves into the types package. That way, whatever type you put there gets picked up automatically in an autoload scenario.
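To illustrate the "mod patches itself into the types package" idea, here is a minimal sketch. Everything in it is a stand-in invented for the example: `sa_types` is a throwaway module object (not the real sqlalchemy.types), and `from_db`, `UnicodeString`, `install` and `autoload_column_type` are hypothetical names. It is written for Python 3, where `bytes` plays the role of the plain string.

```python
import types as pytypes  # stdlib; used only to build a stand-in module object

# Stand-in for the library's types package (hypothetical, not sqlalchemy.types).
sa_types = pytypes.ModuleType("sa_types")


class String:
    """Default behaviour: values come back from the database untouched."""

    def from_db(self, value):
        return value


sa_types.String = String


class UnicodeString(String):
    """Alternative behaviour: decode raw bytes into text."""

    encoding = "utf-8"

    def from_db(self, value):
        if isinstance(value, bytes):
            return value.decode(self.encoding)
        return value


def install(module=sa_types):
    """What a 'mod' would do when imported: swap in its own String."""
    module.String = UnicodeString


def autoload_column_type(module=sa_types):
    """Stand-in for autoload: looks the type up on the module at call time."""
    return module.String()
```

Because the lookup happens on the module at call time, anything created after `install()` runs picks up the replacement type, which is the property the suggestion relies on.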


On Apr 20, 2006, at 11:03 AM, Vasily Sulatskov wrote:

Hello Qvx,

As for the autoload, I'm not sure what to do. If *I* had to do it I would return Unicode columns everywhere. A more flexible solution would, I guess,
allow the developer to intervene in some way (via a kw param).

I did some testing on the sqlalchemy autoload feature, and it seems that
for string-like columns sqlalchemy assigns types like
sqlalchemy.databases.mysql.MSString, and there are no types like
sqlalchemy.databases.mysql.MSUnicode in sqlalchemy. So I think it will
require lots of code changes to make sqlalchemy behave the "right way",
i.e. where string is string and unicode is unicode.

Hence in the current situation it's very good that one can specify
convert_unicode=True and, even with autoload=True, still get unicode
objects from the database.

And with the patch that I suggested, sqlalchemy will even protect its
users from the hardest pitfalls, like putting data in an incorrect
encoding into the database. (Sqlalchemy will try to convert the supplied
string to unicode with the ascii codec, and that conversion WILL fail if
the user is using a national encoding in his strings.)
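A small sketch of the failure mode described above. It is written for Python 3, where `bytes` stands in for the plain (byte) string of the Python of this thread; `to_unicode_strict` is a hypothetical helper made up for the example, not a sqlalchemy function.

```python
def to_unicode_strict(value, encoding="ascii"):
    """Return text; decode byte strings with the given codec, strictly."""
    if isinstance(value, str):
        return value
    return value.decode(encoding)  # raises UnicodeDecodeError on bad input


# A byte string in a national encoding (cp1251, Cyrillic).
national = "Привет".encode("cp1251")

try:
    to_unicode_strict(national)
    conversion_failed = False
except UnicodeDecodeError:
    # The ascii codec rejects every byte above 0x7f, so the mistake is
    # caught here instead of being written to the database silently.
    conversion_failed = True
```

Pure ascii input passes through unchanged, so users who never leave ascii would not notice the check.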

Until autoload behaviour is changed, I think it would be better
not to make Strings always behave like strings.


I think that sqlalchemy, in a perfect world, should behave like this:

User controls sqlalchemy behaviour with three engine parameters:

1. engine.server_encoding - encoding used for storing data in the database,
   defaults to 'ascii'; when I say 'ascii' I actually mean 'ascii' or
   some other encoding common to most sqlalchemy users.

2. engine.client_encoding - encoding for client-side strings, i.e.
   strings that the user feeds to sqlalchemy or gets from it. Defaults to
   'ascii', or some other encoding common to most
   sqlalchemy users.

3. engine.autoload_unicode, defaults to False - parameter that tells
   sqlalchemy whether it should create columns of string type or unicode
   type when autoloading tables, or perhaps some other way to hint column
   types when autoloading.

String column types always return strings to the user but also accept
unicode objects on assignment (unicode objects can always be converted
to a string of known encoding).

Unicode column types always return unicode objects. They accept only
unicode objects. (Perhaps they should also accept strings and treat
them as strings in the engine.client_encoding encoding.)

For string column types, if engine.client_encoding doesn't match
engine.server_encoding, automatic string encoding conversion takes place.
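The proposed rules could be sketched roughly as follows. This is only an illustration of the proposal, not existing sqlalchemy behaviour: `EngineSettings` and every function name here are hypothetical, and the code is Python 3, with `bytes` standing in for the plain string type and `str` for unicode objects.

```python
class EngineSettings:
    """Hypothetical holder for the three proposed engine parameters."""

    def __init__(self, server_encoding="ascii", client_encoding="ascii",
                 autoload_unicode=False):
        self.server_encoding = server_encoding
        self.client_encoding = client_encoding
        self.autoload_unicode = autoload_unicode


def string_to_server(value, settings):
    """String column, write side: client string or unicode -> server bytes."""
    if isinstance(value, str):
        # Unicode objects can always be encoded to a known encoding.
        return value.encode(settings.server_encoding)
    if settings.client_encoding == settings.server_encoding:
        return value
    text = value.decode(settings.client_encoding)
    return text.encode(settings.server_encoding)


def string_from_server(raw, settings):
    """String column, read side: server bytes -> client-encoded bytes."""
    if settings.client_encoding == settings.server_encoding:
        return raw
    return raw.decode(settings.server_encoding).encode(settings.client_encoding)


def unicode_from_server(raw, settings):
    """Unicode column, read side: always hand back a unicode object."""
    return raw.decode(settings.server_encoding)


def autoloaded_type_name(settings):
    """Which kind of column autoload would create for a string-like column."""
    return "Unicode" if settings.autoload_unicode else "String"


# Example: utf8 on the server, cp1251 on the client.
settings = EngineSettings(server_encoding="utf8", client_encoding="cp1251")
client_value = "Привет".encode("cp1251")
stored = string_to_server(client_value, settings)
```

With matching encodings the String paths return their input unchanged, so default (ascii) users pay nothing for the conversion step.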

In that situation most users of sqlalchemy will just happily use default
parameters.

And unfortunate users of national encodings will set the engine
parameters to something like this:
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = True
or even
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = False

And so, everyone will be happy.
1. Ascii users work as they are used to, not knowing about horrors of
encodings and unicode.

2. National encoding users work using their marginal encodings
without data loss.

3. Language purists enjoy that string is string and unicode is unicode
:-)

Any thoughts, comments?

It seems to me that plain strings, in general, are used for two main
reasons: lack of proper unicode support and laziness/lack of knowledge. Only after those two reasons would come all other valid reasons from knowledgeable developers. I don't give much thought to those other reasons if I can use unicode. More often than not, I must use strings because of lack of unicode
support, so I'm happy that SA has it. I don't consider myself unicode
expert; just an unfortunate fellow who has to work with latin2 and
windows-1250 encodings and somehow manage my way through. If there is
somebody else here who knows more about unicode I think now would be the
right time to say something...
Same deal. I am not a unicode expert, but I suspect that all people
from countries where non-ascii encodings are used possess innate unicode
knowledge :-)

--
Best regards,
 Vasily                            mailto:[EMAIL PROTECTED]




-------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel? cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
Sqlalchemy-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users



