a couple of things:
- where's the patch? I thought I had put it in, but it seems not.
let's put in a Trac ticket for it.
- may I suggest that, since this issue is decided completely within
the source code for types.String, the various implementations of
String, corresponding to different user preferences with regard to
Unicode treatment, be provided as mods which simply patch themselves
into the types package. That way, whatever type you put there gets
picked up automatically in an autoload scenario.
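
To make the suggestion above a bit more concrete, here is a rough sketch in
modern Python of a "mod" that patches a String flavour into a types package.
Everything in it is a stand-in for illustration: types_pkg, UnicodeString,
result_value, install and autoload_column_type are made-up names, not
sqlalchemy API.

```python
import types

# Stand-in namespace for the library's types package (hypothetical; the real
# package is not reproduced here).
types_pkg = types.SimpleNamespace()


class String:
    """Default flavour: return database values untouched."""

    def result_value(self, raw):
        return raw


class UnicodeString(String):
    """Alternative flavour: decode byte strings into text."""

    def __init__(self, encoding="utf-8"):
        self.encoding = encoding

    def result_value(self, raw):
        return raw.decode(self.encoding) if isinstance(raw, bytes) else raw


def install(pkg, string_cls):
    """What a 'mod' would do when loaded: rebind the String name."""
    pkg.String = string_cls


def autoload_column_type(pkg):
    """Stand-in for autoload; looks String up at call time, so it sees the patch."""
    return pkg.String()


install(types_pkg, String)
default_type = autoload_column_type(types_pkg)

install(types_pkg, UnicodeString)
patched_type = autoload_column_type(types_pkg)
```

One caveat with this kind of patching: rebinding the name only affects code
that looks it up afterwards. Classes that already subclassed the original
String before the mod was loaded keep the old behaviour.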
On Apr 20, 2006, at 11:03 AM, Vasily Sulatskov wrote:
Hello Qvx,
As for the autoload, I'm not sure what to do. If *I* had to do it I
would return Unicode columns everywhere. A more flexible solution
would, I guess, allow the developer to intervene in some way (via kw
param).
I did some testing on the sqlalchemy autoload feature, and it seems
that for string-like columns sqlalchemy assigns types like
sqlalchemy.databases.mysql.MSString; there are no types like
sqlalchemy.databases.mysql.MSUnicode in sqlalchemy. So I think it will
require lots of code changes to make sqlalchemy behave the "right way",
i.e. where string is string and unicode is unicode.
Hence, in the current situation, it's very good that one can specify
convert_unicode=True and, even with autoload=True, still get unicode
objects from the database.
And with the patch that I suggested, sqlalchemy will even protect its
users from the hardest pitfalls, like putting data in an incorrect
encoding into the database. (Sqlalchemy will try to convert the
supplied string to unicode with the ascii codec, and that conversion
WILL fail if the user is using a national encoding in his strings.)
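
A small illustration of the failure mode described above, written for
Python 3, where byte strings and text are separate types (the thread itself
is about Python 2 str and unicode). The helper name to_unicode_strict is
made up for this sketch and is not part of the patch.

```python
def to_unicode_strict(value, encoding="ascii"):
    """Return text; decode byte strings with `encoding`, raising on bad bytes."""
    if isinstance(value, str):
        return value
    return value.decode(encoding)


ascii_bytes = "hello".encode("ascii")
national_bytes = "привет".encode("cp1251")  # Cyrillic text in a national encoding

decoded_ascii = to_unicode_strict(ascii_bytes)

try:
    to_unicode_strict(national_bytes)
    national_failed = False
except UnicodeDecodeError:
    # Non-ascii bytes are rejected instead of being stored in an unknown encoding.
    national_failed = True
```

The point of the strict ascii default is that mis-encoded data is rejected
at the boundary rather than written to the database silently.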
Until autoload behaviour is changed, I think it would be better
not to make Strings always behave like strings.
I think that sqlalchemy, in a perfect world, should behave like this:
The user controls sqlalchemy behaviour with three engine parameters:
1. engine.server_encoding - encoding used for storing data in the
database. Defaults to 'ascii'; when I say 'ascii' I actually mean
'ascii' or some other encoding common to most sqlalchemy users.
2. engine.client_encoding - encoding for client-side strings, i.e.
strings that the user feeds to sqlalchemy or gets from it. Defaults to
'ascii', or some other encoding common to most sqlalchemy users.
3. engine.autoload_unicode - defaults to False. This parameter tells
sqlalchemy whether it should create columns of string type or unicode
type when autoloading tables; or perhaps there is some other way to
hint column types when autoloading.
String column types always return strings to the user, but also accept
unicode objects on assignment (unicode objects can always be converted
to a string of known encoding).
Unicode column types always return unicode objects. They accept only
unicode objects. (Perhaps they should also accept strings and treat
them as strings in the engine.client_encoding encoding.)
For string column types, if engine.client_encoding doesn't match
engine.server_encoding, automatic string encoding conversion takes
place.
In that situation most users of sqlalchemy will just happily use the
default parameters.
And unfortunate users of national encodings will set the engine
parameters to something like this:
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = True
or even
engine.server_encoding = 'utf8'
engine.client_encoding = 'cp1251'
engine.autoload_unicode = False
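
The conversion rules proposed above could be sketched roughly as follows,
in Python 3 terms (bytes for "string", str for "unicode"). This is only an
illustration of the proposal, not existing sqlalchemy behaviour; the
EncodingSettings class and the helper functions are hypothetical names, and
only the three parameter names come from the list above.

```python
from dataclasses import dataclass


@dataclass
class EncodingSettings:
    """The three proposed engine parameters, with the proposed defaults."""
    server_encoding: str = "ascii"
    client_encoding: str = "ascii"
    autoload_unicode: bool = False


def string_to_db(value, settings):
    """String column, client -> database: accept text or client-encoded bytes."""
    if isinstance(value, str):
        return value.encode(settings.server_encoding)
    if settings.client_encoding == settings.server_encoding:
        return value
    # Re-encode from the client encoding to the server encoding.
    return value.decode(settings.client_encoding).encode(settings.server_encoding)


def string_from_db(raw, settings):
    """String column, database -> client: always return client-encoded bytes."""
    if settings.client_encoding == settings.server_encoding:
        return raw
    return raw.decode(settings.server_encoding).encode(settings.client_encoding)


def unicode_from_db(raw, settings):
    """Unicode column, database -> client: always return text."""
    return raw.decode(settings.server_encoding)


settings = EncodingSettings(server_encoding="utf8",
                            client_encoding="cp1251",
                            autoload_unicode=True)

client_bytes = "привет".encode("cp1251")
stored = string_to_db(client_bytes, settings)       # utf8 bytes in the database
round_trip = string_from_db(stored, settings)       # cp1251 bytes back to the client
as_text = unicode_from_db(stored, settings)         # text from a Unicode column
```

With the default settings both encodings are 'ascii', so byte strings pass
through unchanged and no conversion cost is paid.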
And so, everyone will be happy.
1. Ascii users work as they are used to, not knowing about the horrors
of encodings and unicode.
2. National encoding users work using their marginal encodings
without data loss.
3. Language purists enjoy that string is string and unicode is unicode
:-)
Any thoughts, comments?
It seems to me that plain strings, in general, are used for two main
reasons: lack of proper unicode support and laziness/lack of
knowledge. Only after those two reasons would come all other valid
reasons from knowledgeable developers. I don't give much thought to
those other reasons if I can use unicode. More often than not, I must
use strings because of lack of unicode support, so I'm happy that SA
has it. I don't consider myself a unicode expert; just an unfortunate
fellow who has to work with latin2 and windows-1250 encodings and
somehow manage my way through. If there is somebody else here who
knows more about unicode, I think now would be the right time to say
something...
Same deal. I am not a unicode expert, but I suspect that all people
from countries where non-ascii encodings are used possess innate
unicode knowledge :-)
--
Best regards,
Vasily mailto:[EMAIL PROTECTED]
_______________________________________________
Sqlalchemy-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/sqlalchemy-users