Hi there,

I thank you for all your patience with me. I was completely off-track. I read all the mails again, and everything is starting to make sense now.

This is going to be a lengthy email about #1356 and #3370, but please do read until the end.

Short executive summary: It's really a bug, and the patch is not bad, but incomplete.

First, contrary to my former opinion, #3370 is a bug in the newforms module: it passes unicode to the database API, which is not ready for it and will break as soon as you leave ASCII. #3370 is independent of #952.

I see three ways to fix the problem in #3370:

a) newforms stops passing unicode strings to the database API and uses bytestrings.

b) the database wrapper in Django sets connection.charset (but needs to translate the charset name, since the databases don't understand all charset name variants; see ticket #952). This is the approach of the patches in tickets #1356 and #3370.

c) the database wrapper in Django checks whether it gets unicode and, if so, encodes it into a bytestring.

With all three variants, what encoding should be used? We currently issue (without #952) a 'SET NAMES utf8' at the beginning of each connection, so the database server expects to receive utf8. So, shouldn't we currently always use utf8 encoding, regardless of what is in settings.DEFAULT_CHARSET?

This point has caused a lot of confusion. Ivan wrote:

> I'm -1 on setting MySQL connection to 'utf8' in #3370. It *will* make
> sense when we will have newforms ready and models containing unicode.
> But now most of Django is a byte string country. A bright example are
> generic views that take data from web and store it to models without any
> conversions. This patch will feed 'windows-1251' or 'iso-8859-1' to
> MySQL saying that "it's utf-8" and MySQL will try to convert it and most
> certainly will store just strings of '????'.

Well, the current patch in #3370 (I still ignore __repr__) only changes the charset attribute of a connection. This attribute is used only to encode unicode strings when sending data to the database, or to decode bytestrings received from the database when MySQLdb is configured to produce unicode ('use_unicode').

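To make variant c) and the charset name translation from variant b) a bit more concrete, here is a minimal sketch. It is not taken from any of the patches: the names MYSQL_CHARSET_NAMES, to_mysql_charset and encode_params are hypothetical, the mapping holds only a few illustrative entries, and it is written for Python 3, where str plays the role of unicode and bytes the role of a bytestring.

```python
# Hypothetical, incomplete mapping from common charset spellings to the
# names MySQL uses. A real wrapper would need a fuller table.
MYSQL_CHARSET_NAMES = {
    "utf-8": "utf8",
    "utf8": "utf8",
    "iso-8859-1": "latin1",
    "latin-1": "latin1",
    "windows-1251": "cp1251",
}


def to_mysql_charset(name):
    """Translate a charset name into MySQL's spelling (variant b)."""
    key = name.lower().replace("_", "-")
    # Unknown names are passed through unchanged in this sketch.
    return MYSQL_CHARSET_NAMES.get(key, key)


def encode_params(params, encoding="utf-8"):
    """Encode text parameters, leave everything else untouched (variant c).

    Bytestrings pass through as they are, which mirrors what happens
    today for data coming from the generic views.
    """
    return tuple(
        p.encode(encoding) if isinstance(p, str) else p
        for p in params
    )
```

The default of "utf-8" in encode_params reflects the point above that the server is currently told to expect utf8 on every connection; it is an assumption of this sketch, not a decision.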
Here's what the documentation in MySQLdb-1.2.2b2 says:

      use_unicode
            If True, CHAR and VARCHAR and TEXT columns are returned as
            Unicode strings, using the configured character set. It is
            best to set the default encoding in the server
            configuration, or client configuration (read with
==>         read_default_file). If you change the character set after
==>         connecting (MySQL-4.1 and later), you'll need to put the
==>         correct character set name in connection.charset.

            If False, text-like columns are returned as normal strings,
            but you can always write Unicode strings.

            *This must be a keyword parameter.*

(But the charset parameter is also used when you pass in unicode without setting use_unicode.)

python-MySQLdb-1.2.1p2 is similar, except that there it is not a keyword parameter.

There's an interesting difference between 1.2.1p2 and 1.2.2b2:

For 1.2.1p2, you have to change the charset attribute of the existing connection. If you try this on 1.2.2b2, it won't work.

For 1.2.2b2, you either have to pass a 'charset' parameter when you create the connection, or you can call the method set_character_set(). Neither of these will work for 1.2.1p2, of course :-(

So, the APIs of python-MySQLdb are incompatible with each other (within a minor version change!). This explains the differences between #1356 and #3370. We need a patch that plays well with both versions of python-MySQLdb.

I don't see a problem with the generic views, since they pass bytestrings to the database wrapper. These reach MySQLdb as bytestrings, and for bytestrings the charset attribute is not used at all.

Of course, as soon as #952 has been applied, we need to use the encoding from settings.DEFAULT_CHARSET.

Michael

P.S.: If you set the charset parameter in 1.2.2b2's Connection.__init__(), the default for use_unicode will be True, and python-MySQLdb will return unicode strings.
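
For illustration, a patch that plays well with both versions of python-MySQLdb could take roughly the following shape. This is only a sketch under the assumptions stated in this mail: that 1.2.2b2 connections offer set_character_set() and that on 1.2.1p2 assigning the charset attribute is the way to go. The helper name apply_charset is hypothetical, and the sketch does not import MySQLdb, so it has only been exercised against stand-in objects.

```python
def apply_charset(connection, charset):
    """Set the client charset on a MySQLdb-like connection object.

    Prefers set_character_set() when the connection has it (the
    1.2.2b2 style) and otherwise falls back to assigning the charset
    attribute (the 1.2.1p2 style).
    """
    setter = getattr(connection, "set_character_set", None)
    if callable(setter):
        setter(charset)
    else:
        connection.charset = charset
```

Feature detection is used here instead of comparing version numbers, so the helper does not need to know which releases introduced which API.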