Hi,

A few days ago, I wrote:

> I see three ways to fix the problem in #3370:
>
> a) newforms stops passing unicode strings to the Database API and uses
>    bytestrings.
>
> b) the database wrapper in Django sets connection.charset (but needs to
>    translate the charset name since the databases don't understand all
>    charset name variants, see ticket #952 here). This is the approach of
>    the patches in tickets #1356 and #3370.
>
> c) the database wrapper in Django must check whether it gets unicode. In
>    this case, it needs to encode it into a bytestring.
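To make c) concrete, the check would amount to roughly the following. This is only a sketch: `to_bytestring` and `encode_params` are names made up for illustration, not existing Django functions, and the `text_type` alias is only there so the snippet runs whatever the interpreter calls its unicode type.

```python
# Hypothetical helpers illustrating option c); not part of Django.
try:
    text_type = unicode  # interpreters with a separate unicode type
except NameError:
    text_type = str      # interpreters where str is already unicode


def to_bytestring(value, charset):
    """Encode unicode text to a bytestring; leave everything else alone."""
    if isinstance(value, text_type):
        return value.encode(charset)
    return value


def encode_params(params, charset):
    """Apply to_bytestring to every query parameter."""
    return tuple(to_bytestring(p, charset) for p in params)
```

Non-string parameters (numbers, dates, None) and values that are already bytestrings pass through unchanged.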

I now see a fourth way that would resolve #952 at the same time:

d) make the database wrapper accept both unicode and bytestrings in the
   models, but always pass unicode strings to the database backend.

Details:

For #952 to work, the name of the character encoding has to be translated from Python naming conventions to those of the backend in use, and this would need a huge table (see the ticket). It looks easy, but it's a major annoyance.

Now, instead of doing this, how about modifying the database wrapper so that it actually tests whether it gets unicode or bytestrings, and in the case of bytestrings, decodes them to unicode using settings.CHARACTER_SET as the encoding? Then it could use unicode to talk to its backend. As far as I can see, psycopg2 is unicode capable, and python-MySQLdb, too.

This is different from the proposal in the thread 'Unicode or Strings in Models', as I'd still accept both forms in the model and deal with it only when I send it to the database. 'Only unicode in models' would be a major change with many scattered pieces. My proposal is for a transition phase, to support piece-wise conversion to Unicode without breaking everything on the way (as newforms does).

Disadvantage: The backend will probably encode the unicode string again to get it across the wire, to either UTF-8 or settings.DEFAULT_CHARSET (or something else), adding overhead to the database communication.

I think this is a necessary step in the transition from bytestrings to the Great Unicodification of Everything. As soon as there's unicode everywhere, the code that deals with bytestrings can be removed and the solution will fit in perfectly.

What do you think?

Michael

-- 
noris network AG - Deutschherrnstraße 15-19 - D-90429 Nürnberg -
Tel +49-911-9352-0 - Fax +49-911-9352-100
http://www.noris.de - The IT-Outsourcing Company
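
P.S. A rough sketch of what d) could look like, in case it helps the discussion. Everything here is an assumption for illustration: `UnicodeCursorWrapper` and `to_unicode` are made-up names, the `charset` argument stands in for settings.CHARACTER_SET, and the only thing assumed about the backend cursor is the usual DB-API `execute(sql, params)` call.

```python
# Hypothetical sketch of option d); not the actual Django wrapper.

def to_unicode(value, charset):
    """Decode bytestrings to unicode; leave everything else alone."""
    if isinstance(value, bytes):
        return value.decode(charset)
    return value


class UnicodeCursorWrapper(object):
    """Wraps a DB-API cursor and hands it unicode strings only."""

    def __init__(self, cursor, charset):
        self.cursor = cursor
        self.charset = charset

    def execute(self, sql, params=()):
        # Bytestrings coming from not-yet-converted code are decoded here,
        # unicode strings (e.g. from newforms) pass through untouched.
        sql = to_unicode(sql, self.charset)
        params = tuple(to_unicode(p, self.charset) for p in params)
        return self.cursor.execute(sql, params)

    def __getattr__(self, name):
        # Everything else (fetchone, fetchall, ...) goes to the real cursor.
        return getattr(self.cursor, name)
```

Once all callers pass unicode, `to_unicode` becomes a no-op and the wrapper can be dropped without touching the rest of the code.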