#18392: Use utf8mb4 encoding with MySQL 5.5
-------------------------------------+-------------------------------------
     Reporter:  EmilStenstrom        |                    Owner:  nobody
         Type:  Uncategorized        |                   Status:  new
    Component:  Database layer       |                  Version:  1.4
  (models, ORM)                      |               Resolution:
     Severity:  Normal               |             Triage Stage:  Design
     Keywords:  utf8mb4 mysql        |  decision needed
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  1                    |  Patch needs improvement:  1
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by rogeliorv):

 * keywords:  hack utf8mb4 mysql => utf8mb4 mysql
 * needs_better_patch:  0 => 1
 * has_patch:  0 => 1
 * needs_tests:  0 => 1


Comment:

 Replying to [comment:8 EmilStenstrom]:
 > Replying to [comment:7 rogeliorv]:
 > > As a way to test it. The hack consists in adding self.query('SET NAMES
 utf8mb4') in MySQLdb.connections in Connection.set_character_set function
 as shown here: http://pastebin.com/MW5BgRgP
 > >
 > > Of course the correct way would be to change this in django when
 setting up the cursor connection.
 >
 > Did your hack remove the exception? What was the rationale behind the
 hack? What's the next step?


 Yes, the hack removed the exception. The rationale was to force the
 MySQL client to use a specific encoding for the connection.
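
 For illustration, the same idea can be sketched at the DB-API level
 instead of inside MySQLdb. This is only a sketch: force_utf8mb4 is a
 hypothetical helper name, and it assumes a connection object that follows
 the Python DB-API (cursor(), execute(), close()) and a server that
 supports utf8mb4 (MySQL 5.5 or later).

```python
def force_utf8mb4(connection):
    """Ask the MySQL server to treat this connection as utf8mb4.

    Hypothetical helper: `connection` is assumed to be any DB-API
    connection to a MySQL server that knows the utf8mb4 charset.
    """
    cursor = connection.cursor()
    try:
        # SET NAMES sets the client, connection and result charsets at once.
        cursor.execute("SET NAMES utf8mb4")
    finally:
        # Always release the cursor, even if the statement fails.
        cursor.close()
```

 This only changes what the server expects on the wire. The client library
 still has to encode and decode with a codec Python knows about.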

 The next step is to make Django's MySQL connections use utf8mb4 by
 default, or otherwise make the charset configurable. Since utf8mb4 is
 compatible with utf8, no extra changes should be needed in that regard.


 To achieve this, the _cursor function of class DatabaseWrapper in
 django.db.backends.mysql.base should be changed (complete function
 definition here: http://pastebin.com/A6dMEMd4):

 ''kwargs = {

   "conv": django_conversions,
   "charset": "utf8mb4",
   "use_unicode": True,
 }
 ''
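
 As a rough sketch of the "more configurable" variant, the defaults above
 could be merged with per-database options so the charset can still be
 overridden. build_connection_kwargs and its parameters are hypothetical
 names used only for illustration, not part of Django. The sketch assumes a
 settings dictionary with an optional "OPTIONS" entry.

```python
def build_connection_kwargs(settings_dict, conversions=None):
    """Build keyword arguments for the MySQL connection (illustrative only).

    `settings_dict` is assumed to look like one entry of DATABASES;
    `conversions` stands in for django_conversions.
    """
    kwargs = {
        "conv": conversions if conversions is not None else {},
        "charset": "utf8mb4",  # proposed new default
        "use_unicode": True,
    }
    # Options from the settings win over the defaults, so a project
    # could still fall back to plain "utf8" if it needs to.
    kwargs.update(settings_dict.get("OPTIONS", {}))
    return kwargs


default_kwargs = build_connection_kwargs({})
legacy_kwargs = build_connection_kwargs({"OPTIONS": {"charset": "utf8"}})
```

 With this shape the default becomes utf8mb4, while a project on an older
 MySQL server could opt out through its own settings.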

 Unfortunately this won't work unless we also change the set_character_set
 function of class Connection in MySQLdb.connections.


 Change the two bottom lines to the following (complete function
 definition here: http://pastebin.com/AMN1B8za):

 #Hack so data can be decoded/encoded using python's utf8 since
 # python does not know about mysql utf8mb4

 ''if charset == 'utf8mb4':''
     ''charset = 'utf8'''

 ''self.string_decoder.charset = charset''

 ''self.unicode_literal.charset = charset''
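
 The reason for the mapping can be shown in isolation: Python has no codec
 called utf8mb4, but its utf8 codec already handles 4-byte characters. The
 sketch below is not MySQLdb code. MYSQL_TO_PYTHON_CODEC and
 python_codec_for are hypothetical names chosen for this example.

```python
import codecs

# Hypothetical mapping from MySQL charset names to Python codec names.
MYSQL_TO_PYTHON_CODEC = {"utf8mb4": "utf8"}


def python_codec_for(mysql_charset):
    """Return a codec name Python can use for the given MySQL charset."""
    name = MYSQL_TO_PYTHON_CODEC.get(mysql_charset, mysql_charset)
    codecs.lookup(name)  # raises LookupError if Python has no such codec
    return name


# U+1F604 lies outside the Basic Multilingual Plane.
smiley = u"\U0001F604"
encoded = smiley.encode(python_codec_for("utf8mb4"))
```

 The encoded value is four bytes long, which is exactly the case that
 MySQL's 3-byte utf8 charset cannot store and utf8mb4 can.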


 This will guarantee you can use special characters like 😄😃😊☺😉😍😘😚

 Unlike the previous hack, which worked for both reading and writing data,
 this patch only allows me to read data in utf8mb4 format. On
 insertion/creation I now get the error 'Cursor' object has no attribute
 '_last_executed'. I will report more on this error as I find it. Any
 help with this error is appreciated.

 You can reach me via twitter, @rogeliorv

-- 
Ticket URL: <https://code.djangoproject.com/ticket/18392#comment:9>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to django-updates@googlegroups.com.
To unsubscribe from this group, send email to 
django-updates+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en.
