Hello everyone,

I'm running Django from trunk (so the most up-to-date version) on Python 2.5 with PyTextile 2.0.10. The database is MySQL 5.0.2 with all settings at utf-8, and the Django content type is utf-8.

I'm overriding the save method on events that are created through newforms. We textile the input to produce an HTML field. Here's what I mean:

    def save(self):
        import textile
        if self.body:
            self.body_html = textile.textile(self.body)
        super(Event, self).save()

It fails with this error:

    Exception Value: 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
    Exception Location: /usr/local/lib/python2.5/site-packages/textile.py in glyphs, line 2418

My textile settings are:

    # Set your encoding here.
    ENCODING = 'utf8'

    # Output? Non-ASCII characters will be automatically
    # converted to XML entities if you choose ASCII.
    OUTPUT = 'utf8'

I tried changing OUTPUT to ascii in textile but got the same error, so it looks to me like the form is passing textile a unicode string that it can't handle.

One way around this is to manipulate self.body before passing it to textile, like this:

    self.body = self.body.decode('utf-8')
    self.body = self.body.encode('ascii', 'ignore')

This forces ASCII to be passed to textile, which it likes a lot, and it works. But if a user then copies and pastes the dreaded apostrophe from Word, or another special character unique to Word, it fails with this error:

    Exception Value: 'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)
    Exception Location: /usr/local/lib/python2.5/encodings/utf_8.py in decode, line 16

If I remove the textiling of the body, run the super save earlier in the save method, then fetch the data back out of the database further down and save it again, like this:

    e = Event.objects.get(id=new_id)
    if e.body:
        e.body_html = textile.textile(e.body)
    super(Event, e).save()

it all works fine, with no encoding or decoding needed for pasted apostrophes or anything else.

Here's the paste of the relevant part of the form, with certain sections commented out so you can see what I mean:
http://pastie.textmate.org/71702

I found this on the Google group, from Ivan Sagalaev:

> To summarize: your storage (a database) and your input/output (the web) really should use utf-8 to avoid problems with "strange" characters. If you deal internally with unicode (which newforms produce for you) then for now you should explicitly encode from it to utf-8 until Django starts doing it automatically.

I've also been reading the thread "unicode issues in multiple tickets" on the developers group, and I'm now completely confused as to what is going on.

If anyone can tell me whether there is a current status on this, or how it works right now, I'd be really grateful. If I have to encode and decode then I don't mind, not much anyway :-)
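
For anyone following along, here is a minimal sketch of the two kinds of failure described above and of the explicit utf-8 encoding that the quoted advice suggests. It is written with Python 3 type names so it runs on a current interpreter: `bytes` plays the role of Python 2's `str`, and `str` the role of `unicode`. The helpers `to_ascii_lossy` and `to_utf8` are hypothetical names made up for illustration; they are not part of textile or Django, and this does not claim to reproduce what textile does internally.

```python
def to_ascii_lossy(raw: bytes) -> bytes:
    """The workaround from the post: decode utf-8, then drop anything non-ASCII."""
    return raw.decode("utf-8").encode("ascii", "ignore")


def to_utf8(text: str) -> bytes:
    """Explicitly encode text to utf-8 bytes before handing it to byte-oriented code."""
    return text.encode("utf-8")


# Text containing U+2019, the "curly" apostrophe that Word produces.
pasted = "It\u2019s here"

# Encoding that text as ASCII raises the same kind of error as the second traceback.
try:
    pasted.encode("ascii")
    encode_failed = False
except UnicodeEncodeError:
    encode_failed = True

# Decoding a non-ASCII byte as ASCII raises the same kind of error as the first traceback.
try:
    b"\xb4".decode("ascii")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

utf8_bytes = to_utf8(pasted)              # lossless: every character survives
lossy_bytes = to_ascii_lossy(utf8_bytes)  # the apostrophe is silently dropped
round_trip = utf8_bytes.decode("utf-8")   # decoding with the right codec restores the text
```

The point of the comparison is that the `'ignore'` workaround avoids the exception only by throwing characters away, whereas encoding to utf-8 keeps them, provided the receiving code is told to treat its input as utf-8.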