Hello everyone

I'm running Django from trunk (so the most up-to-date version) on
Python 2.5 with PyTextile 2.0.10 and MySQL 5.0.2. All MySQL settings
are UTF-8, and Django's content type is UTF-8 as well.

I'm overriding the save method on Event. The input comes from a
newforms form, and we run it through textile to fill an HTML field.
Here's what I mean:

def save(self):
        import textile
        if self.body:
                self.body_html = textile.textile(self.body)
        super(Event, self).save()


It fails with this error:

Exception Value:        'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
Exception Location:     /usr/local/lib/python2.5/site-packages/textile.py in glyphs, line 2418

My textile settings are:
# Set your encoding here.
ENCODING = 'utf8'

# Output? Non-ASCII characters will be automatically
# converted to XML entities if you choose ASCII.
OUTPUT = 'utf8'

I tried changing OUTPUT to ascii in textile but got the same error,
so it looks to me like the form is handing textile a unicode string
that textile can't handle.
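A minimal sketch of the suspected failure mode, not a trace of what textile actually does. In Python 2, combining a unicode object with a byte string makes the interpreter decode the byte string as ASCII behind the scenes, and a single non-ASCII byte such as 0xb4 is enough to make that fail. The snippet is written so it also runs on current Python, where the decode has to be spelled out; the name `raw` is only illustrative, and where textile gets that byte from is an assumption not checked here.

```python
# Python 2 performs this ASCII decode implicitly when a byte string is
# combined with a unicode object; here the same step is made explicit.
raw = b"\xb4"  # a lone non-ASCII byte, the one named in the traceback

try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    # Same wording as the "Exception Value" reported above.
    message = str(exc)

print(message)
```

If this is what happens inside glyphs, changing OUTPUT would not help, because the error occurs while the input is still being processed.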

One way around this is to manipulate self.body before passing it to
textile, like this:

self.body = self.body.decode('utf-8')
self.body = self.body.encode('ascii', 'ignore')

This forces plain ASCII into textile, which it likes a lot, and it
works.

But if a user now copies and pastes the dreaded apostrophe from Word,
or another special character unique to Word, it fails with this
error:

Exception Value:        'ascii' codec can't encode character u'\u2019' in position 5: ordinal not in range(128)
Exception Location:     /usr/local/lib/python2.5/encodings/utf_8.py in decode, line 16
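A short sketch of why the workaround breaks on a curly apostrophe, assuming self.body is already a unicode object when it reaches save(). In Python 2, calling .decode('utf-8') on a unicode object first encodes it to ASCII implicitly, which is the step that fails on u'\u2019'. The snippet makes that encode explicit so it also runs on current Python; the sample string `text` is invented for illustration.

```python
text = u"it\u2019s"  # sample input with a Word-style apostrophe (made up)

# The implicit step Python 2 takes before decoding a unicode object.
try:
    text.encode("ascii")
    error = None
except UnicodeEncodeError as exc:
    error = exc

# 'ignore' avoids the error but silently drops the character.
dropped = text.encode("ascii", "ignore")

# 'xmlcharrefreplace' keeps the character as a numeric XML entity.
as_entity = text.encode("ascii", "xmlcharrefreplace")

print(repr(dropped))
print(repr(as_entity))
```

So even where the encode/ignore line succeeds, it loses every non-ASCII character the user typed, which may not be acceptable for an event description.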


If I remove the textiling of the body, call the super save earlier in
the save method, then read the object back out of the database further
down and save it again, like this:

e = Event.objects.get(id=new_id)
if e.body:
        e.body_html = textile.textile(e.body)
super(Event, e).save()

It all works fine, no encoding or decoding needed for pasted
apostrophes or anything.

Here's the paste of the relevant part of the form with certain
sections commented out so you can see what I mean.

http://pastie.textmate.org/71702

I found this on the Google group, from Ivan Sagalaev:

"To summarize: your storage (a database) and your input/output (the
web) really should use utf-8 to avoid problems with "strange"
characters. If you deal internally with unicode (which newforms
produce for you) then for now you should explicitly encode from it to
utf-8 until Django starts doing it automatically."
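One way to read that advice, as a hedged sketch: encode unicode to UTF-8 bytes at the boundary and leave byte strings untouched. The helper name `to_utf8` is hypothetical, not part of Django or textile, and whether textile 2.0.10 handles UTF-8 byte input correctly with ENCODING = 'utf8' is an assumption that has not been verified here.

```python
def to_utf8(value):
    """Return value as UTF-8 encoded bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        # Already a byte string; assume it is UTF-8 and pass it through.
        return value
    return value.encode("utf-8")


# Possible use inside save(), under the assumptions stated above:
#     self.body_html = textile.textile(to_utf8(self.body))
encoded = to_utf8(u"it\u2019s")
print(repr(encoded))
```

Unlike encoding to ASCII with 'ignore', this keeps the pasted apostrophe intact as the three-byte UTF-8 sequence.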

I've also been reading this thread on the Django developers group, and
I'm now completely confused as to what is going on:

"unicode issues in multiple tickets"

If anyone can tell me if there is some current status on this, or how
it works right now, I'd be really grateful. If I have to encode and
decode then I don't mind, not much anyway :-)

