Hello,

I'm having some trouble with strange characters that come from MS
Word. Users are copying text from MS Word and pasting it into a textarea
on the site. I then save that text into MySQL (table charset is set to
utf-8). Later I retrieve that data to create an RTF document and
email it to somebody. The problem is that a lot of characters from
Word aren't displaying correctly.
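
For context on the RTF step: RTF files are plain ASCII, so non-ASCII characters are usually written as \uN? escapes, where N is the signed 16-bit decimal value of each UTF-16 code unit and "?" is a fallback for old readers. Below is a minimal sketch of that escaping, assuming Python 3 and text that has already been decoded to a str. The helper name rtf_escape is hypothetical and is not part of Django or any RTF library.

```python
def rtf_escape(text):
    """Escape a str for use inside an RTF document body (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF control characters.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Emit one \uN? escape per UTF-16 code unit, so characters
            # outside the BMP become a surrogate pair.
            units = ch.encode("utf-16-be")
            for i in range(0, len(units), 2):
                unit = int.from_bytes(units[i:i + 2], "big")
                if unit > 32767:
                    unit -= 65536  # RTF expects a signed 16-bit value
                out.append("\\u%d?" % unit)
    return "".join(out)


print(rtf_escape("2001\u20132008"))  # 2001\u8211?2008
```

This sketch leaves newlines and other control characters alone; a real document would probably also need to map line breaks to \par or \line.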

For example, a " comes back from the database as \xe2\x80\x93. What type of
encoding is that?
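
One way to check what those three bytes are is to decode them as UTF-8 and look up the character name. This is a small sketch assuming Python 3, where the database value is available as a bytes object (the variable names are only for illustration). The bytes form a valid UTF-8 sequence for U+2013, an en dash, which suggests the character in question may be a dash rather than a quote: Word's curly double quotes would be \xe2\x80\x9c and \xe2\x80\x9d.

```python
import unicodedata

raw = b"\xe2\x80\x93"        # bytes as returned from the database
text = raw.decode("utf-8")   # a single Unicode character
name = unicodedata.name(text)

print("U+%04X %s" % (ord(text), name))  # U+2013 EN DASH
```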

I've tried running this through all sorts of encoding functions in
Python (unicode, smart_unicode, decode and encode) but haven't had
any success.

My next guess would be that perhaps these strange characters need to
be removed before they are saved to the database. Any ideas?
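
If the characters were to be replaced before saving, it might look roughly like the sketch below. It assumes Python 3 and already-decoded text. The mapping covers only a handful of the typographic characters Word commonly inserts, and the names WORD_PUNCTUATION and asciify_word_punctuation are made up for illustration.

```python
# Typographic characters that Word's autocorrect commonly produces,
# mapped to plain ASCII stand-ins. This list is not exhaustive.
WORD_PUNCTUATION = {
    "\u2018": "'",    # left single quote
    "\u2019": "'",    # right single quote / apostrophe
    "\u201c": '"',    # left double quote
    "\u201d": '"',    # right double quote
    "\u2013": "-",    # en dash
    "\u2014": "--",   # em dash
    "\u2026": "...",  # ellipsis
}
_TABLE = str.maketrans(WORD_PUNCTUATION)


def asciify_word_punctuation(text):
    """Replace common Word punctuation with ASCII equivalents."""
    return text.translate(_TABLE)


print(asciify_word_punctuation("\u201cHello\u201d \u2013 it\u2019s fine\u2026"))
# "Hello" - it's fine...
```

The trade-off is that this throws away information: any character not in the table (accented letters, for instance) passes through unchanged and would still need to be handled when the RTF is written.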
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
-~----------~----~----~----~------~----~------~--~---
