On 8/2/07, Simon Willison <[EMAIL PROTECTED]> wrote:
> This is a totally ridiculous flaw with the HTTP spec - you literally
> have no reliable way of telling what encoding a request coming in to
> your site uses, since you can't be absolutely sure that the user-agent
> read a page from your site to find out your character encoding!

W3C FTW!

> One really smart trick you can do is this: attempt to decode as UTF-8
> (which is nice and strict and will fail noisily for pretty much
> anything that isn't either UTF-8 or ASCII, a UTF-8 subset). If
> decoding fails, assume ISO-8859-1 which will decode absolutely
> anything without ever throwing an error (although if the content isn't
> ISO-8859-1 you'll end up with garbage). I tend to call this the Flickr
> trick, because of the lovely big letters here:
> http://www.flickr.com/services/api/misc.encoding.html

Yeah, after fooling around with it, that's pretty much the conclusion
I've come to.

I'd like to wait for Malcolm to weigh in since he wrote much of this
code (and I think he's on his way back to AU so it might be a bit
before he's over jetlag and back on the list), but I think this is the
right approach:

* Try to decode the form data using ``settings.DEFAULT_CHARSET``. In
most cases this'll be UTF-8, but when it's not we can try to assume
that data's being POSTed back in the same encoding we're serving it up
in.
* If that fails and ``DEFAULT_CHARSET`` isn't UTF-8, try UTF-8.
That'll deal with relatively sane automated clients (e.g.
``WWW::Mechanize`` and all its clones).
* If that fails, use ISO-WTFBBQNAMBLA-1.
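
Roughly, the three steps above could look like the sketch below. This is
only an illustration, not Django's actual code: the function name
``decode_form_data`` is made up, and the ``default_charset`` argument
stands in for ``settings.DEFAULT_CHARSET``.

```python
import codecs


def decode_form_data(raw, default_charset="utf-8"):
    """Decode raw form bytes; return (text, name of the charset that worked).

    Hypothetical helper: default_charset plays the role of
    settings.DEFAULT_CHARSET.
    """
    candidates = [default_charset]
    # Only add UTF-8 as a second attempt if it isn't already the default.
    # codecs.lookup() normalizes aliases such as "UTF8" to "utf-8".
    if codecs.lookup(default_charset).name != "utf-8":
        candidates.append("utf-8")

    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except UnicodeDecodeError:
            continue

    # ISO-8859-1 maps every byte to a character, so this never raises,
    # though the result is garbage if the data was in some other encoding.
    return raw.decode("iso-8859-1"), "iso-8859-1"
```

One caveat with step one: some non-UTF-8 multibyte charsets will happily
decode bytes that weren't meant for them, so a "successful" decode with
``DEFAULT_CHARSET`` doesn't prove the client actually used it.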

How's that sound?
