Rick Schumeyer wrote:

> I will have to try the WIN1252 encoding.
> 
> On the client side, my application is a web browser.  On the server 
> side, it is php scripts on a linux box.  The data comes from copying 
> data from a browser window (pointing to another web site) and pasting it 
> into an html textarea, which is then submitted. 
> 
> Given this, would you still suggest the WIN1252 encoding?

No, sticking to UTF-8 is safer. In the setup you describe, it is the browser
that decides the character set and encoding of the textarea data it submits to
the HTTP server. A problem arises when, for example, the page containing the
textarea is declared as US-ASCII but the user pastes characters outside
US-ASCII. The browser then has to pick some other encoding for the data,
possibly one that the server-side script doesn't expect. I assume this is what
happens in your case, and that it is the cause of the error you're getting. An
easy solution is to serve the web page as UTF-8: the browser then has no
reason to switch to another encoding, since every character, "fancy quotes"
included, has a representation in UTF-8.
You'll also find this explained better and in more detail in this article, for
example:
http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
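
To make the mismatch concrete, here is a small Python sketch (not your PHP
code, and not what any particular browser is guaranteed to send). It assumes,
purely for illustration, that the browser fell back to Windows-1252 for the
pasted text; the sample string and the helper names can_encode and
decodes_as_utf8 are made up for this example.

```python
# Text a user might paste: the curly quotes are outside US-ASCII.
pasted = "\u201cfancy quotes\u201d"


def can_encode(text, charset):
    """Return True if every character of text is representable in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def decodes_as_utf8(data):
    """Return True if the bytes form a valid UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical fallback: the bytes a Windows-1252 submission would contain.
as_cp1252 = pasted.encode("cp1252")
# The bytes the same text produces when the page and form are UTF-8.
as_utf8 = pasted.encode("utf-8")

print("fits in US-ASCII:", can_encode(pasted, "ascii"))
print("fits in UTF-8:", can_encode(pasted, "utf-8"))
print("cp1252 bytes valid as UTF-8:", decodes_as_utf8(as_cp1252))
print("utf-8 bytes valid as UTF-8:", decodes_as_utf8(as_utf8))
```

The third check is the interesting one: bytes produced under one encoding are
generally not valid input for a consumer that expects UTF-8, which is the kind
of error a UTF-8 page avoids.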

-- 
 Daniel
 PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

