Re: utf-8 encoding problem
Mark and Joe,

Mark Thomas wrote:
> Joseph Shraibman wrote:
>> Mark Thomas wrote:
>>> request.setCharacterEncoding("UTF-8");
>>
>> Is this always safe? For responses I can (and do) check the accept-charset request [header], but I can't figure out how to tell what the request encoding should be.

Don't forget that Accept-Charset has nothing to do with the request: it's all about the list of charsets that are acceptable for the /response/ to the current request.

Setting the encoding of the response is sometimes necessary when the browser (stupidly, IMO) elects not to send the charset being used to the server.

-chris

To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Re: utf-8 encoding problem
Christopher Schultz wrote:
> Setting the encoding of the response is sometimes necessary when the browser (stupidly, IMO) elects not to send the charset being used to the server.

It isn't the browser's fault, it's the spec's fault. See https://bugzilla.mozilla.org/show_bug.cgi?id=289060#c8
Re: utf-8 encoding problem
Joe,

Joseph S wrote:
> Christopher Schultz wrote:
>> Setting the encoding of the response is sometimes necessary when the browser (stupidly, IMO) elects not to send the charset being used to the server.
>
> It isn't the browser's fault, it's the spec's fault. See https://bugzilla.mozilla.org/show_bug.cgi?id=289060#c8

Certainly, the specification doesn't help in this regard. I'm disappointed that things like this never get fixed in specifications. This question comes up all the time, and the solution is almost always to simply pick a charset and use it all the time, without question. But that's messy, and doesn't allow the client to make any choices about character encoding, etc. :(

-chris
Re: utf-8 encoding problem
Try this then - this is my standard character encoding index.jsp test.

    <%@ page contentType="text/html; charset=UTF-8" %>
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html>
      <head>
        <title>Character encoding test page</title>
      </head>
      <body>
        <p>Data posted to this form was:
        <%
          // Must run before the first getParameter() call on this request.
          request.setCharacterEncoding("UTF-8");
          // Echoed unescaped: fine for a local test page only.
          out.print(request.getParameter("mydata"));
        %>
        </p>
        <form method="post" action="index.jsp">
          <input type="text" name="mydata">
          <input type="submit" value="Submit" />
          <input type="reset" value="Reset" />
        </form>
      </body>
    </html>

To get the above working with GET, you'll need to make sure URIEncoding="UTF-8" has been set on the connector, as Nathan pointed out earlier.

Mark

Joseph S wrote:
> POST
>
> Mark Thomas wrote:
>> Joseph S wrote:
>>> When I did that my content displayed correctly, but on form submission it got corrupted.
>>
>> POST or GET?
>>
>> Mark
Re: utf-8 encoding problem
Mark Thomas wrote:
> request.setCharacterEncoding("UTF-8");

Is this always safe? For responses I can (and do) check the Accept-Charset request header, but I can't figure out how to tell what the request encoding should be.
Re: utf-8 encoding problem
This is an old problem. See
https://bugzilla.mozilla.org/show_bug.cgi?id=18643
https://bugzilla.mozilla.org/show_bug.cgi?id=7533

Firefox and MSIE use a magic _charset_ parameter, but I can't use it, because once I call request.getParameter("_charset_") I can't set the encoding after that! Anyway, it seems Firefox (and I assume IE) submits the form in whatever the page encoding was, so for all forms I serve up myself I'll just set the encoding to UTF-8 and assume it will come back as UTF-8.

Does Tomcat know anything about _charset_?

Joseph Shraibman wrote:
> Mark Thomas wrote:
>> request.setCharacterEncoding("UTF-8");
>
> Is this always safe? For responses I can (and do) check the Accept-Charset request header, but I can't figure out how to tell what the request encoding should be.
Re: utf-8 encoding problem
Joseph Shraibman wrote:
> This is an old problem. See
> https://bugzilla.mozilla.org/show_bug.cgi?id=18643
> https://bugzilla.mozilla.org/show_bug.cgi?id=7533
>
> Firefox and MSIE use a magic _charset_ parameter, but I can't use it, because once I call request.getParameter("_charset_") I can't set the encoding after that! Anyway, it seems Firefox (and I assume IE) submits the form in whatever the page encoding was, so for all forms I serve up myself I'll just set the encoding to UTF-8 and assume it will come back as UTF-8.
>
> Does Tomcat know anything about _charset_?

No.

Mark
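For what it's worth, the chicken-and-egg problem with _charset_ can in principle be sidestepped by looking at the raw, still percent-encoded form data instead of calling getParameter(). The following is only a sketch under some assumptions: CharsetHint is a made-up helper (nothing Tomcat provides), the form is assumed to contain a hidden field named _charset_ that the browser fills in, and the value is assumed to be a plain charset name that needs no percent-decoding.

```java
/** Hypothetical helper: not part of Tomcat or the Servlet API. */
final class CharsetHint {

    private CharsetHint() {}

    /**
     * Scans a raw application/x-www-form-urlencoded string (for example
     * "mydata=%E2%80%99&_charset_=UTF-8") for the _charset_ field.
     * Returns its value, or null when the field is missing or empty.
     * No decoding is done, so no character encoding has to be chosen first.
     */
    static String find(String rawForm) {
        if (rawForm == null) {
            return null;
        }
        for (String pair : rawForm.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // a name without a value cannot carry a hint
            }
            if (pair.substring(0, eq).equals("_charset_")) {
                String value = pair.substring(eq + 1);
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

For a GET this could be fed from request.getQueryString(), which returns the undecoded query. For a POST the hint sits in the request body, and reading the body yourself means the container can no longer parse the parameters for you, so this is probably only worth the trouble when you cannot control the encoding of the page that serves the form.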
Re: utf-8 encoding problem
Joseph Shraibman wrote:
> Mark Thomas wrote:
>> request.setCharacterEncoding("UTF-8");
>
> Is this always safe? For responses I can (and do) check the Accept-Charset request header, but I can't figure out how to tell what the request encoding should be.

It should be reasonable unless the user goes out of their way to do something different. In that case they deserve whatever they get.

Mark
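One way to make "reasonable" a bit more concrete is to honour a charset the client did declare in its Content-Type header and only fall back to UTF-8 when it declared none. Here is a minimal sketch of that decision in plain Java; RequestCharsets and fromContentType are names invented for this example, not an existing API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: not part of Tomcat or the Servlet API. */
final class RequestCharsets {

    private RequestCharsets() {}

    /**
     * Returns the charset named in a Content-Type header value, or UTF-8
     * when the header is absent, carries no charset parameter, or names a
     * charset this JVM does not support.
     */
    static Charset fromContentType(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        // Parameters follow the media type, separated by ';'.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            int eq = part.indexOf('=');
            if (eq < 0 || !part.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                continue;
            }
            String value = part.substring(eq + 1).trim();
            // The parameter value may be quoted: charset="UTF-8".
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported name: use the fallback.
                return StandardCharsets.UTF_8;
            }
        }
        return StandardCharsets.UTF_8;
    }
}
```

Inside a servlet or filter the same idea needs no header parsing: request.getCharacterEncoding() returns null when the client sent no charset, so calling request.setCharacterEncoding("UTF-8") only in that case leaves an explicitly declared encoding alone.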
RE: utf-8 encoding problem
A few things...

First, what type of apostrophe are you using? Are you using a typical ASCII apostrophe (') or are you using the Microsoft slanted apostrophe that comes out of Word documents (&#8242;)?

Here are two links that describe the problem:
http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
http://www.cs.tut.fi/~jkorpela/chars.html#win

Now, if after reading that you're still having issues, here is what needs to be done to get UTF-8 encoding to work.

If you're using mod_jk, make sure that the AJP connector is set up to decode URIs as UTF-8, like so:

    <Connector port="8009" enableLookups="false" redirectPort="8443"
               protocol="AJP/1.3" URIEncoding="UTF-8" />

Next, make sure that the request AND response have been set to use UTF-8 encoding. The request MUST have its character encoding set BEFORE any request parameters are read, or the request will fall back to the default encoding (ISO-8859-1 in Tomcat).

    import java.io.IOException;

    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;

    public class ContentTypeFilter implements Filter {

        private static org.apache.log4j.Logger log =
                org.apache.log4j.Logger.getLogger("tracking");

        public void init(FilterConfig config) {
        }

        public void destroy() {
        }

        public void doFilter(ServletRequest request,
                             ServletResponse response,
                             FilterChain filterChain)
                throws IOException, ServletException {
            // Must happen before anything reads a request parameter.
            request.setCharacterEncoding("UTF-8");

            // Assumes every response passing through this filter is HTML.
            response.setCharacterEncoding("UTF-8");
            response.setContentType("text/html;charset=UTF-8");

            filterChain.doFilter(request, response);
        }
    }

Finally, I would also set the meta header on the JSP page to UTF-8, just to be complete...

    <meta http-equiv="Content-Type" content="text/html;charset=utf-8">

Regards...
Original Message Follows
From: Joseph S [EMAIL PROTECTED]
Reply-To: Tomcat Users List users@tomcat.apache.org
To: Tomcat Users List users@tomcat.apache.org
Subject: utf-8 encoding problem
Date: Tue, 14 Aug 2007 22:24:28 -0400

My problem is this: One of my pages with an apostrophe was not displaying properly, so I added to my jsp:

    <%@ page contentType="text/html; charset=UTF-8"%>

When I did that my content displayed correctly, but on form submission it got corrupted. You can view the problem here: http://b.tupari.net/

One page displays correctly, but on submit the value gets mangled. The other page doesn't display correctly, but if you cut and paste into the form from the first page the apostrophe does come out correctly on submit. This happens in both firefox and konqueror. So who is to blame here? The web browsers? Tomcat? Apache?
Re: utf-8 encoding problem
Nathan Hook wrote:
> A few things...
>
> First, what type of apostrophe are you using? Are you using a typical ASCII apostrophe (') or are you using the Microsoft slanted apostrophe that comes out of Word documents (&#8242;)?

It's &#8217;

> Here are two links that describe the problem:
> http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
> http://www.cs.tut.fi/~jkorpela/chars.html#win

That basically says that some Windows chars don't display properly. That isn't my problem. It displays properly when I set the char encoding to UTF-8. My question is: why doesn't it submit properly if the original page was sent as UTF-8, but does submit properly if the original page was ISO-8859-1?

> If you're using mod_jk, make sure that the AJP connector is set up to decode URIs as UTF-8, like so:
>
>     <Connector port="8009" enableLookups="false" redirectPort="8443"
>                protocol="AJP/1.3" URIEncoding="UTF-8" />
>
> Next, make sure that the request AND response have been set to use UTF-8 encoding.

Aren't all requests submitted as application/x-www-form-urlencoded, which is an encoded form of Unicode?
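On that last question: application/x-www-form-urlencoded is not itself a Unicode encoding. It percent-escapes the bytes of whatever charset the browser used for the form (normally the page's charset), and the request does not have to say which one that was. The sketch below shows the effect with the JDK's own URLEncoder and URLDecoder (the Charset overloads need Java 10 or later); the FormEncodingDemo class and its submit/receive methods are made up for illustration and only stand in for the browser and the server.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Illustration only: stands in for a browser and a server. */
final class FormEncodingDemo {

    private FormEncodingDemo() {}

    /** What the browser does: percent-escape the bytes of the page charset. */
    static String submit(String text, Charset pageCharset) {
        return URLEncoder.encode(text, pageCharset);
    }

    /** What the server does: turn the escaped bytes back into characters. */
    static String receive(String wire, Charset requestCharset) {
        return URLDecoder.decode(wire, requestCharset);
    }

    public static void main(String[] args) {
        String apostrophe = "\u2019"; // the &#8217; character from the thread

        // A UTF-8 page sends the three UTF-8 bytes of U+2019.
        String wire = submit(apostrophe, StandardCharsets.UTF_8);
        System.out.println(wire); // %E2%80%99

        // Decoded as UTF-8, the original character comes back.
        System.out.println(receive(wire, StandardCharsets.UTF_8).equals(apostrophe)); // true

        // Decoded as ISO-8859-1, each byte becomes its own character.
        System.out.println(receive(wire, StandardCharsets.ISO_8859_1).length()); // 3
    }
}
```

The last line is the corruption described at the start of the thread: the bytes arrive intact, but the server reads them in the wrong charset unless request.setCharacterEncoding("UTF-8") is called before the parameters are parsed.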
Re: utf-8 encoding problem
Joseph S wrote:
> When I did that my content displayed correctly, but on form submission it got corrupted.

POST or GET?

Mark
Re: utf-8 encoding problem
POST

Mark Thomas wrote:
> Joseph S wrote:
>> When I did that my content displayed correctly, but on form submission it got corrupted.
>
> POST or GET?
>
> Mark
utf-8 encoding problem
My problem is this: One of my pages with an apostrophe was not displaying properly, so I added to my jsp:

    <%@ page contentType="text/html; charset=UTF-8"%>

When I did that my content displayed correctly, but on form submission it got corrupted. You can view the problem here: http://b.tupari.net/

One page displays correctly, but on submit the value gets mangled. The other page doesn't display correctly, but if you cut and paste into the form from the first page the apostrophe does come out correctly on submit. This happens in both firefox and konqueror. So who is to blame here? The web browsers? Tomcat? Apache?