Re: utf-8 encoding problem

2007-08-17 Thread Christopher Schultz

Mark and Joe,

Mark Thomas wrote:
> Joseph Shraibman wrote:
>> Mark Thomas wrote:
>>>
>>> request.setCharacterEncoding("UTF-8");
>>
>> Is this always safe?  For responses I can (and do) check the
>> accept-charset request [header], but I can't figure out how to tell
>> what the request encoding should be.

Don't forget that Accept-Charset has nothing to do with the request:
it's all about the list of charsets that are acceptable for the
/response/ to the current request.

Setting the encoding of the response is sometimes necessary when the
browser (stupidly, IMO) elects not to send the charset being used to the
server.
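
When the client does not declare a charset, ServletRequest.getCharacterEncoding() returns null and the application has to pick one. A minimal sketch of that decision, written as a plain helper so it runs without a servlet container; the class and method names are hypothetical and not part of Tomcat or the Servlet API:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class RequestCharset {

    /**
     * Returns the charset named in a Content-Type header value, or the
     * fallback when the header carries no usable charset parameter.
     */
    static Charset fromContentType(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        // Parameters follow the media type, separated by semicolons.
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i].trim();
            if (part.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = part.substring(8).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Unknown or malformed charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Typical form POST: no charset parameter, so the fallback wins.
        System.out.println(fromContentType(
                "application/x-www-form-urlencoded", StandardCharsets.UTF_8));
        // An explicitly declared charset is honoured.
        System.out.println(fromContentType(
                "text/plain; charset=ISO-8859-1", StandardCharsets.UTF_8));
    }
}
```

Inside a filter the same idea would reduce to calling request.setCharacterEncoding("UTF-8") only when request.getCharacterEncoding() returns null, so that a client which does declare a charset is still respected.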

-chris

-
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: utf-8 encoding problem

2007-08-17 Thread Joseph S



Christopher Schultz wrote:
> Setting the encoding of the response is sometimes necessary when the
> browser (stupidly, IMO) elects not to send the charset being used to the
> server.

It isn't the browser's fault, it's the spec's fault. See
https://bugzilla.mozilla.org/show_bug.cgi?id=289060#c8





Re: utf-8 encoding problem

2007-08-17 Thread Christopher Schultz

Joe,

Joseph S wrote:
> Christopher Schultz wrote:
>> Setting the encoding of the response is sometimes necessary when the
>> browser (stupidly, IMO) elects not to send the charset being used to the
>> server.
>
> It isn't the browser's fault, it's the spec's fault. See
> https://bugzilla.mozilla.org/show_bug.cgi?id=289060#c8

Certainly, the specification doesn't help in this regard. I'm
disappointed that things like this never get fixed in specifications.
This question comes up all the time, and the solution is almost always
to simply pick a charset and use it all the time, without question. But
that's messy, and doesn't allow the client to make any choices about
character encoding, etc. :(

-chris



Re: utf-8 encoding problem

2007-08-16 Thread Mark Thomas
Try this then - this is my standard character encoding index.jsp test.

<%@ page contentType="text/html; charset=UTF-8" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <title>Character encoding test page</title>
  </head>
  <body>
    <p>Data posted to this form was:
    <%
      request.setCharacterEncoding("UTF-8");
      out.print(request.getParameter("mydata"));
    %>

    </p>
    <form method="post" action="index.jsp">
      <input type="text" name="mydata">
      <input type="submit" value="Submit" />
      <input type="reset" value="Reset" />
    </form>
  </body>
</html>

To get the above working with GET, you'll need to make sure
URIEncoding="UTF-8" has been set on the connector as Nathan pointed
out earlier.

Mark

Joseph S wrote:
> POST
>
> Mark Thomas wrote:
>> Joseph S wrote:
>>> When I did that my content displayed correctly, but on form submission
>>> it got corrupted.
>>
>> POST or GET?
>>
>> Mark





Re: utf-8 encoding problem

2007-08-16 Thread Joseph Shraibman

Mark Thomas wrote:
>   request.setCharacterEncoding("UTF-8");

Is this always safe?  For responses I can (and do) check the
accept-charset request parameter, but I can't figure out how to tell
what the request encoding should be.





Re: utf-8 encoding problem

2007-08-16 Thread Joseph Shraibman
This is an old problem.  See 
https://bugzilla.mozilla.org/show_bug.cgi?id=18643

https://bugzilla.mozilla.org/show_bug.cgi?id=7533

Firefox and MSIE use a magic _charset_ parameter, but I can't use it
because if I call request.getParameter("_charset_") I can't set the
encoding after that!


Anyway, it seems Firefox (and I assume IE) submits the form in whatever
encoding the page was in, so for all forms I serve up myself I'll just set
the encoding to UTF-8 and assume it will come back as UTF-8.


Does Tomcat know anything about _charset_ ?

Joseph Shraibman wrote:
> Mark Thomas wrote:
>>   request.setCharacterEncoding("UTF-8");
>
> Is this always safe?  For responses I can (and do) check the
> accept-charset request parameter, but I can't figure out how to tell
> what the request encoding should be.





Re: utf-8 encoding problem

2007-08-16 Thread Mark Thomas
Joseph Shraibman wrote:
> This is an old problem.  See
> https://bugzilla.mozilla.org/show_bug.cgi?id=18643
> https://bugzilla.mozilla.org/show_bug.cgi?id=7533
>
> Firefox and MSIE use a magic _charset_ parameter, but I can't use it
> because if I call request.getParameter("_charset_") I can't set the
> encoding after that!
>
> Anyway, it seems Firefox (and I assume IE) submits the form in whatever
> encoding the page was in, so for all forms I serve up myself I'll just set
> the encoding to UTF-8 and assume it will come back as UTF-8.
>
> Does Tomcat know anything about _charset_ ?

No.

Mark
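
Since the container does not interpret _charset_, an application that wants to honour it has to look at the raw form data before any parameter is decoded. A rough sketch of that idea, assuming the raw urlencoded body is already available as a string; the class and method names are hypothetical, and how the body is obtained is deliberately left out:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not a Tomcat or Servlet API class.
class CharsetFieldParser {

    private static final String FIELD = "_charset_=";

    /** Returns the supported charset named by the _charset_ field, else the fallback. */
    static String detectCharset(String rawBody, String fallback) {
        for (String pair : rawBody.split("&")) {
            if (pair.startsWith(FIELD)) {
                String name = pair.substring(FIELD.length());
                try {
                    if (Charset.isSupported(name)) {
                        return name;
                    }
                } catch (IllegalArgumentException e) {
                    // Empty or syntactically illegal charset name: ignore it.
                }
            }
        }
        return fallback;
    }

    /** Decodes every name=value pair using the detected charset. */
    static Map<String, String> parse(String rawBody, String fallback) {
        String charset = detectCharset(rawBody, fallback);
        Map<String, String> params = new LinkedHashMap<String, String>();
        for (String pair : rawBody.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name, charset), decode(value, charset));
        }
        return params;
    }

    private static String decode(String text, String charset) {
        try {
            return URLDecoder.decode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The browser filled in _charset_, so the apostrophe decodes cleanly.
        Map<String, String> params =
                parse("mydata=%E2%80%99&_charset_=UTF-8", "ISO-8859-1");
        System.out.println(params.get("mydata").equals("\u2019"));
    }
}
```

In a real servlet, reading the request body yourself generally means the container will no longer parse the POST parameters for you, which is one reason simply fixing the encoding to UTF-8 for your own forms is the more common choice.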





Re: utf-8 encoding problem

2007-08-16 Thread Mark Thomas
Joseph Shraibman wrote:
> Mark Thomas wrote:
>> request.setCharacterEncoding("UTF-8");
>
> Is this always safe?  For responses I can (and do) check the
> accept-charset request parameter, but I can't figure out how to tell
> what the request encoding should be.

It should be reasonable unless the user goes out of their way to do
something different. In that case they deserve whatever they get.

Mark





RE: utf-8 encoding problem

2007-08-15 Thread Nathan Hook

A few things...

First, what type of apostrophe are you using?  Are you using a typical ASCII
apostrophe (') or are you using the Microsoft slanted apostrophe that comes
out of Word documents (&#8242;)?


Here are two links that describe the problem:

http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
http://www.cs.tut.fi/~jkorpela/chars.html#win

Now, if you're still having issues after reading that, here is what needs
to be done to get UTF-8 encoding to work.


If you're using mod_jk, make sure that the AJP connector is set up to encode
using UTF-8 like so:


<Connector port="8009" enableLookups="false" redirectPort="8443"
           protocol="AJP/1.3" URIEncoding="UTF-8" />



Next, make sure that the request AND response have been set to use UTF-8
encoding.  The request MUST have its character encoding set BEFORE any
request parameters are read, or the container will fall back to its default
encoding (ISO-8859-1 under the Servlet specification).


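import java.io.IOException;

import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class ContentTypeFilter implements Filter
{
  private static org.apache.log4j.Logger log =
      org.apache.log4j.Logger.getLogger("tracking");

  public void init(FilterConfig config)
  {
  }

  public void destroy()
  {
  }

  public void doFilter(ServletRequest request, ServletResponse response,
      FilterChain filterChain) throws IOException, ServletException
  {
    // Must run before anything reads a request parameter.
    // No cast is needed: setCharacterEncoding is declared on ServletRequest.
    request.setCharacterEncoding("UTF-8");

    response.setCharacterEncoding("UTF-8");
    response.setContentType("text/html;charset=UTF-8");

    filterChain.doFilter(request, response);
  }
}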

Finally, I would also set the meta header on the jsp page to be utf-8 just
to be complete...


<meta http-equiv="Content-Type" content="text/html;charset=utf-8">

Regards...

Original Message Follows
From: Joseph S [EMAIL PROTECTED]
Reply-To: Tomcat Users List users@tomcat.apache.org
To: Tomcat Users List users@tomcat.apache.org
Subject: utf-8 encoding problem
Date: Tue, 14 Aug 2007 22:24:28 -0400

My problem is this:

One of my pages with an apostrophe was not displaying properly, so I added 
to my jsp:


<%@ page contentType="text/html; charset=UTF-8"%>

When I did that my content displayed correctly, but on form submission it 
got corrupted.


You can view the problem here:

http://b.tupari.net/

One page displays correctly, but on submit the value gets mangled.  The 
other page doesn't display correctly, but if you cut and paste into the form 
from the first page the apostrophe does come out correctly on submit.


This happens in both firefox and konqueror.  So who is to blame here? The 
web browsers?  Tomcat?  Apache?





Re: utf-8 encoding problem

2007-08-15 Thread Joseph Shraibman



Nathan Hook wrote:
> A few things...
>
> First, what type of apostrophe are you using?  Are you using a typical
> ASCII apostrophe (') or are you using the Microsoft slanted apostrophe
> that comes out of Word documents (&#8242;)?

It's &#8217;
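
For reference, character 8217 is U+2019 (RIGHT SINGLE QUOTATION MARK) and 8242 is U+2032 (PRIME); neither exists in ISO-8859-1. A small sketch that prints the bytes each candidate apostrophe turns into; the class name is made up for illustration:

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: shows the bytes behind each candidate apostrophe.
class ApostropheBytes {

    /** Formats a byte array as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ascii = "'";       // U+0027, plain ASCII apostrophe
        String curly = "\u2019";  // character reference 8217
        String prime = "\u2032";  // character reference 8242

        System.out.println(hex(ascii.getBytes(StandardCharsets.UTF_8)));  // 27
        System.out.println(hex(curly.getBytes(StandardCharsets.UTF_8)));  // E2 80 99
        System.out.println(hex(prime.getBytes(StandardCharsets.UTF_8)));  // E2 80 B2

        // ISO-8859-1 cannot represent U+2019, so Java substitutes '?'.
        System.out.println(hex(curly.getBytes(StandardCharsets.ISO_8859_1)));  // 3F
    }
}
```

The three-byte UTF-8 form is what gets mangled when the receiving side decodes it one byte at a time as ISO-8859-1.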


> Here are two links that describe the problem:
>
> http://www.cs.tut.fi/~jkorpela/www/windows-chars.html
> http://www.cs.tut.fi/~jkorpela/chars.html#win

That basically says that some Windows chars don't display properly.
That isn't my problem.  It displays properly when I set the char
encoding to UTF-8.  My question is: why doesn't it submit properly if the
original page was sent as UTF-8, but does submit properly if the original
page was ISO-8859-1?


> If you're using mod_jk make sure that the ajp connector is set up to
> encode using utf-8 like so:
>
> <Connector port="8009" enableLookups="false" redirectPort="8443"
>            protocol="AJP/1.3" URIEncoding="UTF-8" />
>
> Next, make sure that the request AND response have been set to use utf
> encoding.


Aren't all requests submitted as application/x-www-form-urlencoded which 
is an encoded form of unicode?
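
On that last question: application/x-www-form-urlencoded only percent-escapes bytes; it does not say which charset those bytes are in. The browser picks the charset (in practice the page's encoding, as observed later in this thread), and the server has to decode with the same one. A minimal sketch of the mismatch using only java.net; the class and method names are illustrative:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative only: the same percent-encoded bytes decoded two ways.
class FormDecodingDemo {

    static String encode(String text, String charset) {
        try {
            return URLEncoder.encode(text, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    static String decode(String wire, String charset) {
        try {
            return URLDecoder.decode(wire, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String apostrophe = "\u2019";

        // What a browser puts on the wire for a form on a UTF-8 page.
        String wire = encode(apostrophe, "UTF-8");
        System.out.println(wire);  // %E2%80%99

        // Decoding with the matching charset restores the character.
        System.out.println(decode(wire, "UTF-8").equals(apostrophe));  // true

        // Decoding as ISO-8859-1 turns each byte into its own character.
        System.out.println(decode(wire, "ISO-8859-1").length());  // 3
    }
}
```

That mismatch is what request.setCharacterEncoding("UTF-8") avoids for POST bodies, provided it is called before the first parameter is read.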






Re: utf-8 encoding problem

2007-08-15 Thread Mark Thomas
Joseph S wrote:
> When I did that my content displayed correctly, but on form submission
> it got corrupted.

POST or GET?

Mark





Re: utf-8 encoding problem

2007-08-15 Thread Joseph S

POST

Mark Thomas wrote:
> Joseph S wrote:
>> When I did that my content displayed correctly, but on form submission
>> it got corrupted.
>
> POST or GET?
>
> Mark





utf-8 encoding problem

2007-08-14 Thread Joseph S

My problem is this:

One of my pages with an apostrophe was not displaying properly, so I 
added to my jsp:


<%@ page contentType="text/html; charset=UTF-8"%>

When I did that my content displayed correctly, but on form submission 
it got corrupted.


You can view the problem here:

http://b.tupari.net/

One page displays correctly, but on submit the value gets mangled.  The 
other page doesn't display correctly, but if you cut and paste into the 
form from the first page the apostrophe does come out correctly on submit.


This happens in both firefox and konqueror.  So who is to blame here? 
The web browsers?  Tomcat?  Apache?

