On 5/29/07, Adam Gordon <[EMAIL PROTECTED]> wrote:
We're using the html:form element with the acceptCharset attribute set
to "UTF-8". When we enter a Unicode character in a textarea in this
form (in this case \u2022, the bullet symbol: •), it displays correctly.
However, when the form is submitted and I print out the bean property
for this textarea, I find that Struts has broken the Unicode character
up into 3 separate characters: â¢.
We have the Content-Type meta tag set to the UTF-8 charset on the JSP,
and we've also set contentType in the @page directive to
"text/html; charset=UTF-8". We've found a workaround, but it's totally
a hack: we basically remove all encodings and then reconstruct the
string from a byte array.
Anyone have an idea as to what might be going on and why Struts appears
to be mucking with the encoding? Thanks.
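
A minimal sketch of what is probably happening, using only the JDK. It
assumes the container decoded the UTF-8 request bytes as ISO-8859-1 (the
servlet default when no request encoding is set). The class and method
names are hypothetical, and repair() is only a guess at the kind of
byte-array workaround described above, not the poster's actual code.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode the UTF-8 bytes of `text` as ISO-8859-1. This mimics a
    // container that falls back to its default request encoding.
    static String misdecode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the damage: recover the raw bytes, then decode them as UTF-8.
    // Assumption: the garbled string really came from an ISO-8859-1 decode.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String bullet = "\u2022";
        String garbled = misdecode(bullet);
        // One character in, one code unit out per UTF-8 byte.
        System.out.println("garbled length: " + garbled.length());
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X%n", (int) c);
        }
        System.out.println("repaired ok: " + repair(garbled).equals(bullet));
    }
}
```

The bullet is three bytes in UTF-8 (E2 80 A2), so a single-byte decode
yields three characters. The middle one is a control character, which is
why only two of them are visible in the output quoted above.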
Struts doesn't do anything with the encoding. I imagine that if you
read the parameter value directly from the request, you would see the
same issue. You could try setting the encoding on the request with
setCharacterEncoding("UTF-8"), but you need to do that before the
request is processed, i.e. before any parameter is read (either in the
request processor in Struts 1.2.x, or by inserting a Command into the
chain for 1.3.x):
http://java.sun.com/j2ee/1.4/docs/tutorial/doc/WebI18N5.html
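
To illustrate why the request encoding matters, here is a small JDK-only
sketch. It uses java.net.URLDecoder as a stand-in for the container's
parameter parsing, which is an assumption: real containers have their
own parsers. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodeDemo {
    // Decode one percent-encoded form value with the named charset,
    // standing in for what the container does with the POST body.
    static String decodeValue(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // A browser submitting U+2022 from a UTF-8 page sends these bytes.
        String wire = "%E2%80%A2";
        String asLatin1 = decodeValue(wire, "ISO-8859-1");
        String asUtf8 = decodeValue(wire, "UTF-8");
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
        System.out.println("UTF-8 length: " + asUtf8.length());
        System.out.println("UTF-8 is bullet: " + asUtf8.equals("\u2022"));
    }
}
```

The bytes on the wire are the same in both cases; only the charset used
to decode them differs. Calling setCharacterEncoding("UTF-8") on the
request before the first getParameter() call is what selects the second
behaviour for the request body.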
If that doesn't work, then I would look at the servlet container you're
using. For example, Tomcat has the following FAQ:
http://tomcat.apache.org/faq/misc.html#tomcat5CharEncoding
Niall
--adam