Re: Problems with utf-8 encoding
Yair Zohar wrote: Guy Katz wrote: google it. there's a lot. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 11:08 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Guy Katz wrote: put an encoding filter in front of your servlet/jsp's that sets a UTF-8 encoding for incoming requests and outgoing responses. its your safest bet for tomcat 4 as far as i remember. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 9:43 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Anto Paul wrote: On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: Hello, I'm using Tomcat 4.1.18 I'm trying to read hebrew data in utf-8 encoding from the database. As a check I entered a utf-8 encoded 'alef' letter to the database field. (I see it in the database as one letter 'alef'). The jsp page that displays the data, prints two chars instead of one. I checked the values of these chars and they are 215 114, which are the utf-8 combination to create the letter 'alef' (so I was told). jps code: <%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> class="tablesHandler.TableViewer" /> <% request.setCharacterEncoding("UTF-8");%> Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. Thanks for replying, It didn't fix the problem, I still see the same two chars. Yair. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] Guy, can you direct me to practical documentation on implementing such a filter ? 
Hi again,

I implemented the SetCharacterEncodingFilter from the Tomcat 4 examples. To check how much control I have over the character encoding of the request and response, I changed the doFilter method to:

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        request.setCharacterEncoding("UTF-8");
        System.out.println("Request: " + request.getCharacterEncoding());
        response.setContentType("text/html; charset=UTF-8");
        System.out.println("Response: " + response.getCharacterEncoding());
        // Pass control on to the next filter
        chain.doFilter(request, response);
    }

request.getCharacterEncoding() returns null. It also returns null if I comment out the request.setCharacterEncoding("UTF-8") call (my page contains UTF-8 encoding directives). The response, however, is set to UTF-8. Can this explain my problem with UTF-8 encoding? Does anybody know how to solve it?

The filter does the work; it solved the problem. Thanks to all who helped,
Yair.
Re: Problems with utf-8 encoding
Guy Katz wrote: google it. there's a lot. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 11:08 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Guy Katz wrote: put an encoding filter in front of your servlet/jsp's that sets a UTF-8 encoding for incoming requests and outgoing responses. its your safest bet for tomcat 4 as far as i remember. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 9:43 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Anto Paul wrote: On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: Hello, I'm using Tomcat 4.1.18 I'm trying to read hebrew data in utf-8 encoding from the database. As a check I entered a utf-8 encoded 'alef' letter to the database field. (I see it in the database as one letter 'alef'). The jsp page that displays the data, prints two chars instead of one. I checked the values of these chars and they are 215 114, which are the utf-8 combination to create the letter 'alef' (so I was told). jps code: <%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> <% request.setCharacterEncoding("UTF-8");%> Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. Thanks for replying, It didn't fix the problem, I still see the same two chars. Yair. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] Guy, can you direct me to practical documentation on implementing such a filter ? 
Hi again,

I implemented the SetCharacterEncodingFilter from the Tomcat 4 examples. To check how much control I have over the character encoding of the request and response, I changed the doFilter method to:

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        request.setCharacterEncoding("UTF-8");
        System.out.println("Request: " + request.getCharacterEncoding());
        response.setContentType("text/html; charset=UTF-8");
        System.out.println("Response: " + response.getCharacterEncoding());
        // Pass control on to the next filter
        chain.doFilter(request, response);
    }

request.getCharacterEncoding() returns null. It also returns null if I comment out the request.setCharacterEncoding("UTF-8") call (my page contains UTF-8 encoding directives). The response, however, is set to UTF-8. Can this explain my problem with UTF-8 encoding? Does anybody know how to solve it?
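A note on the null reported in this message: getCharacterEncoding() reflects the charset declared for the request, and the Servlet specification says that when none has been specified the container must parse POST data as ISO-8859-1. Why the call still printed null after setCharacterEncoding in this particular Tomcat 4.1.18 setup is not explained in the thread. The sketch below only models the fallback rule with plain JDK classes; the class and method names are made up for illustration and are not part of the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestEncodingFallbackDemo {
    /** Models the spec rule: no declared encoding (null) means ISO-8859-1. */
    static Charset effectiveCharset(String declaredEncoding) {
        return declaredEncoding == null
                ? StandardCharsets.ISO_8859_1
                : Charset.forName(declaredEncoding);
    }

    /** Decodes raw request bytes the way a container would under that rule. */
    static String decodeBody(byte[] body, String declaredEncoding) {
        return new String(body, effectiveCharset(declaredEncoding));
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef
        byte[] body = alef.getBytes(StandardCharsets.UTF_8);

        // Nothing declared: each UTF-8 byte becomes its own char.
        System.out.println(decodeBody(body, null).length());

        // UTF-8 declared before decoding: the letter survives as one char.
        System.out.println(decodeBody(body, "UTF-8").equals(alef));
    }
}
```

This is why a filter that sets the encoding before anything reads the parameters can make a difference: the choice of charset is made once, at decode time.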
RE: Problems with utf-8 encoding
google it. there's a lot. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 11:08 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Guy Katz wrote: >put an encoding filter in front of your servlet/jsp's that sets a UTF-8 >encoding for incoming requests and outgoing responses. its your safest bet for >tomcat 4 as far as i remember. > >-Original Message- >From: Yair Zohar [mailto:[EMAIL PROTECTED] >Sent: Monday, September 19, 2005 9:43 AM >To: Tomcat Users List >Subject: Re: Problems with utf-8 encoding > > >Anto Paul wrote: > > > >>On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: >> >> >> >> >>>Hello, >>>I'm using Tomcat 4.1.18 >>>I'm trying to read hebrew data in utf-8 encoding from the database. As a >>>check I entered a utf-8 encoded 'alef' letter to the database field. >>>(I see it in the database as one letter 'alef'). The jsp page that >>>displays the data, prints two chars instead of one. I checked the values >>>of these chars and >>>they are 215 114, which are the utf-8 combination to create the letter >>>'alef' (so I was told). >>> >>>jps code: >>> >>><%@ page language="java" contentType="text/html;charset=UTF-8" >>>pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> >>> >>> >>> >>> >>><% request.setCharacterEncoding("UTF-8");%> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> >>Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. >> >> >> >> >> >Thanks for replying, >It didn't fix the problem, I still see the same two chars. >Yair. > > >- >To unsubscribe, e-mail: [EMAIL PROTECTED] >For additional commands, e-mail: [EMAIL PROTECTED] > > > > > Guy, can you direct me to practical documentation on implementing such a filter ? - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Problems with utf-8 encoding
Guy Katz wrote: put an encoding filter in front of your servlet/jsp's that sets a UTF-8 encoding for incoming requests and outgoing responses. its your safest bet for tomcat 4 as far as i remember. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 9:43 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Anto Paul wrote: On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: Hello, I'm using Tomcat 4.1.18 I'm trying to read hebrew data in utf-8 encoding from the database. As a check I entered a utf-8 encoded 'alef' letter to the database field. (I see it in the database as one letter 'alef'). The jsp page that displays the data, prints two chars instead of one. I checked the values of these chars and they are 215 114, which are the utf-8 combination to create the letter 'alef' (so I was told). jps code: <%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> <% request.setCharacterEncoding("UTF-8");%> Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. Thanks for replying, It didn't fix the problem, I still see the same two chars. Yair. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] Guy, can you direct me to practical documentation on implementing such a filter ?
Re: Problems with utf-8 encoding - continue
Jilles van Gurp wrote:

> Why aren't you using setContentType("text/html; charset=UTF-8") on the response?

As I use JSP, I don't know how I can control the response that way.

> What content-type is the server actually returning (use the Live HTTP Headers extension for Firefox or something similar to find out)?

I couldn't find out whether this extension is installed, or how to install it. The page info shows:
Type: text/html
Encoding: UTF-8
Meta: Content-Type text/html; charset=UTF-8

> What database and JDBC driver are you using? What method are you using to store the string in the database?

I use MySQL 4.1.14 + Connector/J 3.1.10. The URL for the driver is:
"jdbc:mysql://"+Utils.getServerName()+":3306/"+Utils.getDatabaseName()+"?characterEncoding=UTF-8&characterSetResults=UTF-8"
The table definitions use: ENGINE=MyISAM DEFAULT CHARSET=utf8

> I've had UTF-8 trouble with several databases. For example, MySQL 4.1 + the latest JDBC driver + setCharacterStream had some strange effects. First of all, you need to tell MySQL to use UTF-8 (it defaults to something else), and even if you do that, setCharacterStream has some issues that go away if you use setString. Oracle, on the other hand, cannot insert strings larger than 4KB with setString, so you need to use setCharacterStream. Incidentally, the MySQL driver's setCharacterStream is implemented using setString!

I use the Statement class's executeUpdate(String str) method to update and executeQuery(String str) to query the database.

> Regards, Jilles
>
> Yair Zohar wrote:
>> sorry for the double mail, I forgot to add my server.xml encoding definitions:
>> port="8080" URIEncoding="UTF-8" useBodyEncodingForURI="true" minProcessors="5" maxProcessors="75" enableLookups="true" redirectPort="8443" acceptCount="100" debug="0" connectionTimeout="2" useURIValidationHack="false" disableUploadTimeout="true" />
>> I tried it with and without the useBodyEncodingForURI="true" directive.
>> Yair.
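For readers comparing the two approaches mentioned in this exchange (SQL passed as a string to Statement versus setString on a bind parameter), here is a minimal sketch. It assumes MySQL Connector/J and its useUnicode, characterEncoding and characterSetResults URL properties; the host, database, table name `letters` and column `name` are hypothetical placeholders, and nothing in the thread shows that this change alone would have fixed the problem.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class Utf8JdbcSketch {
    /** Builds a Connector/J URL asking the driver to use UTF-8 (host and database are placeholders). */
    static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + ":3306/" + database
                + "?useUnicode=true&characterEncoding=UTF-8&characterSetResults=UTF-8";
    }

    /**
     * Inserts one string through a bind parameter, so the driver handles the
     * encoding instead of the text being concatenated into the SQL.
     * The table "letters" and column "name" are hypothetical.
     */
    static int insertLetter(Connection connection, String letter) throws SQLException {
        String sql = "INSERT INTO letters (name) VALUES (?)";
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setString(1, letter);
            return statement.executeUpdate();
        }
    }

    public static void main(String[] args) {
        // Only prints the URL; opening a connection needs a driver and a running server.
        System.out.println(buildUrl("localhost", "test"));
    }
}
```

To use insertLetter, obtain a Connection from DriverManager.getConnection with the URL above plus credentials, then pass it in with the string to store.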
RE: Problems with utf-8 encoding
put an encoding filter in front of your servlet/jsp's that sets a UTF-8 encoding for incoming requests and outgoing responses. its your safest bet for tomcat 4 as far as i remember. -Original Message- From: Yair Zohar [mailto:[EMAIL PROTECTED] Sent: Monday, September 19, 2005 9:43 AM To: Tomcat Users List Subject: Re: Problems with utf-8 encoding Anto Paul wrote: >On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: > > >>Hello, >>I'm using Tomcat 4.1.18 >>I'm trying to read hebrew data in utf-8 encoding from the database. As a >>check I entered a utf-8 encoded 'alef' letter to the database field. >>(I see it in the database as one letter 'alef'). The jsp page that >>displays the data, prints two chars instead of one. I checked the values >>of these chars and >>they are 215 114, which are the utf-8 combination to create the letter >>'alef' (so I was told). >> >>jps code: >> >><%@ page language="java" contentType="text/html;charset=UTF-8" >>pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> >> >> >> >> >><% request.setCharacterEncoding("UTF-8");%> >> >> >> >> >> >> >> >> >> > >Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. > > > Thanks for replying, It didn't fix the problem, I still see the same two chars. Yair. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Problems with utf-8 encoding - continue
Jilles van Gurp wrote:

> Oracle on the other hand cannot insert strings larger than 4KB with setString so you need to use setCharacterStream.

FYI: This is "common knowledge" that used to be right but isn't anymore. With the Oracle 10g JDBC driver you can set arbitrary-length strings with setString:
http://www.oracle.com/technology/sample_code/tech/java/codesnippet/jdbc/clob10g/handlingclobsinoraclejdbc10g.html
Re: Problems with utf-8 encoding
Does your browser supports hebrew ?. If you are just getting the data from database and displaying, it should work fine. What database and JDBC driver you are using ?. On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: > Anto Paul wrote: > > >On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: > > > > > >>Hello, > >>I'm using Tomcat 4.1.18 > >>I'm trying to read hebrew data in utf-8 encoding from the database. As a > >>check I entered a utf-8 encoded 'alef' letter to the database field. > >>(I see it in the database as one letter 'alef'). The jsp page that > >>displays the data, prints two chars instead of one. I checked the values > >>of these chars and > >>they are 215 114, which are the utf-8 combination to create the letter > >>'alef' (so I was told). > >> > >>jps code: > >> > >><%@ page language="java" contentType="text/html;charset=UTF-8" > >>pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> > >> > >> > >> > >> > >><% request.setCharacterEncoding("UTF-8");%> > >> > >> > >> > >> > >> > >> > >> > >> > >> > > > >Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. > > > > > > > Thanks for replying, > It didn't fix the problem, I still see the same two chars. > Yair. > > > -- rgds Anto Paul - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Problems with utf-8 encoding - continue
Why aren't you using setContentType("text/html; charset=UTF-8") on the response? What content-type is the server actually returning (use the Live HTTP Headers extension for Firefox or something similar to find out)?

What database and JDBC driver are you using? What method are you using to store the string in the database? I've had UTF-8 trouble with several databases. For example, MySQL 4.1 + the latest JDBC driver + setCharacterStream had some strange effects. First of all, you need to tell MySQL to use UTF-8 (it defaults to something else), and even if you do that, setCharacterStream has some issues that go away if you use setString. Oracle, on the other hand, cannot insert strings larger than 4KB with setString, so you need to use setCharacterStream. Incidentally, the MySQL driver's setCharacterStream is implemented using setString!

Regards, Jilles

Yair Zohar wrote:
> sorry for the double mail, I forgot to add my server.xml encoding definitions:
> port="8080" URIEncoding="UTF-8" useBodyEncodingForURI="true" minProcessors="5" maxProcessors="75" enableLookups="true" redirectPort="8443" acceptCount="100" debug="0" connectionTimeout="2" useURIValidationHack="false" disableUploadTimeout="true" />
> I tried it with and without the useBodyEncodingForURI="true" directive.
> Yair.
Re: Problems with utf-8 encoding
Anto Paul wrote: On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: Hello, I'm using Tomcat 4.1.18 I'm trying to read hebrew data in utf-8 encoding from the database. As a check I entered a utf-8 encoded 'alef' letter to the database field. (I see it in the database as one letter 'alef'). The jsp page that displays the data, prints two chars instead of one. I checked the values of these chars and they are 215 114, which are the utf-8 combination to create the letter 'alef' (so I was told). jps code: <%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> <% request.setCharacterEncoding("UTF-8");%> Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. Thanks for replying, It didn't fix the problem, I still see the same two chars. Yair.
Re: Problems with utf-8 encoding
On 9/19/05, Yair Zohar <[EMAIL PROTECTED]> wrote: > > Hello, > I'm using Tomcat 4.1.18 > I'm trying to read hebrew data in utf-8 encoding from the database. As a > check I entered a utf-8 encoded 'alef' letter to the database field. > (I see it in the database as one letter 'alef'). The jsp page that > displays the data, prints two chars instead of one. I checked the values > of these chars and > they are 215 114, which are the utf-8 combination to create the letter > 'alef' (so I was told). > > jps code: > > <%@ page language="java" contentType="text/html;charset=UTF-8" > pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> > > > > > <% request.setCharacterEncoding("UTF-8");%> > > > > > > > Move <% request.setCharacterEncoding("UTF-8");%> to before jsp:useBean tag. -- rgds Anto Paul - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Problems with utf-8 encoding - continue
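Sorry for the double mail; I forgot to add my server.xml encoding definitions: <Connector port="8080" URIEncoding="UTF-8" useBodyEncodingForURI="true" minProcessors="5" maxProcessors="75" enableLookups="true" redirectPort="8443" acceptCount="100" debug="0" connectionTimeout="2" useURIValidationHack="false" disableUploadTimeout="true" /> I tried it with and without the useBodyEncodingForURI="true" directive. Yair.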
Problems with utf-8 encoding
Hello, I'm using Tomcat 4.1.18. I'm trying to read Hebrew data in UTF-8 encoding from the database. As a check, I entered a UTF-8 encoded letter 'alef' into a database field (I see it in the database as the single letter 'alef'). The JSP page that displays the data prints two chars instead of one. I checked the values of these chars and they are 215 114, which is the UTF-8 combination that creates the letter 'alef' (so I was told). JSP code: <%@ page language="java" contentType="text/html;charset=UTF-8" pageEncoding="UTF-8" info="Tables Handler" import="tablesHandler.*" %> <% request.setCharacterEncoding("UTF-8");%> I tried all the combinations of the 'UTF-8' directives. Does anyone have an idea how I can tell Tomcat to display it as one char (the letter alef) and not two separate gibberish chars? Or maybe it's another issue? Thanks in advance, Yair.
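The symptom described here (one letter arriving as two chars) is what happens when the UTF-8 bytes of a character are decoded with a single-byte charset somewhere between the database and the page. A minimal sketch using only the JDK is below; the class and method names are illustrative. Note that alef (U+05D0) encodes to the bytes 215 and 144 in UTF-8, so the "114" in the message may be a typo for 144, though the thread does not confirm that.

```java
import java.nio.charset.StandardCharsets;

class AlefBytesDemo {
    /** Returns the UTF-8 bytes of the text as unsigned values (0-255). */
    static int[] utf8ByteValues(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        int[] values = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            values[i] = bytes[i] & 0xFF; // Java bytes are signed; mask to get 0-255
        }
        return values;
    }

    /** Simulates a layer that receives UTF-8 bytes but assumes ISO-8859-1. */
    static String misreadAsLatin1(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    /** Undoes misreadAsLatin1 by recovering the bytes and decoding them as UTF-8. */
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String alef = "\u05D0"; // Hebrew letter alef
        int[] values = utf8ByteValues(alef);
        System.out.println(values[0] + " " + values[1]);  // prints "215 144"

        String garbled = misreadAsLatin1(alef);
        System.out.println(garbled.length());             // prints 2
        System.out.println(repair(garbled).equals(alef)); // prints true
    }
}
```

The repair method is a diagnostic aid rather than a fix: if it turns the garbled chars back into the expected letter, the data was decoded with the wrong charset at some layer, and that layer is where the configuration should change.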
RE: Tomcat 5.0.30 - UTF-8 encoding not working
Hi Mark, Adding URIEncoding="UTF-8" to the coyote connector did the trick. Thanks a bunch! My guess is that our app will be hosted on a tomcat instance that only hosts UTF-8-aware apps. Thanks and regards Sanjay -Original Message- From: Karanjkar, Sanjay V (IT) Sent: Friday, June 03, 2005 10:44 AM To: Tomcat Users List; [EMAIL PROTECTED] Subject: RE: Tomcat 5.0.30 - UTF-8 encoding not working Hi, Apologies, my previous mail was missing a few things... Correction -> Tomcat *does* show UTF-8 encoded data correctly (after fetching from the database). It also saves UTF-8 encoded data correctly (I verified this by looking at a saved record). However, the place where it fails is when I pass UTF-8 data as a URL parameter to a popup screen. In the attached screenshot, you can see that the main screen fetches and displays UTF-8 data correctly but the popup screen (which pops up on clicking the "Edit" button) shows garbled characters. I checked the encoding on the popup screen and it does show me UTF-8. Am I losing the encoding when constructing the URL string? Note that this all works fine when I use ServletExec.. Fyi, the popup screen is launched via the following javascript code: function editConfirmComment() { var form = document.frm_update; var confirmComment = form.updComment.value; var url = '../../fc3Common/view/externalCommentDetails.jsp?dummy=dummy' + '&confirmComment=' + encodeURIComponent(confirmComment); popupWindow(url, 'ExternalCommentDetails', 480, 240); } Mark, in this case would I need to do as you said in your comments? >>> If you are encoding your data in the URI, you will need to set the >>> URIEncoding attribute on the coyote connector to "UTF-8" to ensure >>> that the URI is decoded correctly. >>> A clean Tomcat install should only require URIEncoding="UTF-8" >>> to be added to the connector in server.xml for these to work for any >>> UTF-8 data. One issue is that my app would be hosted on a web farm. 
As the above looks to be a server-wide change, it will affect other apps hosted on the instance too, right? Thanks and regards Sanjay Morgan Stanley -Original Message- From: Mark Thomas [mailto:[EMAIL PROTECTED] Sent: Friday, June 03, 2005 12:27 AM To: Tomcat Users List Subject: Re: Tomcat 5.0.30 - UTF-8 encoding not working Karanjkar, Sanjay V (IT) wrote: > Hi msjava, > > I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to > Tomcat5.0.30/JDK1.4.2. > On ServletExec, our app was showing/saving UTF-8 strings correctly. However, > after migration to Tomcat, the pages are not showing UTF-8 encoded content > correctly. If your are POSTing your data, request.setCharacterEncoding("UTF-8") should do the trick but you MUST call this before any parameters are read. If you are encoding your data in the URI, you will need to set the URIEncoding attribute on the coyote connector to "UTF-8" to ensure that the URI is decoded correctly. > Do I need to do something else for Tomcat? In particular, do I need to > do the stuff mentioned here: > http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8 1. Yes 2 & 3 - No . These might work under some circumstances but 2. is trying to change a read-only property and 3. is hacking around the data not being handled correctly in the first place. When I am testing this, I use the following JSP to make sure Tomcat is correctly configured. A clean Tomcat install should only require URIEncoding="UTF-8" to be added to the connector in server.xml for these to work for any UTF-8 data. You should test it with both method="post" and method="get" <%@ page contentType="text/html; charset=UTF-8" %> UTF-8 test page UTF-8 data posted to this form was: <% request.setCharacterEncoding("UTF-8"); out.print(request.getParameter("mydata")); %> If this works, then the chances are your app isn't quite right. If you have a test case that doesn't work (try and make it as simple as possible) post it to the list and I'll take a look. 
Mark NOTICE: If received in error, please destroy and notify sender. Sender does not waive confidentiality or privilege, and use is prohibited.
RE: Tomcat 5.0.30 - UTF-8 encoding not working
Hi, Apologies, my previous mail was missing a few things... Correction -> Tomcat *does* show UTF-8 encoded data correctly (after fetching from the database). It also saves UTF-8 encoded data correctly (I verified this by looking at a saved record). However, the place where it fails is when I pass UTF-8 data as a URL parameter to a popup screen. In the attached screenshot, you can see that the main screen fetches and displays UTF-8 data correctly but the popup screen (which pops up on clicking the "Edit" button) shows garbled characters. I checked the encoding on the popup screen and it does show me UTF-8. Am I losing the encoding when constructing the URL string? Note that this all works fine when I use ServletExec.. Fyi, the popup screen is launched via the following javascript code: function editConfirmComment() { var form = document.frm_update; var confirmComment = form.updComment.value; var url = '../../fc3Common/view/externalCommentDetails.jsp?dummy=dummy' + '&confirmComment=' + encodeURIComponent(confirmComment); popupWindow(url, 'ExternalCommentDetails', 480, 240); } Mark, in this case would I need to do as you said in your comments? >>> If you are encoding your data in the URI, you will need to set the >>> URIEncoding attribute on the coyote connector to "UTF-8" to ensure >>> that the URI is decoded correctly. >>> A clean Tomcat install should only require URIEncoding="UTF-8" >>> to be added to the connector in server.xml for these to work >>> for any UTF-8 data. One issue is that my app would be hosted on a web farm. As the above looks to be a server-wide change, it will affect other apps hosted on the instance too, right? 
Thanks and regards Sanjay Morgan Stanley -Original Message- From: Mark Thomas [mailto:[EMAIL PROTECTED] Sent: Friday, June 03, 2005 12:27 AM To: Tomcat Users List Subject: Re: Tomcat 5.0.30 - UTF-8 encoding not working Karanjkar, Sanjay V (IT) wrote: > Hi msjava, > > I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to > Tomcat5.0.30/JDK1.4.2. > On ServletExec, our app was showing/saving UTF-8 strings correctly. However, > after migration to Tomcat, the pages are not showing UTF-8 encoded content > correctly. If your are POSTing your data, request.setCharacterEncoding("UTF-8") should do the trick but you MUST call this before any parameters are read. If you are encoding your data in the URI, you will need to set the URIEncoding attribute on the coyote connector to "UTF-8" to ensure that the URI is decoded correctly. > Do I need to do something else for Tomcat? In particular, do I need to > do the stuff mentioned here: > http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8 1. Yes 2 & 3 - No . These might work under some circumstances but 2. is trying to change a read-only property and 3. is hacking around the data not being handled correctly in the first place. When I am testing this, I use the following JSP to make sure Tomcat is correctly configured. A clean Tomcat install should only require URIEncoding="UTF-8" to be added to the connector in server.xml for these to work for any UTF-8 data. You should test it with both method="post" and method="get" <%@ page contentType="text/html; charset=UTF-8" %> UTF-8 test page UTF-8 data posted to this form was: <% request.setCharacterEncoding("UTF-8"); out.print(request.getParameter("mydata")); %> If this works, then the chances are your app isn't quite right. If you have a test case that doesn't work (try and make it as simple as possible) post it to the list and I'll take a look. 
Mark
Re: Tomcat 5.0.30 - UTF-8 encoding not working
Karanjkar, Sanjay V (IT) wrote:

> Hi msjava,
>
> I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to Tomcat5.0.30/JDK1.4.2. On ServletExec, our app was showing/saving UTF-8 strings correctly. However, after migration to Tomcat, the pages are not showing UTF-8 encoded content correctly.

If you are POSTing your data, request.setCharacterEncoding("UTF-8") should do the trick, but you MUST call this before any parameters are read. If you are encoding your data in the URI, you will need to set the URIEncoding attribute on the Coyote connector to "UTF-8" to ensure that the URI is decoded correctly.

> Do I need to do something else for Tomcat? In particular, do I need to do the stuff mentioned here: http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8

1. Yes.
2 & 3. No. These might work under some circumstances, but 2 is trying to change a read-only property and 3 is hacking around the data not being handled correctly in the first place.

When I am testing this, I use the following JSP to make sure Tomcat is correctly configured. A clean Tomcat install should only require URIEncoding="UTF-8" to be added to the connector in server.xml for this to work for any UTF-8 data. You should test it with both method="post" and method="get".

    <%@ page contentType="text/html; charset=UTF-8" %>
    UTF-8 test page
    UTF-8 data posted to this form was:
    <%
      request.setCharacterEncoding("UTF-8");
      out.print(request.getParameter("mydata"));
    %>

If this works, then the chances are your app isn't quite right. If you have a test case that doesn't work (try to make it as simple as possible), post it to the list and I'll take a look.

Mark
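The distinction drawn in this reply (body encoding for POST, URIEncoding for data carried in the URL) comes down to which charset is used to turn percent-encoded bytes back into characters. The sketch below shows that effect with the JDK's URLEncoder and URLDecoder; it does not call any Tomcat code, and the class and helper names are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class UriDecodingDemo {
    /** Percent-encodes a value as UTF-8, similar in spirit to what the browser sends. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Decodes a percent-encoded value using the named charset. */
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String alef = "\u05D0";
        String onTheWire = encode(alef);
        System.out.println(onTheWire); // prints "%D7%90"

        // Decoding with the charset the sender used restores the letter.
        System.out.println(decode(onTheWire, "UTF-8").equals(alef)); // prints true

        // Decoding with ISO-8859-1 yields two unrelated chars instead.
        System.out.println(decode(onTheWire, "ISO-8859-1").length()); // prints 2
    }
}
```

URLEncoder is not an exact match for JavaScript's encodeURIComponent (for example, it writes a space as "+" rather than "%20"), but for non-ASCII characters both produce percent-encoded UTF-8 bytes, which is the part that matters here.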
Tomcat 5.0.30 - UTF-8 encoding not working
Hi msjava,

I'm trying to migrate our webapp from ServletExec4.1.1/JDK1.3.1 to Tomcat5.0.30/JDK1.4.2. On ServletExec, our app was showing/saving UTF-8 strings correctly. However, after migration to Tomcat, the pages are not showing UTF-8 encoded content correctly.

All our JSP pages contain the following:

    <%@ page import="java.util.*, java.lang.*" contentType="text/html; charset=UTF-8" %>
    ...
    ...

The web.xml file contains:

    "http://java.sun.com/dtd/web-app_2_3.dtd">

A filter servlet for all JSPs:

    public void doFilter(ServletRequest request, ServletResponse response, FilterChain filterChain) {
        try {
            if (null != encoding) {
                request.setCharacterEncoding(encoding);
            }
            filterChain.doFilter(request, response);

Do I need to do something else for Tomcat? In particular, do I need to do the stuff mentioned here: http://wiki.apache.org/jakarta-tomcat/Tomcat/UTF-8

Thanks in advance,
Sanjay
RE: tomcat 5 and UTF-8 encoding
Hi,
There are tons of other such stories in the archives of this list and of Bugzilla. Nothing new here.

Yoav Shapira
http://www.yoavshapira.com

>-----Original Message-----
>From: Allistair Crossley
>Sent: Tuesday, December 07, 2004 4:04 AM
>To: Tomcat Users List
>Subject: RE: tomcat 5 and UTF-8 encoding
>
>someone else had a similar issue with hebrew and you can read what happened here:
>
>http://issues.apache.org/bugzilla/show_bug.cgi?id=32500
>
>Allistair.
RE: tomcat 5 and UTF-8 encoding
Someone else had a similar issue with Hebrew, and you can read what happened here:

http://issues.apache.org/bugzilla/show_bug.cgi?id=32500

Allistair.

> -----Original Message-----
> From: Peter Johnson
> Sent: 07 December 2004 03:41
> To: Tomcat Users List
> Subject: Re: tomcat 5 and UTF-8 encoding
>
> Sarah,
>
> I recall a post a week or so ago regarding the contentType string losing the space after the ;
>
> This may be causing the issue.
>
> PJ
Re: tomcat 5 and UTF-8 encoding
Sarah,

I recall a post a week or so ago regarding the contentType string losing the space after the ;

This may be causing the issue.

PJ

Sarah wrote:

> Hi,
>
> I need to use JSP to display some data in Japanese characters from an MS SQL Server database. I have already set the encoding in the JSP to be:
>
> <%@ page language="java" contentType="text/html; charset=UTF-8" %>
>
> If I use Tomcat 5.0.18, the Japanese characters are displayed correctly. However, if I use 5.0.28 or 5.5.4, the characters show up as something like "???". If I right-click the HTML page generated from the JSP on those versions, I see the encoding reported as Western instead of UTF-8, which is what 5.0.18 reported. Does anyone know what causes this problem, and whether any Tomcat configuration needs to change? Thank you very much for your help.
>
> Sarah
tomcat 5 and UTF-8 encoding
Hi,

I need to use JSP to display some data in Japanese characters from an MS SQL Server database. I have already set the encoding in the JSP to be:

<%@ page language="java" contentType="text/html; charset=UTF-8" %>

If I use Tomcat 5.0.18, the Japanese characters are displayed correctly. However, if I use 5.0.28 or 5.5.4, the characters show up as something like "???". If I right-click the HTML page generated from the JSP on those versions, I see the encoding reported as Western instead of UTF-8, which is what 5.0.18 reported. Does anyone know what causes this problem, and whether any Tomcat configuration needs to change? Thank you very much for your help.

Sarah
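Editor's note: one way a "???" symptom like the one described above can arise is that the text is written through an encoder for ISO-8859-1 (the servlet default when no charset is in effect), which has no mapping for Japanese characters and substitutes '?'. The sketch below is plain Java with no Tomcat involved; the class and method names are made up for illustration, and it does not establish that this is what Tomcat 5.0.28 or 5.5.4 actually did in this case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class QuestionMarkDemo {
    // Encode the text with the given charset and decode it again with the
    // same charset, as a writer and a correctly configured reader would.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e"; // three Japanese characters

        // ISO-8859-1 cannot represent these characters, so each becomes '?'.
        System.out.println(roundTrip(japanese, StandardCharsets.ISO_8859_1)); // ???

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(japanese, StandardCharsets.UTF_8).equals(japanese)); // true
    }
}
```

Note that once a character has been replaced by '?' the original is gone; this differs from the "junk characters" case, where the bytes are intact and only decoded wrongly.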
Re: UTF-8 Encoding in Jsp | RESOLVED
I concur, thanks for posting your findings.

Also if I may ask: please don't change the subject of your mails. Those of us who view this list as a newsgroup get all messed up!

Andoni.

----- Original Message -----
From: "Shapira, Yoav"
Newsgroups: gmane.comp.jakarta.tomcat.user
Sent: Thursday, December 02, 2004 2:08 PM
Subject: RE: UTF-8 Encoding in Jsp | RESOLVED

Hi,
Thanks for posting your findings ;)

Yoav Shapira
http://www.yoavshapira.com
RE: UTF-8 Encoding in Jsp | RESOLVED
Hi,
Thanks for posting your findings ;)

Yoav Shapira
http://www.yoavshapira.com

>-----Original Message-----
>From: Arnab Chakravarty
>Sent: Thursday, December 02, 2004 9:03 AM
>To: Tomcat Users List
>Subject: RE: UTF-8 Encoding in Jsp | RESOLVED
>
>Hi all,
>
>First of all, thanks to all the people who helped in the first place (I am grateful). The problem was resolved and was due to a problem with the home-grown framework we were using with the application.
>
>Tomcat had nothing to do with the problem, and the content type is the only thing required to make it work. As far as database persistence was concerned, Oracle made no mistake in storing the data, but when our framework was persisting the values, it somehow corrupted the data somewhere between submitting the page with non-English characters and writing to the database.
>
>We found this problem by writing a simple JSP page without the framework, which rendered some non-English characters successfully.
>
>Thanks again,
>Arnab
RE: UTF-8 Encoding in Jsp | RESOLVED
Hi all,

First of all, thanks to all the people who helped in the first place (I am grateful). The problem was resolved and was due to a problem with the home-grown framework we were using with the application.

Tomcat had nothing to do with the problem, and the content type is the only thing required to make it work. As far as database persistence was concerned, Oracle made no mistake in storing the data, but when our framework was persisting the values, it somehow corrupted the data somewhere between submitting the page with non-English characters and writing to the database.

We found this problem by writing a simple JSP page without the framework, which rendered some non-English characters successfully.

Thanks again,
Arnab

-----Original Message-----
From: Arnab Chakravarty
Sent: Wednesday, December 01, 2004 4:08 PM
To: Tomcat Users List
Subject: RE: UTF-8 Encoding in Jsp

Hi,

Thanks for the reply, but it did not work. Maybe I didn't explain the problem correctly.

I am running an application that supports all languages, but only in some specific places of the application, and I have made those places UTF-8 compliant.

Further, they are being saved to the database (Oracle 9). When we read the data back from the database, junk characters are displayed on the screen. Yes, the database is set to support UTF-8 encoding, and this works with the old version, Tomcat 3.3, but not with the current upgraded version, Tomcat 5.0.

There are also places in the application where drop-downs contain text in different languages, and we can see those charsets (Japanese, Chinese etc.) appearing. Only when I try to display the data on the screen through the JSP file do I encounter this problem of junk characters being displayed.

Hope I have set more context around the problem. Please help me resolve this issue.

Thanks,
Arnab
Re: UTF-8 Encoding in Jsp
I would recommend that you make the entire site UTF-8. The parts that are in English will still work with no problem, but I would really not try mixing encodings for requests.

The "junk characters" you are getting back are also not actually junk. You can work out which encoding is being used by interpreting these strings and knowing what the intended string is. Also, the fact that you are not just getting lots of "?" characters means that it is not Oracle that is having the problem.

I will read the other reply when I get a chance and see if I have any further contributions, but for now I strenuously suggest making ALL the pages/servlets UTF-8.

Regards,
Andoni.

----- Original Message -----
From: "Arnab Chakravarty"
Newsgroups: gmane.comp.jakarta.tomcat.user
Sent: Wednesday, December 01, 2004 10:37 AM
Subject: RE: UTF-8 Encoding in Jsp

Hi,

Thanks for the reply, but it did not work. Maybe I didn't explain the problem correctly.

I am running an application that supports all languages, but only in some specific places of the application, and I have made those places UTF-8 compliant.

Further, they are being saved to the database (Oracle 9). When we read the data back from the database, junk characters are displayed on the screen. Yes, the database is set to support UTF-8 encoding, and this works with the old version, Tomcat 3.3, but not with the current upgraded version, Tomcat 5.0.

There are also places in the application where drop-downs contain text in different languages, and we can see those charsets (Japanese, Chinese etc.) appearing. Only when I try to display the data on the screen through the JSP file do I encounter this problem of junk characters being displayed.

Hope I have set more context around the problem. Please help me resolve this issue.

Thanks,
Arnab
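Editor's note: Andoni's point that the "junk characters" are not really junk can be illustrated with a small sketch. It assumes the most common case, UTF-8 bytes decoded as ISO-8859-1 somewhere in the chain; the class and method names are hypothetical, and the thread does not say which wrong encoding was actually involved here.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Simulate the mistake: UTF-8 bytes are decoded as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undo the mistake: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String intended = "\u00e9";          // e with acute accent
        String garbled = misdecode(intended); // two characters: U+00C3 U+00A9

        System.out.println(garbled.length());                  // 2
        System.out.println(repair(garbled).equals(intended));  // true
    }
}
```

Seeing two Latin-1 characters per intended character is a typical sign of this particular mismatch. The repair works because ISO-8859-1 maps every byte value to a character, so no information is lost.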
RE: UTF-8 Encoding in Jsp
Hi,

These encoding issues are always a nightmare ;)

There are some relevant areas of the Servlet spec you may want to look at wrt encoding, notably the Internationalization and Request data encoding sections.

In terms of UTF-8 not coming back correctly from your database, you need to ensure that when the values were _added_ the character encoding was UTF-8. You should also verify your database is in UTF-8 mode.

If both these statements are true, then you need to read Internationalization in the Servlet spec, which says:

"If the servlet does not specify a character encoding before the getWriter method of the ServletResponse interface is called or the response is committed, the default ISO-8859-1 is used."

In other words, you need to call setLocale or setCharacterEncoding before the response is committed. I am not entirely sure whether that is actually what that JSP page directive is doing; maybe it is. Perhaps in your JSP you can output <%= request.getCharacterEncoding() %> to make sure your UTF-8 has been set. If it is null, it has not been set.

If it _is_ UTF-8, then the character data coming from the database is not actually UTF-8, because either a) your database driver connection URL is not operating in UTF-8 mode, b) the data was not UTF-8 when it was put into the database, or c) the database is not running in UTF-8.

In terms of sending data to the database as UTF-8, check your driver parameters (normally on the URL string) and also the database settings.

You also need to take note of this section of the Servlet spec. We had to write a servlet filter to change our inbound form posts to the correct encoding for our database, Cp1252.

Request data encoding extract:

"The default encoding of a request the container uses to create the request reader and parse POST data must be ISO-8859-1 if none has been specified by the client request. However, in order to indicate to the developer in this case the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method. If the client hasn't set character encoding and the request data is encoded with a different encoding than the default as described above, breakage can occur. To remedy this situation, a new method setCharacterEncoding(String enc) has been added to the ServletRequest interface. Developers can override the character encoding supplied by the container by calling this method. It must be called prior to parsing any post data or reading any input from the request."

Hope this info gets you thinking,
Allistair.

> -----Original Message-----
> From: Arnab Chakravarty
> Sent: 01 December 2004 10:38
> To: Tomcat Users List
> Subject: RE: UTF-8 Encoding in Jsp
>
> Hi,
>
> Thanks for the reply, but it did not work. Maybe I didn't explain the problem correctly.
>
> I am running an application that supports all languages, but only in some specific places of the application, and I have made those places UTF-8 compliant.
>
> Further, they are being saved to the database (Oracle 9). When we read the data back from the database, junk characters are displayed on the screen. Yes, the database is set to support UTF-8 encoding, and this works with the old version, Tomcat 3.3, but not with the current upgraded version, Tomcat 5.0.
>
> There are also places in the application where drop-downs contain text in different languages, and we can see those charsets (Japanese, Chinese etc.) appearing. Only when I try to display the data on the screen through the JSP file do I encounter this problem of junk characters being displayed.
>
> Hope I have set more context around the problem. Please help me resolve this issue.
>
> Thanks,
> Arnab
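Editor's note: the request-side rule quoted from the spec (POST data is parsed as ISO-8859-1 unless an encoding is set first) can be simulated without a servlet container. The sketch below uses java.net.URLEncoder and URLDecoder to stand in for the browser and for the container's parameter parsing; that stand-in is an assumption made for illustration, and the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class FormDecodingDemo {
    // Roughly what a browser sends when a form on a UTF-8 page is submitted.
    static String encodeUtf8(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    // Roughly what parameter parsing does for a given request encoding.
    static String decode(String body, String charsetName) {
        try {
            return URLDecoder.decode(body, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String typed = "\u65e5\u672c"; // two Japanese characters
        String wire = encodeUtf8(typed);
        System.out.println(wire); // %E6%97%A5%E6%9C%AC

        // Parsed with the encoding the page was actually submitted in.
        System.out.println(decode(wire, "UTF-8").equals(typed)); // true

        // Parsed with the spec default: six Latin-1 characters instead of two.
        System.out.println(decode(wire, "ISO-8859-1").length()); // 6
    }
}
```

This is why request.setCharacterEncoding("UTF-8") has to run before the first parameter is read: once the body has been parsed with the default, the parameter values are already garbled.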
RE: UTF-8 Encoding in Jsp
Hi,

Thanks for the reply, but it did not work. Maybe I didn't explain the problem correctly.

I am running an application that supports all languages, but only in some specific places of the application, and I have made those places UTF-8 compliant.

Further, they are being saved to the database (Oracle 9). When we read the data back from the database, junk characters are displayed on the screen. Yes, the database is set to support UTF-8 encoding, and this works with the old version, Tomcat 3.3, but not with the current upgraded version, Tomcat 5.0.

There are also places in the application where drop-downs contain text in different languages, and we can see those charsets (Japanese, Chinese etc.) appearing. Only when I try to display the data on the screen through the JSP file do I encounter this problem of junk characters being displayed.

Hope I have set more context around the problem. Please help me resolve this issue.

Thanks,
Arnab

-----Original Message-----
From: Mariano
Sent: Wednesday, December 01, 2004 12:54 PM
To: 'Tomcat Users List'
Subject: RE: UTF-8 Encoding in Jsp

You should also use:



and this scriptlet:

request.setCharacterEncoding("UTF-8");

at the beginning.

I hope this helps
RE: UTF-8 Encoding in Jsp
You should also use:



and this scriptlet:

request.setCharacterEncoding("UTF-8");

at the beginning.

I hope this helps

-----Original Message-----
From: Arnab Chakravarty
Sent: Tuesday, 30 November 2004 15:28
To: Tomcat Users List
Subject: UTF-8 Encoding in Jsp

Hi all,

I need to make all my JSP files compatible with UTF-8 encoding, and even though I am using the directives:

<%@ page pageEncoding="UTF-8"%>
<%@ page contentType = "text/html;charset=UTF-8"%>

in the JSP files, I cannot make it work.

I am using Tomcat version 5. Are there any config changes I need to make for UTF-8 encoding to work?

Please help.

Thanks in advance,
Arnab
Re: UTF-8 Encoding in Jsp
Hello,

First and foremost I would say: be absolutely sure that it is the JSP's fault. I hope you are not getting some data from a database and trying to show it?

Be sure that your editor is saving the JSP in UTF-8 format.

Add the flag -Dfile.encoding=UTF-8 to the CATALINA_OPTS environment variable in your catalina.bat (or equivalent) startup file, and use:

req.setCharacterEncoding("UTF-8");

to set the encoding on the request.

This may help:
http://marc.theaimsgroup.com/?l=tomcat-user&m=105524550416364&w=2

Though you can ignore the method which is used there to set the encoding, as the above line does the same job in servlets.

Andoni.

----- Original Message -----
From: "Arnab Chakravarty"
Newsgroups: gmane.comp.jakarta.tomcat.user
Sent: Tuesday, November 30, 2004 2:28 PM
Subject: UTF-8 Encoding in Jsp

Hi all,

I need to make all my JSP files compatible with UTF-8 encoding, and even though I am using the directives:

<%@ page pageEncoding="UTF-8"%>
<%@ page contentType = "text/html;charset=UTF-8"%>

in the JSP files, I cannot make it work.

I am using Tomcat version 5. Are there any config changes I need to make for UTF-8 encoding to work?

Please help.

Thanks in advance,
Arnab
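Editor's note: the -Dfile.encoding flag mentioned above influences the JVM's default charset, which is what calls without an explicit charset argument fall back on. The sketch below (hypothetical class name) prints the current default and contrasts it with explicit charsets. How the flag is honoured varies between JDK versions, so treat the first three printed values as environment-dependent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class DefaultCharsetDemo {
    // Number of bytes the text occupies in an explicitly named charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent

        // Environment-dependent: reflects the platform and any -Dfile.encoding flag.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + Charset.defaultCharset());
        System.out.println("default bytes   = " + text.getBytes().length);

        // Explicit charsets give the same answer on every machine.
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 2
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 1
    }
}
```

Naming the charset explicitly wherever bytes become characters (and back) removes the dependency on the startup flag altogether.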
UTF-8 Encoding in Jsp
Hi all,

I need to make all my JSP files compatible with UTF-8 encoding, and even though I am using the directives:

<%@ page pageEncoding="UTF-8"%>
<%@ page contentType = "text/html;charset=UTF-8"%>

in the JSP files, I cannot make it work.

I am using Tomcat version 5. Are there any config changes I need to make for UTF-8 encoding to work?

Please help.

Thanks in advance,
Arnab
RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
Thanks for your info, Mark.

It does appear that part of my issue stems from my .properties files being in UTF-8. So I have to ask the question: why has this changed? If I run the same code in 5.0.24, I have no issue, and 5.0.28 has a problem. It sounds like a substantial problem that UTF-8 resource bundles aren't supported any more.

Besides this simple example, I'm still seeing problems with a servlet returning XML in UTF-8. Again, no issue in 5.0.24, only after 5.0.25. I will put together a sample and post it shortly.

Thanks again for the help,
Rick

-----Original Message-----
From: Mark Thomas
Posted At: Wednesday, September 01, 2004 4:14 PM
Posted To: Tomcat Dev
Conversation: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
Subject: RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )

OK. I have a simple test case and all seems to be well. See the end of this message for the contents of my test files.

My environment:
Win XP SP2 - brave I know but all has been OK so far ;)
JDK 1.4.2_05
Tomcat 5.0 branch, HEAD (latest) from CVS (very close to 5.0.28)

Points to note:
1. All my test files are ASCII files.
2. I had all sorts of problems with non-ASCII properties files. I didn't get to the bottom of it but I think Windows was adding junk to the start of the file if it was UTF-8 encoded. Maybe having the first line as a comment would fix this but I haven't tested this.
3. There were times where Eclipse and Windows were reporting the exact same file as having different encodings. There is something odd here but I didn't look at this any further.
4. I had property file issues with 4.1.HEAD as well as 5.0.HEAD.
5. The downside of using ASCII files is that entering the UTF-8 characters by hand is a real pain. A simple conversion app should fix this though.
6. Apart from the property file issue, everything seems fine.

Test files follow.

Hope this helps,

Mark

PS I noticed that you cross-posted to the dev list. Please don't do this. Any message cross-posted is less likely rather than more likely to get a response.

=== utf8.jsp ===
<%@ page language="java" import="java.lang.*,java.util.*" contentType="text/html; charset=UTF-8" %>
UTF-8 Encoding issue
Text from JSP page (which is ASCII encoded).
English Japanese
Text from resources bundle:
<%
    String language = request.getParameter("language");
    if (language == null) {
        language="en";
    }
    Locale locale = null;
    if (language.equalsIgnoreCase("en")) {
        locale = Locale.ENGLISH;
    } else {
        locale = Locale.JAPAN;
    }
    ResourceBundle bundle = ResourceBundle.getBundle("foo.bar.LocalStrings", locale);
    out.println("" + bundle.getString("test") + "");
%>
<%=request.getParameter("language") %>

=== LocalStrings_en.properties ===
test=Test string from resources bundle

=== LocalStrings_ja.properties ===
test=\u30d5\u30a1\u30a4\u30eb\u30ed
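Editor's note: the trouble with UTF-8 encoded .properties files can be reproduced with java.util.Properties alone. Properties.load(InputStream) is documented to read ISO-8859-1, so UTF-8 bytes come out as several Latin-1 characters each; and a byte order mark at the start of the file (a plausible candidate for the "junk" Mark suspected Windows of adding, though the thread never confirms it) becomes part of the first key. The sketch uses in-memory byte arrays instead of real files, and the class and method names are hypothetical. The Reader overload of load used for comparison was added in later JDKs than the 1.4 discussed here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class PropertiesEncodingDemo {
    // Load via the InputStream overload, which always decodes as ISO-8859-1.
    static String loadAsLatin1(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // Load via a Reader that decodes the bytes as UTF-8.
    static String loadAsUtf8(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        String value = "\u30d5\u30a1\u30a4\u30eb"; // four katakana characters
        byte[] utf8File = ("test=" + value + "\n").getBytes(StandardCharsets.UTF_8);

        System.out.println(loadAsLatin1(utf8File, "test").length());      // 12, garbled
        System.out.println(loadAsUtf8(utf8File, "test").equals(value));   // true

        // Same file with a byte order mark in front: the key is no longer "test".
        byte[] withBom = ("\ufeff" + "test=" + value + "\n").getBytes(StandardCharsets.UTF_8);
        System.out.println(loadAsUtf8(withBom, "test"));                  // null
    }
}
```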
RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
OK. I have a simple test case and all seems to be well. See the end of this message for the contents of my test files.

My environment:
Win XP SP2 - brave I know but all has been OK so far ;)
JDK 1.4.2_05
Tomcat 5.0 branch, HEAD (latest) from CVS (very close to 5.0.28)

Points to note:
1. All my test files are ASCII files.
2. I had all sorts of problems with non-ASCII properties files. I didn't get to the bottom of it but I think Windows was adding junk to the start of the file if it was UTF-8 encoded. Maybe having the first line as a comment would fix this but I haven't tested this.
3. There were times where Eclipse and Windows were reporting the exact same file as having different encodings. There is something odd here but I didn't look at this any further.
4. I had property file issues with 4.1.HEAD as well as 5.0.HEAD.
5. The downside of using ASCII files is that entering the UTF-8 characters by hand is a real pain. A simple conversion app should fix this though.
6. Apart from the property file issue, everything seems fine.

Test files follow.

Hope this helps,

Mark

PS I noticed that you cross-posted to the dev list. Please don't do this. Any message cross-posted is less likely rather than more likely to get a response.

=== utf8.jsp ===
<%@ page language="java" import="java.lang.*,java.util.*" contentType="text/html; charset=UTF-8" %>
UTF-8 Encoding issue
Text from JSP page (which is ASCII encoded).
English Japanese
Text from resources bundle:
<%
String language = request.getParameter("language");
if (language == null) {
    language="en";
}
Locale locale = null;
if (language.equalsIgnoreCase("en")) {
    locale = Locale.ENGLISH;
} else {
    locale = Locale.JAPAN;
}
ResourceBundle bundle = ResourceBundle.getBundle("foo.bar.LocalStrings", locale);
out.println("" + bundle.getString("test") + "");
%>
<%=request.getParameter("language") %>

=== LocalStrings_en.properties ===
test=Test string from resources bundle

=== LocalStrings_ja.properties ===
test=\u30d5\u30a1\u30a4\u30eb\u30ed
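Point 5 in the message above mentions that a simple conversion app would take the pain out of typing the escapes by hand. A minimal sketch of such a converter follows, using only the Java standard library. The class name AsciiEscaper and its two methods are made up for illustration and are not part of Tomcat or of the test case; JDKs of that era also shipped a native2ascii tool that did the same job.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper for illustration only; not part of Tomcat or the JDK.
class AsciiEscaper {

    // Replace every character outside 7-bit ASCII with a backslash-u escape,
    // which is the form java.util.Properties decodes when loading a file.
    public static String toAsciiEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // Parse properties text and return the decoded value stored under the key.
    public static String loadValue(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // The Japanese test string from LocalStrings_ja.properties.
        String japanese = "\u30d5\u30a1\u30a4\u30eb\u30ed";
        String line = "test=" + toAsciiEscapes(japanese);
        System.out.println(line);
        // Round trip: the escaped line loads back to the original string.
        System.out.println(loadValue(line, "test").equals(japanese));
    }
}
```

Because the escaped output is pure ASCII, it sidesteps both the editor encoding confusion and the junk at the start of the file described in points 2 and 3.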
RE: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
The change (which is required by the spec) is that if the character set has not been set before a call to getWriter() then it will default to ISO-8859-1. There was some discussion on the tomcat-dev list about this (see http://marc.theaimsgroup.com/?l=tomcat-dev&m=109104739719572&w=2) I'll try and put together a very simple JSP test case and get back to you. Mark > -Original Message- > From: Rick [mailto:[EMAIL PROTECTED] > Sent: Wednesday, September 01, 2004 3:44 AM > To: 'Tomcat Users List'; [EMAIL PROTECTED] > Subject: UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth ) > > Since 5.0.27, pretty much all of my UTF-8 i8 code seems to be > messed up. > > The problem seems to have been caused by whatever fix was > created for issue > -- > ServletResponse.setContentType sets response encoding after > getWriter was > called (Bugtraq 5062838) (luehe) > -- > > Now it seems almost impossible to properly set the encoding > type of some of > my JSPs and all of my Servlets that return UTF-8 XML data. > > As an example, my login page allows the user to switch to > Japanese text. > Text data is read with a ResourceBundle, which reads from a > UTF-8 encoded > .properties file. > > If the encoding of the .jsp page itself is in ASCII, then I > can't get the > characters to show up at all any more. > I have to save the .jsp page as UTF-8. > Added "set JAVA_OPTS=-Dfile.encoding=UTF-8" to my catalina.bat file > > Then, If I try to set a character set in my page header, it messes up. > > This works in some cases... > <%@ page language="java" import="java.util.*" > contentType="text/html" %> > response.getCharacterEncoding() = "ISO-8859-1" > > The really scary part is that with no meta or charset > actually set, that the > browser(IE) correctly changes to UTF-8 and displays the > content fine. But > if I change the actual file encoding of the .jsp page from > UTF-8 back to > ASCII. Then IE does not change to UTF-8 and the page is > messed up again. 
> Why does the actual encoding of the .jsp file itself dictate > the response > sent to the client? > > It appears that the actual encoding of the source file > someone how gets past > along and then I'm unable to alter the character encoding, > and if I try, it > just causes everything to go to hell. > > > This use to work before 5.0.27, but now doesn't, even though > all data and > pages are encoded in UTF-8. > <%@ page language="java" import="java.util.*" contentType="text/html; > charset=UTF-8" %> > response.getCharacterEncoding() = "UTF-8" > > > Before 5.0.27, all I had to do to get my output in UTF-8 was ... > contentType="text/html; charset=UTF-8" > > Now I have to mess with the actual .jsp file page encodings > and still can't > get most to work properly as well as none of my servlets will > return correct > UTF-8 data. > > I have tried setting "pageEncoding" in the page tag as well > with no luck. > > > Thanks for anyone's insight or help on this, its never fun to > find out that > something that had been working quite solid , up and blows up > for no good > reason. > > Current dev machine is on windows xp by the way, vanilla > install of Tomcat > 5.0.28. > I will be setting this up on a Linux box for more testing shortly. > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
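The effect Mark describes at the top of this reply, a writer that falls back to ISO-8859-1 when no character set was chosen before getWriter(), can be illustrated without a servlet container. The sketch below is only a stand-in: it uses a plain OutputStreamWriter in place of the response writer, and the class and method names are invented for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative stand-in for a response writer; not Tomcat code.
class WriterCharsetDemo {

    // Push text through a writer bound to the given charset and return the raw bytes.
    public static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String katakana = "\u30d5\u30a1";
        // ISO-8859-1 cannot represent these characters, so each becomes '?'.
        byte[] latin1 = write(katakana, StandardCharsets.ISO_8859_1);
        // UTF-8 represents each of them with three bytes.
        byte[] utf8 = write(katakana, StandardCharsets.UTF_8);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
        System.out.println(utf8.length);
    }
}
```

In a servlet the corresponding step would be to call response.setContentType("text/html; charset=UTF-8") before the first call to response.getWriter(), since the writer's encoding is fixed once it has been obtained.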
UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
Since 5.0.27, pretty much all of my UTF-8 i18n code seems to be messed up.

The problem seems to have been caused by whatever fix was created for this issue:
--
ServletResponse.setContentType sets response encoding after getWriter was called (Bugtraq 5062838) (luehe)
--

Now it seems almost impossible to properly set the encoding of some of my JSPs and of all my servlets that return UTF-8 XML data.

As an example, my login page allows the user to switch to Japanese text. Text data is read with a ResourceBundle, which reads from a UTF-8 encoded .properties file.

If the .jsp page itself is saved as ASCII, I can't get the characters to show up at all any more. I have to save the .jsp page as UTF-8. I also added "set JAVA_OPTS=-Dfile.encoding=UTF-8" to my catalina.bat file.

Then, if I try to set a character set in my page header, it messes up.

This works in some cases...
<%@ page language="java" import="java.util.*" contentType="text/html" %>
response.getCharacterEncoding() = "ISO-8859-1"

The really scary part is that, with no meta tag or charset actually set, the browser (IE) correctly switches to UTF-8 and displays the content fine. But if I change the actual file encoding of the .jsp page from UTF-8 back to ASCII, IE does not switch to UTF-8 and the page is messed up again.

Why does the actual encoding of the .jsp file itself dictate the response sent to the client?

It appears that the encoding of the source file somehow gets passed along, and then I'm unable to alter the character encoding; if I try, it just causes everything to go to hell.

This used to work before 5.0.27 but now doesn't, even though all data and pages are encoded in UTF-8.
<%@ page language="java" import="java.util.*" contentType="text/html; charset=UTF-8" %>
response.getCharacterEncoding() = "UTF-8"

Before 5.0.27, all I had to do to get my output in UTF-8 was ...
contentType="text/html; charset=UTF-8"

Now I have to mess with the actual .jsp file encodings and still can't get most pages to work properly, and none of my servlets return correct UTF-8 data.

I have tried setting "pageEncoding" in the page directive as well, with no luck.

Thanks for anyone's insight or help on this. It's never fun to find out that something that had been working quite solidly ups and blows up for no good reason.

My current dev machine is on Windows XP, by the way, with a vanilla install of Tomcat 5.0.28. I will be setting this up on a Linux box for more testing shortly.
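One detail that may be relevant to the UTF-8 encoded .properties file mentioned above: java.util.Properties.load(InputStream) is documented to read its input as ISO-8859-1, so raw UTF-8 bytes come out garbled unless the text is written with Unicode escapes. The sketch below shows the difference under stated assumptions: the class and method names are illustrative, and the load(Reader) overload used for contrast only arrived in Java 6, so it would not have been available on a JDK from 2004.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Illustrative only; class and method names are not from Tomcat or the thread.
class PropertiesEncodingDemo {

    // Load through the InputStream overload, which decodes bytes as ISO-8859-1.
    public static String loadAsStream(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(fileBytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // Load through a Reader that decodes the same bytes as UTF-8.
    public static String loadAsUtf8(byte[] fileBytes, String key) {
        Properties props = new Properties();
        try {
            props.load(new InputStreamReader(
                    new ByteArrayInputStream(fileBytes), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // Simulates a properties file saved as UTF-8 with raw Japanese text.
        String japanese = "\u30d5\u30a1";
        byte[] fileBytes = ("test=" + japanese).getBytes(StandardCharsets.UTF_8);

        // Each UTF-8 byte is turned into its own character by the stream overload.
        System.out.println(loadAsStream(fileBytes, "test").equals(japanese));
        // Decoding explicitly as UTF-8 recovers the original text.
        System.out.println(loadAsUtf8(fileBytes, "test").equals(japanese));
    }
}
```

If the resource bundle is loaded through the stream-based path, this would explain why the file encoding matters independently of what the page directive says.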
Re: UTF-8 encoding
Hello Niki

> Just send UTF8 encoded data and everything will be allright.

Yes, that seems to work for me at the moment, though I am relying on default settings because I do not even specify UTF-8. (Java defaults to Unicode anyway.)

I'm only using LATIN-1 characters at the moment, so I cannot comment on what would happen if I was working with (say) Chinese characters. I have to leave it at that because this is something I shall be looking into later.

All the best!

Harry

> Simply I don't get it. You send data over HTTP. You can send data as you
> wish. What about servlet serving images?
> Just send UTF8 encoded data and everything will be allright.
> No way Tomcat knows do you want to send cyrrilic letter or french accent
> letter. It's up to you.
> Niki
> Harry Mantheakis wrote:
Re: UTF-8 encoding
I simply don't get it. You send data over HTTP. You can send the data any way you wish. What about a servlet serving images? Just send UTF-8 encoded data and everything will be all right. Tomcat has no way of knowing whether you want to send a Cyrillic letter or a French accented letter. It's up to you.

Niki

Harry Mantheakis wrote:
> Okay, thanks Yoav.
>
> I got the source, and I can see what's happening - thanks to Google - at this URL:
>
> http://java.sun.com/blueprints/code/jps131/src/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.java.html
>
> The 'doFilter' method sets the encoding for the *request* which does not seem to address the original question, which was asking how to 'force tomcat to send data in UTF-8 encoding'.
>
> Interesting filter nevertheless! It is a subject that concerns me.
>
> Kind regards
>
> Harry
>
>> Hi,
>>
>>>> implement a EncodingFilter class
>>>
>>> Where's the interface?
>>
>> javax.servlet.Filter is the interface. He probably had
>> http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html in mind.
>>
>> Yoav Shapira
Re: UTF-8 encoding
Okay, thanks Yoav.

I got the source, and I can see what's happening - thanks to Google - at this URL:

http://java.sun.com/blueprints/code/jps131/src/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.java.html

The 'doFilter' method sets the encoding for the *request* which does not seem to address the original question, which was asking how to 'force tomcat to send data in UTF-8 encoding'.

Interesting filter nevertheless! It is a subject that concerns me.

Kind regards

Harry

> Hi,
>
>>> implement a EncodingFilter class
>>
>> Where's the interface?
>
> javax.servlet.Filter is the interface. He probably had
> http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html in mind.
>
> Yoav Shapira
RE: UTF-8 encoding
>javax.servlet.Filter is the interface. He probably had
>http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html in mind.

BTW, swap .java for .html (or google with the above) to see the full java source code for the blueprint encoding filter implementation.

Yoav Shapira

This e-mail, including any attachments, is a confidential business communication, and may contain information that is confidential, proprietary and/or privileged. This e-mail is intended only for the individual(s) to whom it is addressed, and may not be saved, copied, printed, disclosed or used by anyone else. If you are not the(an) intended recipient, please immediately delete this e-mail from your computer system and notify the sender. Thank you.
RE: UTF-8 encoding
Hi,

>> implement a EncodingFilter class
>
>Where's the interface?

javax.servlet.Filter is the interface. He probably had http://java.sun.com/blueprints/code/jps131/api/com/sun/j2ee/blueprints/encodingfilter/web/EncodingFilter.html in mind.

Yoav Shapira
Re: UTF-8 encoding
> implement a EncodingFilter class Where's the interface? > Hi, you can specify the utf-8 encoding with a filter. All you need to do is > implement a EncodingFilter class, and then in your deployment descriptor add > the element as follows: > > > EncodingFilter > EncodingFilter > UTF-8 encoding > org.mysite.EncodingFilter > > targetEncoding > utf-8 > > > > Hope this helps:) > > -Yan > > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: Monday, April 05, 2004 6:49 AM > To: Tomcat Users List > Subject: UTF-8 encoding > > > Hi! > > I have a web-application which on the serverside needs UTF-8 encoding. I > tried to install and run apache/tomcat on a Windows-XP environment, and > the server says, the encoding is not UTF-8. same applicationwith the same > apache/tomcat version runs correctly on a windows 2000 environment. Is > this a XP specific problem and is there any possibility to force tomcat to > send data in UTF-8 encoding. > > > > Best regards > bab - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: UTF-8 encoding
Hi, you can specify the utf-8 encoding with a filter. All you need to do is implement an EncodingFilter class, and then in your deployment descriptor add the <filter> element as follows:

<filter>
  <filter-name>EncodingFilter</filter-name>
  <display-name>EncodingFilter</display-name>
  <description>UTF-8 encoding</description>
  <filter-class>org.mysite.EncodingFilter</filter-class>
  <init-param>
    <param-name>targetEncoding</param-name>
    <param-value>utf-8</param-value>
  </init-param>
</filter>

Hope this helps:)

-Yan

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
Sent: Monday, April 05, 2004 6:49 AM
To: Tomcat Users List
Subject: UTF-8 encoding

Hi!

I have a web-application which on the serverside needs UTF-8 encoding. I tried to install and run apache/tomcat on a Windows-XP environment, and the server says, the encoding is not UTF-8. same applicationwith the same apache/tomcat version runs correctly on a windows 2000 environment. Is this a XP specific problem and is there any possibility to force tomcat to send data in UTF-8 encoding.

Best regards
bab
UTF-8 encoding
Hi!

I have a web application which needs UTF-8 encoding on the server side. I tried to install and run Apache/Tomcat in a Windows XP environment, and the server says the encoding is not UTF-8. The same application with the same Apache/Tomcat version runs correctly in a Windows 2000 environment. Is this an XP-specific problem, and is there any way to force Tomcat to send data in UTF-8 encoding?

Best regards
bab
UTF-8 encoding problem with file included using jsp:include
Hello,

I have a jsp page with the following code at the top of the page, in order to display the page contents in UTF-8:

<%@ page contentType="text/html; charset=UTF-8" %>
<% response.setContentType("text/html; charset=UTF-8"); %>

In this page is a jsp:include tag that includes a static html file, the name of which is determined at runtime. This included file contains UTF-8 encoded characters; however, these are not displayed correctly in my browser (mozilla 1.5/debian), but as generic 'unknown unicode' chars.

If I use an include directive instead, however, the characters are displayed correctly.

If I change the extension of the included file to .jsp so that it's compiled (just to see what happens), the characters still don't display, because the .java file generated by Jasper has a response.setContentType("iso-8859-1") line inserted into it, which I've been unable to figure out how to change.

It seems likely that somewhere along the line the content type of the included file (html or jsp) is being set, and that this setting takes precedence over the page directives I have in the including page.

I've tried setting everything I can think of to UTF-8 (file encoding, response and request objects), and I've checked that the JSP compiler should be compiling using UTF-8 (I'm using tomcat 4.1.29), even though this shouldn't really affect an included html file, but I can't seem to get the included file encoded correctly.

Does anyone know what setting is responsible for the response.setContentType line inserted by jasper, or have any further ideas that I could investigate?

Many thanks,
..camilla
Re: Setting UTF-8 Encoding
I am sorry. It has something to do with struts. Coz if I place my JSP in examples or tomcat-docs it works fine. Somehow Struts is messing things up. Or i havent configured things properly. Affan - Original Message - From: "Daniel Brown" <[EMAIL PROTECTED]> To: "Tomcat Users List" <[EMAIL PROTECTED]> Sent: Friday, January 31, 2003 5:50 PM Subject: RE: Setting UTF-8 Encoding > Affan, > > The encoding is set just fine. If I copy and paste your JSP, and run it > here, I get the following as the content type in the HTTP headers: > > Content-Type: text/html; charset=UTF-8 > > You're seeing empty squares where you'd expect characters for a couple of > reasons: > > The Unicode escape for trademark is \u2122, according to the HTML 4.01 spec. > The raw copyright characters in your document are in ISO-8859-1, not UTF-8. > > If you replace \u0099 by \u2122, it works just fine. > > Alternatively, why not use ™, ©, and ® instead? > > HTH, > > Dan. > > > -Original Message- > > From: Affan Qureshi [mailto:[EMAIL PROTECTED]] > > Sent: 31 January 2003 11:38 > > To: Tomcat Users List > > Subject: Setting UTF-8 Encoding > > > > > > I am having trouble setting the encoding to UTF-8 and hence my > > web pages are > > unable to render characters like the Trademark or Copyright symbols. In > > Tomcat's source at various places teh character encoding is > > hard-coded to be > > ISO-8859-1. I have tried to use the filter in the examples to set the > > encoding type but that did not help and I kept seeing questionamarks for > > those characters. I have also tried to modify the source and > > build again but > > that doesn't work either (I know I must be doing something wrong here.) > > > > Somehow tomcat doesn't allow me to change the character encoding to UTF-8. > > The same JSPs are looking fine on Weblogic and Resin without any > > configuration/modification to the server settings. > > > > Any ideas how can I fix this ugly problem in my app. The app is unusable > > without this. 
> > > > Thanks a lot. > > > > Affan > > > > > > - > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Setting UTF-8 Encoding
You have to keep in mind that there are several levels at which mischief with international characters can happen.

- Generating java for the jsp
  You can verify this by looking at the generated source in the work directory. Does it look like you expect?
- Compiling the generated java
  I don't know how you can control which encoding is used to compile the source. (As I'm not using tomcat much, I have not had to deal with this yet.)
- Handling at runtime
- Setting no/wrong headers
- The browser.

As your code works in other engines, these levels are the ones which are least likely to be the cause of the problem.

My guess is that it is #2 that is causing the pain.

> -Original Message-
> From: Affan Qureshi [mailto:[EMAIL PROTECTED]]
> Sent: Friday, January 31, 2003 1:47 PM
> To: Tomcat Users List
> Subject: Re: Setting UTF-8 Encoding
>
>
> Locale object is set to to "en_US" by default. And I am using
> Tomcat 4.1.18
> on Win2K. i have also tried the same on SunSOLARIS and Linux.
>
> If I use servlets instead of JSP it works fine and output the
> characters as
> required. But i guess its the JSPWriter that does something
> which shows teh
> question marks in place of those characters.
>
> However, the same JSP code works in for Resin and Weblogic.
>
> Baffled
>
> Affan
>
> - Original Message -
> From: "Masood Ahmed" <[EMAIL PROTECTED]>
> To: "Tomcat Users List" <[EMAIL PROTECTED]>
> Sent: Friday, January 31, 2003 5:13 PM
> Subject: Re: Setting UTF-8 Encoding
>
>
> > Have you tried setting the locale directly on the
> > request object? See if that helps.
> >
> > What version of tomcat are you using?
> >
> > thanks,
> > -Masood
> >
> > --- Affan Qureshi <[EMAIL PROTECTED]> wrote:
> > > I forgot to paste my code which is there at the
> > > bottom now.
> > >
> > > > I am having trouble setting the encoding to UTF-8
> > > and hence my web pages
> > > are
> > > > unable to render characters like the Trademark or
> > > Copyright symbols.
In > > > > Tomcat's source at various places teh character > > > encoding is hard-coded to > > > be > > > > ISO-8859-1. I have tried to use the filter in the > > > examples to set the > > > > encoding type but that did not help and I kept > > > seeing questionamarks for > > > > those characters. I have also tried to modify the > > > source and build again > > > but > > > > that doesn't work either (I know I must be doing > > > something wrong here.) > > > > > > > > Somehow tomcat doesn't allow me to change the > > > character encoding to UTF-8. > > > > The same JSPs are looking fine on Weblogic and > > > Resin without any > > > > configuration/modification to the server settings. > > > > > > > > Any ideas how can I fix this ugly problem in my > > > app. The app is unusable > > > > without this. > > > > > > > > Thanks a lot. > > > > > > > > Affan > > > > > > Here is my code for the Test JSP: > > > <%@page contentType="text/html; charset=UTF-8"%> > > > > > > Test JSP > > > > > > <% out.println('\u00A9'); %> > > > <% System.out.println("This © is test");%> > > > > > > <% out.println("This ° is test"); %> > > > > > > <% out.println("This © is test"); %> > > > > > > <% out.println("This \u00A9 is test"); %> <%= "©"%> > > > > > > <% out.println("This \u00B0 is test"); %> > > > > > > <% out.println("This \u00AE is test"); %> > > > > > > <% out.println("This \u0099 is test"); %> > > > > > > <% out.println("This \u00F6 is test"); %> > > > <% out.flush(); %> > > > > > > > > > > > > > > > > > > - > > > To unsubscribe, e-mail: > > > [EMAIL PROTECTED] > > > For additional commands, e-mail: > > > [EMAIL PROTECTED] > > > > > > > > > __ > > Do you Yahoo!? > > Yahoo! Mail Plus - Powerful. Affordable. Sign up now. 
> > http://mailplus.yahoo.com > > > > > - > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Setting UTF-8 Encoding
Thanks for your help.

> Affan,
>
> The encoding is set just fine. If I copy and paste your JSP, and run it
> here, I get the following as the content type in the HTTP headers:
>
> Content-Type: text/html; charset=UTF-8
>
> You're seeing empty squares where you'd expect characters for a couple of
> reasons:
>
> The Unicode escape for trademark is \u2122, according to the HTML 4.01 spec.
> The raw copyright characters in your document are in ISO-8859-1, not UTF-8.
> If you replace \u0099 by \u2122, it works just fine.

Unfortunately I still get the same '?' in place of those characters even after replacing it.

> Alternatively, why not use &trade;, &copy;, and &reg; instead?

The problem is that the data entry guys will copy/paste these symbols from web pages, and they will go in just as they come. When viewing data from the database, the JSP has to recognize the characters and display them accordingly. Even if I use a filter to replace these characters, it won't help, because the JspWriter will already have placed the '?' in the stream being sent to the browser.

The same thing works for servlets but not for JSPs. I read that the writer in servlets uses the content type set in the request to determine the encoding, while JspWriter uses the system settings of the Locale or something. Nasty problem, isn't it?

> HTH,
>
> Dan.

Affan

> > -Original Message-
> > From: Affan Qureshi [mailto:[EMAIL PROTECTED]]
> > Sent: 31 January 2003 11:38
> > To: Tomcat Users List
> > Subject: Setting UTF-8 Encoding
> >
> >
> > I am having trouble setting the encoding to UTF-8 and hence my
> > web pages are
> > unable to render characters like the Trademark or Copyright symbols. In
> > Tomcat's source at various places teh character encoding is
> > hard-coded to be
> > ISO-8859-1. I have tried to use the filter in the examples to set the
> > encoding type but that did not help and I kept seeing questionamarks for
> > those characters.
I have also tried to modify the source and > > build again but > > that doesn't work either (I know I must be doing something wrong here.) > > > > Somehow tomcat doesn't allow me to change the character encoding to UTF-8. > > The same JSPs are looking fine on Weblogic and Resin without any > > configuration/modification to the server settings. > > > > Any ideas how can I fix this ugly problem in my app. The app is unusable > > without this. > > > > Thanks a lot. > > > > Affan > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
RE: Setting UTF-8 Encoding
Affan,

The encoding is set just fine. If I copy and paste your JSP, and run it here, I get the following as the content type in the HTTP headers:

Content-Type: text/html; charset=UTF-8

You're seeing empty squares where you'd expect characters for a couple of reasons:

The Unicode escape for trademark is \u2122, according to the HTML 4.01 spec.
The raw copyright characters in your document are in ISO-8859-1, not UTF-8.

If you replace \u0099 by \u2122, it works just fine.

Alternatively, why not use &trade;, &copy;, and &reg; instead?

HTH,

Dan.

> -Original Message-
> From: Affan Qureshi [mailto:[EMAIL PROTECTED]]
> Sent: 31 January 2003 11:38
> To: Tomcat Users List
> Subject: Setting UTF-8 Encoding
>
>
> I am having trouble setting the encoding to UTF-8 and hence my
> web pages are
> unable to render characters like the Trademark or Copyright symbols. In
> Tomcat's source at various places teh character encoding is
> hard-coded to be
> ISO-8859-1. I have tried to use the filter in the examples to set the
> encoding type but that did not help and I kept seeing questionamarks for
> those characters. I have also tried to modify the source and
> build again but
> that doesn't work either (I know I must be doing something wrong here.)
>
> Somehow tomcat doesn't allow me to change the character encoding to UTF-8.
> The same JSPs are looking fine on Weblogic and Resin without any
> configuration/modification to the server settings.
>
> Any ideas how can I fix this ugly problem in my app. The app is unusable
> without this.
>
> Thanks a lot.
>
> Affan
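Dan's point about \u0099 versus \u2122 can be checked with a few lines of Java. The sketch below uses only the standard library, and the class and method names are illustrative. The suggestion that \u0099 arose from the Windows-1252 byte 0x99 is an assumption about where the value came from, not something stated in the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only; names are not from the thread.
class TrademarkCodePoint {

    // Space-separated hex dump of the UTF-8 encoding of a string.
    public static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decode a single byte using the named single-byte charset.
    public static char decodeByte(int value, String charsetName) {
        byte[] one = { (byte) value };
        return new String(one, Charset.forName(charsetName)).charAt(0);
    }

    public static void main(String[] args) {
        // U+0099 is a C1 control character, not a printable symbol.
        System.out.println(Character.isISOControl('\u0099'));
        // Byte 0x99 maps to the trademark sign in Windows-1252 ...
        System.out.println(decodeByte(0x99, "windows-1252") == '\u2122');
        // ... but to the control character U+0099 in ISO-8859-1.
        System.out.println(decodeByte(0x99, "ISO-8859-1") == '\u0099');
        // The real trademark sign U+2122 takes three bytes in UTF-8.
        System.out.println(utf8Hex("\u2122"));
    }
}
```

This is consistent with Dan's advice: a page that declares UTF-8 should emit U+2122 for the trademark sign, since a browser has nothing to draw for U+0099.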
Re: Setting UTF-8 Encoding
Locale object is set to to "en_US" by default. And I am using Tomcat 4.1.18 on Win2K. i have also tried the same on SunSOLARIS and Linux. If I use servlets instead of JSP it works fine and output the characters as required. But i guess its the JSPWriter that does something which shows teh question marks in place of those characters. However, the same JSP code works in for Resin and Weblogic. Baffled Affan - Original Message - From: "Masood Ahmed" <[EMAIL PROTECTED]> To: "Tomcat Users List" <[EMAIL PROTECTED]> Sent: Friday, January 31, 2003 5:13 PM Subject: Re: Setting UTF-8 Encoding > Have you tried setting the locale directly on the > request object? See if that helps. > > What version of tomcat are you using? > > thanks, > -Masood > > --- Affan Qureshi <[EMAIL PROTECTED]> wrote: > > I forgot to paste my code which is there at the > > bottom now. > > > > > I am having trouble setting the encoding to UTF-8 > > and hence my web pages > > are > > > unable to render characters like the Trademark or > > Copyright symbols. In > > > Tomcat's source at various places teh character > > encoding is hard-coded to > > be > > > ISO-8859-1. I have tried to use the filter in the > > examples to set the > > > encoding type but that did not help and I kept > > seeing questionamarks for > > > those characters. I have also tried to modify the > > source and build again > > but > > > that doesn't work either (I know I must be doing > > something wrong here.) > > > > > > Somehow tomcat doesn't allow me to change the > > character encoding to UTF-8. > > > The same JSPs are looking fine on Weblogic and > > Resin without any > > > configuration/modification to the server settings. > > > > > > Any ideas how can I fix this ugly problem in my > > app. The app is unusable > > > without this. > > > > > > Thanks a lot. 
> > > > > > Affan > > > > Here is my code for the Test JSP: > > <%@page contentType="text/html; charset=UTF-8"%> > > > > Test JSP > > > > <% out.println('\u00A9'); %> > > <% System.out.println("This © is test");%> > > > > <% out.println("This ° is test"); %> > > > > <% out.println("This © is test"); %> > > > > <% out.println("This \u00A9 is test"); %> <%= "©"%> > > > > <% out.println("This \u00B0 is test"); %> > > > > <% out.println("This \u00AE is test"); %> > > > > <% out.println("This \u0099 is test"); %> > > > > <% out.println("This \u00F6 is test"); %> > > <% out.flush(); %> > > > > > > > > > > > - > > To unsubscribe, e-mail: > > [EMAIL PROTECTED] > > For additional commands, e-mail: > > [EMAIL PROTECTED] > > > > > __ > Do you Yahoo!? > Yahoo! Mail Plus - Powerful. Affordable. Sign up now. > http://mailplus.yahoo.com > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Setting UTF-8 Encoding
Have you tried setting the locale directly on the request object? See if that helps. What version of tomcat are you using? thanks, -Masood --- Affan Qureshi <[EMAIL PROTECTED]> wrote: > I forgot to paste my code which is there at the > bottom now. > > > I am having trouble setting the encoding to UTF-8 > and hence my web pages > are > > unable to render characters like the Trademark or > Copyright symbols. In > > Tomcat's source at various places teh character > encoding is hard-coded to > be > > ISO-8859-1. I have tried to use the filter in the > examples to set the > > encoding type but that did not help and I kept > seeing questionamarks for > > those characters. I have also tried to modify the > source and build again > but > > that doesn't work either (I know I must be doing > something wrong here.) > > > > Somehow tomcat doesn't allow me to change the > character encoding to UTF-8. > > The same JSPs are looking fine on Weblogic and > Resin without any > > configuration/modification to the server settings. > > > > Any ideas how can I fix this ugly problem in my > app. The app is unusable > > without this. > > > > Thanks a lot. > > > > Affan > > Here is my code for the Test JSP: > <%@page contentType="text/html; charset=UTF-8"%> > > Test JSP > > <% out.println('\u00A9'); %> > <% System.out.println("This © is test");%> > > <% out.println("This ° is test"); %> > > <% out.println("This © is test"); %> > > <% out.println("This \u00A9 is test"); %> <%= "©"%> > > <% out.println("This \u00B0 is test"); %> > > <% out.println("This \u00AE is test"); %> > > <% out.println("This \u0099 is test"); %> > > <% out.println("This \u00F6 is test"); %> > <% out.flush(); %> > > > > > - > To unsubscribe, e-mail: > [EMAIL PROTECTED] > For additional commands, e-mail: > [EMAIL PROTECTED] > __ Do you Yahoo!? Yahoo! Mail Plus - Powerful. Affordable. Sign up now. http://mailplus.yahoo.com - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Setting UTF-8 Encoding
I forgot to paste my code which is there at the bottom now. > I am having trouble setting the encoding to UTF-8 and hence my web pages are > unable to render characters like the Trademark or Copyright symbols. In > Tomcat's source at various places teh character encoding is hard-coded to be > ISO-8859-1. I have tried to use the filter in the examples to set the > encoding type but that did not help and I kept seeing questionamarks for > those characters. I have also tried to modify the source and build again but > that doesn't work either (I know I must be doing something wrong here.) > > Somehow tomcat doesn't allow me to change the character encoding to UTF-8. > The same JSPs are looking fine on Weblogic and Resin without any > configuration/modification to the server settings. > > Any ideas how can I fix this ugly problem in my app. The app is unusable > without this. > > Thanks a lot. > > Affan Here is my code for the Test JSP: <%@page contentType="text/html; charset=UTF-8"%> Test JSP <% out.println('\u00A9'); %> <% System.out.println("This © is test");%> <% out.println("This ° is test"); %> <% out.println("This © is test"); %> <% out.println("This \u00A9 is test"); %> <%= "©"%> <% out.println("This \u00B0 is test"); %> <% out.println("This \u00AE is test"); %> <% out.println("This \u0099 is test"); %> <% out.println("This \u00F6 is test"); %> <% out.flush(); %> - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Setting UTF-8 Encoding
I am having trouble setting the encoding to UTF-8, and hence my web pages are unable to render characters like the Trademark or Copyright symbols. In Tomcat's source the character encoding is hard-coded to ISO-8859-1 in various places. I have tried to use the filter in the examples to set the encoding type, but that did not help and I kept seeing question marks for those characters. I have also tried to modify the source and build again, but that doesn't work either (I know I must be doing something wrong here).

Somehow Tomcat doesn't allow me to change the character encoding to UTF-8. The same JSPs look fine on Weblogic and Resin without any configuration or modification to the server settings.

Any ideas how I can fix this ugly problem in my app? The app is unusable without this.

Thanks a lot.

Affan
Re: Switching on UTF-8 Encoding
On Fri, 8 Feb 2002, Antony Stace wrote:

> If it is any use, I printed out the request and response encoding to a log file in the servlet,
>
> request.getCharacterEncoding() = null
> response.getCharacterEncoding() = ISO-8859-1

This means that your browser didn't include a character encoding in its Content-Type header on the form submission (sadly typical, unfortunately).

If you know that you're running in a Servlet 2.3 environment (like Tomcat 4), you can call request.setCharacterEncoding() *before* calling any of the getParameter() methods, and Tomcat will do the translation for you. One approach to this is to use a Filter. An example filter that does this sort of thing (SetCharacterEncodingFilter) is included in the WEB-INF/classes of the example webapp that is shipped with Tomcat 4.

Craig
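To make the null result concrete, here is a minimal sketch of how a charset can be pulled out of a Content-Type header value. ContentTypeCharset and charsetOf are hypothetical names for illustration, not part of the Servlet API and not Tomcat's actual parsing code; the point is only that a header such as "application/x-www-form-urlencoded" carries no charset parameter, so there is nothing to report.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of the Servlet API. */
final class ContentTypeCharset {

    /** Returns the charset parameter of a Content-Type value, or null if none is present. */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; any parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // A typical browser form post: no charset parameter at all.
        System.out.println(charsetOf("application/x-www-form-urlencoded"));
        // What the container would need in order to decode without guessing.
        System.out.println(charsetOf("application/x-www-form-urlencoded; charset=UTF-8"));
    }
}
```

When the helper returns null, the application has to supply the encoding itself, which is what request.setCharacterEncoding() does.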
Re: Switching on UTF-8 Encoding
Hi Tony,

The issue may be in any of these places:

1. Request object - Jeff has covered the issue.
2. Database I/O - You have to find out what type of Unicode encoding the database supports (UTF-8 or UCS-2). If it is UCS-2, then you have to convert the data into UTF-8 at the Java end.
3. The JSP's encoding should be set to UTF-8, as mentioned by Jeff.

Moreover, the browser should have access to the appropriate fonts to show the data.

Regards,
Karthik
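For the database point, a rough sketch of a UCS-2 to UTF-8 conversion in Java follows. It assumes the application receives raw big-endian UCS-2 bytes, which is an assumption for illustration: many JDBC drivers already return java.lang.String values, in which case no manual conversion is needed. For characters in the Basic Multilingual Plane, UCS-2 and UTF-16 use the same byte sequences, so the UTF-16BE charset is used here. The class and method names are made up.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration. */
final class Ucs2ToUtf8 {

    /** Decodes big-endian UTF-16 (UCS-2 compatible) bytes and re-encodes the text as UTF-8. */
    static byte[] utf16BeToUtf8(byte[] utf16Be) {
        String text = new String(utf16Be, StandardCharsets.UTF_16BE);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00A9 COPYRIGHT SIGN as two big-endian bytes.
        byte[] copyrightUcs2 = {0x00, (byte) 0xA9};
        for (byte b : utf16BeToUtf8(copyrightUcs2)) {
            System.out.printf("%02X ", b);   // prints "C2 A9 "
        }
        System.out.println();
    }
}
```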
Re: Switching on UTF-8 Encoding
Thanks Jeff, Timothy, Craig for your replies.

I have a situation where I have a form which is in UTF-8 format. In the servlet (I am actually using Struts), when I am processing a user request I use

    name = userForm.getName(); // Struts saves the information from a form in a Bean
    name = new String(name.getBytes(), "UTF-8");

I can then save the name value in a database without problems.

I then use the contents of the Bean to write output in a JSP file, but I get garbage. Does this mean that the format of the data in the Bean is incorrect? Should the values in this bean be written in a different format?

If it is any use, I printed out the request and response encoding to a log file in the servlet:

    request.getCharacterEncoding() = null
    response.getCharacterEncoding() = ISO-8859-1

Cheers
Tony
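One detail in the snippet from this message is worth a sketch: name.getBytes() with no argument uses the JVM's default charset, so the round trip only recovers the text when that default happens to map each character back to the original byte, as ISO-8859-1 does. The example below passes the charset explicitly to show how the result changes. Redecode and redecodeAsUtf8 are hypothetical names, and the sample value assumes the container decoded the UTF-8 form data as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration. */
final class Redecode {

    /** Turns the string back into bytes using `assumed`, then decodes those bytes as UTF-8. */
    static String redecodeAsUtf8(String misdecoded, Charset assumed) {
        return new String(misdecoded.getBytes(assumed), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00F6 sent as UTF-8 bytes C3 B6, but decoded as ISO-8859-1 into two characters.
        String misdecoded = "\u00C3\u00B6";

        // ISO-8859-1 restores the original bytes, so the text is recovered.
        System.out.println(redecodeAsUtf8(misdecoded, StandardCharsets.ISO_8859_1).equals("\u00F6")); // true

        // US-ASCII cannot represent the characters; both become '?'.
        System.out.println(redecodeAsUtf8(misdecoded, StandardCharsets.US_ASCII).equals("??")); // true

        // With UTF-8 the round trip changes nothing; the text stays garbled.
        System.out.println(redecodeAsUtf8(misdecoded, StandardCharsets.UTF_8).equals(misdecoded)); // true
    }
}
```

Naming the charset explicitly keeps the behaviour the same on every platform, whatever the default encoding is.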
Re: Switching on UTF-8 Encoding
I did it by using a filter; it works quite well.

From Timothy
Re: Switching on UTF-8 Encoding
On Thu, 7 Feb 2002, Antony Stace wrote:

> What do I need to do so that data returned from Tomcat 4 is returned in UTF-8 encoding to a requesting browser and
> requests received are read as UTF-8.

For writing UTF-8 content, your servlet needs to set the character encoding *before* it gets the response's writer:

    response.setContentType("text/html;charset=UTF-8");
    PrintWriter writer = response.getWriter();
    writer.println("This line will be written in UTF-8");

For reading, the browser should have set a character encoding on its "Content-Type" header. If it didn't (or if this is a GET request and you are trying to process query string parameters), call the following *before* calling any of the request.getParameter methods (or request.getReader):

    request.setCharacterEncoding("UTF-8");

Note that this method was added in Servlet 2.3, so it won't work in Tomcat 3.x environments.

Craig
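The reason the order matters can be illustrated outside a servlet container with plain java.io classes. This is an analogy under stated assumptions, not the container's real code: a writer is bound to a charset when it is created, so a choice made afterwards cannot affect it. WriterCharsetDemo and write are made-up names. The example also shows how a character with no ISO-8859-1 mapping, such as the trademark sign U+2122, ends up as a question mark.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo class; stands in for a response and its writer. */
final class WriterCharsetDemo {

    /** Writes text through a writer bound to the given charset and returns the bytes produced. */
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // The charset is fixed here, at the moment the writer is created.
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String trademark = "\u2122";
        // UTF-8 encodes U+2122 as three bytes: E2 84 A2.
        System.out.println(write(trademark, StandardCharsets.UTF_8).length);   // 3
        // ISO-8859-1 has no mapping for U+2122, so the encoder substitutes '?' (byte 63).
        System.out.println(write(trademark, StandardCharsets.ISO_8859_1)[0]);  // 63
    }
}
```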
Re: Switching on UTF-8 Encoding
You can use <%@ page contentType="text/html;charset=UTF-8" %> in the JSP or alternatively include the tag in your HTML. This will tell the browser to use the UTF-8 encoding.

Then when getting the requests, you can do a request.setCharacterEncoding("UTF-8") before getting anything from the request to allow you to read in parameters as UTF-8. You could also try just reading in the parameters without setting that, and then doing param.getBytes("UTF-8").

I've been struggling with some encoding issues for a little while now, but I have it working, so if you have any other questions, please feel free to email me and I'll see if I can help.

Good luck,
-Jeff
Switching on UTF-8 Encoding
Hi

What do I need to do so that data returned from Tomcat 4 is returned in UTF-8 encoding to a requesting browser and requests received are read as UTF-8.

Cheers
Tony
RE: Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?
This is the way it is supposed to work. The default form submission encoding is application/x-www-form-urlencoded (which you'll notice is what got sent in the Content-Type header). This means that all non-ASCII data is going to get URL-encoded using %HH (where H is a single hex digit). Your single input character got turned into Unicode and then encoded into UTF-8, which turned it into 3 bytes. These three bytes were then URL-encoded and sent to the servlet.

You'll also notice that nothing in the POST request sent to the servlet indicates the character encoding. There is no way for the servlet container to convert this data from the three bytes it receives back into characters, because nothing supplies the appropriate encoding. This is not the fault of the container; it's a major hole in the HTTP and HTML specifications that makes any I18n effort a royal pain in the a**.

There are a couple of ways to decode the data, but what I use is something like this:

    sValue = new String(sOriginal.getBytes("8859_1"), sEncoding);

where sEncoding is the encoding used in the client (e.g. Shift_JIS). You can't determine sEncoding a priori. You'll need to either assume that all data sent to your application is in a given encoding or pass the correct encoding in a hidden form field, etc.
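A minimal sketch of the hidden-form-field variant of this workaround, using the java.nio.charset API from later JDKs rather than the "8859_1" alias. ParamRedecoder, redecode and the declaredEncoding argument are hypothetical names, and falling back to UTF-8 when the field is missing or unusable is an assumption made for this example, not something the reply prescribes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of any servlet container. */
final class ParamRedecoder {

    /**
     * Re-decodes a parameter value that the container decoded as ISO-8859-1.
     * declaredEncoding is the value of a hidden form field naming the client encoding;
     * UTF-8 is the assumed fallback when it is null, illegal or unsupported.
     */
    static String redecode(String containerValue, String declaredEncoding) {
        Charset client = StandardCharsets.UTF_8;
        try {
            if (declaredEncoding != null && Charset.isSupported(declaredEncoding)) {
                client = Charset.forName(declaredEncoding);
            }
        } catch (IllegalArgumentException e) {
            // Illegal charset name: keep the fallback.
        }
        // ISO-8859-1 maps U+0000..U+00FF one-to-one onto bytes, so this restores the raw bytes.
        byte[] rawBytes = containerValue.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, client);
    }

    public static void main(String[] args) {
        // The three characters the servlet saw for the submitted U+201D.
        String seenByServlet = "\u00E2\u0080\u009D";
        System.out.println(redecode(seenByServlet, "UTF-8").equals("\u201D")); // true
    }
}
```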
Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?
Consider a form that is encoded in UTF-8. Here's how it comes down:

    HTTP/1.0 200 OK
    Content-Type: text/html; charset=UTF-8
    Servlet-Engine: Tomcat Web Server/3.2.1 (JSP 1.1; Servlet 2.2; Java 1.3.0; AIX 4.3 ppc; java.vendor=IBM Corporation)

    http://www.w3.org/TR/html4/DTD/loose.dtd">
    ...
    ...
    ...

I fill in the "usr" field with a single character, U+201D, and submit. Here's how the submission goes up:

    POST /servlet/SusrReg HTTP/1.1
    Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-comet, application/pdf, */*
    Referer: http://9.2.43.70:8085/servlet/SusrReg
    Accept-Language: en-us
    Content-Type: application/x-www-form-urlencoded
    Accept-Encoding: gzip, deflate
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
    Host: 9.2.43.70:8085
    Content-Length: 165
    Connection: Keep-Alive
    Cookie: JSESSIONID=loj2w5hcz1

    usr=%E2%80%9D&B1=Submit

In my servlet, I find the value of the request parameter named "usr" is a string of three characters: U+00E2, U+0080, U+009D. Should I be offended, or expect that the servlet should have to decode the UTF-8? I find the servlet spec v2.2 fairly silent on the issue, leading me to expect that the servlet container is supposed to handle the full parameter decoding.

Thanks,
Mike
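The three-character result can be reproduced with java.net.URLDecoder alone. The sketch below decodes the same "usr" value twice, once assuming ISO-8859-1 and once assuming UTF-8. FormDecodeDemo and decode are names made up for this example, and it only mimics the observed behaviour; it is not Tomcat 3.2.1's actual decoding code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical demo class for illustration. */
final class FormDecodeDemo {

    /** Decodes a URL-encoded value, interpreting the %HH bytes in the named charset. */
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Each byte becomes its own character: U+00E2, U+0080, U+009D.
        String asLatin1 = decode("%E2%80%9D", "ISO-8859-1");
        // The three bytes are read as one UTF-8 sequence: U+201D.
        String asUtf8 = decode("%E2%80%9D", "UTF-8");
        System.out.println(asLatin1.length()); // 3
        System.out.println(asUtf8.length());   // 1
    }
}
```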