RE: Tomcat and Unicode parameters in URLs ???
Thanks for pointing this out, Larry. Unfortunately, we use Tomcat 4, only because it seems quite a bit faster than the Tomcat 3 series. Thank you though. It looks like I'm going to have to learn how to guess the character set and language.

Thank you,
Soefara

From: Larry Isaacs [EMAIL PROTECTED]
Reply-To: Tomcat Users List [EMAIL PROTECTED]
To: 'Tomcat Users List' [EMAIL PROTECTED]
Subject: RE: Tomcat and Unicode parameters in URLs ???
Date: Tue, 19 Mar 2002 07:51:46 -0500

> If you can live with the Servlet 2.2 spec, Tomcat 3.3 has a workaround for this. The DecodeInterceptor can accept a URL like the following to specify the encoding as part of the URI:
>
>   http://localhost:8080/myapp/index.jsp;charset=UTF-8?param=value
>
> For details, see the charsetAttribute attribute of the DecodeInterceptor:
>
>   http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor
>
> You are welcome to give it a try.
>
> Cheers,
> Larry
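One simple way to approach the guessing mentioned above is sketched here, under stated assumptions: the value arrives percent-encoded, senders use either UTF-8 or one known legacy charset, and ISO-8859-1 is that legacy charset. The class and method names are hypothetical. The sketch tries a strict UTF-8 decode first and falls back only if the bytes are not valid UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat or the Servlet API.
final class CharsetGuesser {

    /** Recovers the raw bytes behind a percent-encoded ASCII value. */
    static byte[] rawBytes(String encoded) {
        try {
            // ISO-8859-1 maps every byte to exactly one char, so this round trip is lossless.
            return URLDecoder.decode(encoded, "ISO-8859-1")
                    .getBytes(StandardCharsets.ISO_8859_1);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // ISO-8859-1 is always available
        }
    }

    /** Decodes as UTF-8 when the bytes are valid UTF-8, otherwise as ISO-8859-1 (assumed fallback). */
    static String decodeGuessing(String encoded) {
        byte[] bytes = rawBytes(encoded);
        CharsetDecoder strictUtf8 = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return strictUtf8.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException notUtf8) {
            return new String(bytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeGuessing("caf%C3%A9")); // valid UTF-8 bytes
        System.out.println(decodeGuessing("caf%E9"));    // not valid UTF-8, falls back
    }
}
```

This only separates UTF-8 from a single fallback. It cannot tell apart two legacy single-byte charsets, since almost any byte sequence is valid in each of them, so an explicit indication from the sender remains more reliable where it is available.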
RE: Tomcat and Unicode parameters in URLs ???
If you can live with the Servlet 2.2 spec, Tomcat 3.3 has a workaround for this. The DecodeInterceptor can accept a URL like the following to specify the encoding as part of the URI:

  http://localhost:8080/myapp/index.jsp;charset=UTF-8?param=value

For details, see the charsetAttribute attribute of the DecodeInterceptor:

  http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor

You are welcome to give it a try.

Cheers,
Larry

-----Original Message-----
From: Craig R. McClanahan [mailto:[EMAIL PROTECTED]]
Sent: Monday, March 18, 2002 8:50 PM
To: Tomcat Users List
Subject: Re: Tomcat and Unicode parameters in URLs ???

> It would be possible if the HTTP specs defined a way to tell the server what language the HTTP URL is encoded in, and if browsers actually sent along that indication. Neither seems to be the case in general -- even on a POST transaction (where the browsers really have no excuse for not including the character encoding in the Content-Type header), many don't. Thus, you're stuck having to figure it out for yourself.
>
> Note that adding a language parameter to the query string isn't going to do you much good -- you have to call setCharacterEncoding() *before* you call request.getParameter(), so you won't have been able to read the language field first.
>
> Craig
Re: Tomcat and Unicode parameters in URLs ???
> Setting the content type, as you did above, only affects the *output* of that particular response -- it has nothing to do with how the next *input* request from that browser will be handled.
>
> In order to deal with request parameters in an incoming request, you must tell Tomcat what encoding to use, *before* processing the parameters. This is done by calling the request.setCharacterEncoding() method that was added in Servlet 2.3. As long as you call this before calling methods like request.getParameter(), the proper encoding will be applied.
>
> One way to do this without modifying your application itself is to use a Filter that looks at incoming requests and decides what encoding should be used -- perhaps by looking at the Accept-Language header, or based on attributes you have stored in the current session that indicate what the user will be supplying.

But what happens if you really do not know what character set to expect? In our company, the web server is used for B2B messaging with customers and not purely for serving web pages. For example, we can accept a message with a query string like this:

  http://vpn.ourcompany.com/servlet/incoming?company=CustomerName&referenceId=1234&noteText=

The noteText could be in one of several languages, since we do international business. We're currently considering adding a language parameter such as Language=English, but it would be nicer to autodetect the language. Is this possible?

Thank you,
Soefara
Re: Tomcat and Unicode parameters in URLs ???
On Tue, 19 Mar 2002, Soefara Redzuan wrote:

> Date: Tue, 19 Mar 2002 09:20:47 +0800
> From: Soefara Redzuan [EMAIL PROTECTED]
> Reply-To: Tomcat Users List [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: Tomcat and Unicode parameters in URLs ???
>
> But what happens if you really do not know what character set to expect? In our company, the web server is used for B2B messaging with customers and not purely for serving web pages. For example, we can accept a message with a query string like this:
>
>   http://vpn.ourcompany.com/servlet/incoming?company=CustomerName&referenceId=1234&noteText=
>
> The noteText could be in one of several languages, since we do international business. We're currently considering adding a language parameter such as Language=English, but it would be nicer to autodetect the language. Is this possible?

It would be possible if the HTTP specs defined a way to tell the server what language the HTTP URL is encoded in, and if browsers actually sent along that indication. Neither seems to be the case in general -- even on a POST transaction (where the browsers really have no excuse for not including the character encoding in the Content-Type header), many don't. Thus, you're stuck having to figure it out for yourself.

Note that adding a language parameter to the query string isn't going to do you much good -- you have to call setCharacterEncoding() *before* you call request.getParameter(), so you won't have been able to read the language field first.

> Thank you,
> Soefara

Craig
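If senders can be asked to include an explicit charset, one possible way around the ordering problem described above is to read that one field from the raw, undecoded query string (available in a servlet through request.getQueryString()) rather than through getParameter(). The following is only a sketch under assumptions: the parameter name "charset" and the helper class are hypothetical, and charset names are assumed to be plain ASCII so they survive without decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper; "charset" is an assumed parameter name, not a standard one.
final class RawQueryDecoder {

    /** Returns the still percent-encoded value of a parameter, or null if absent. */
    static String rawValue(String rawQuery, String name) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq < 0 ? pair : pair.substring(0, eq);
            if (key.equals(name)) {
                return eq < 0 ? "" : pair.substring(eq + 1);
            }
        }
        return null;
    }

    /** Uses the requested charset if this JVM supports it, otherwise the fallback. */
    static String pickCharset(String requested, String fallback) {
        try {
            if (requested != null && Charset.isSupported(requested)) {
                return requested;
            }
        } catch (IllegalCharsetNameException badName) {
            // fall through to the fallback
        }
        return fallback;
    }

    /** Decodes one parameter using the charset named in the same query string. */
    static String parameter(String rawQuery, String name, String fallbackCharset) {
        String raw = rawValue(rawQuery, name);
        if (raw == null) {
            return null;
        }
        String charset = pickCharset(rawValue(rawQuery, "charset"), fallbackCharset);
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String query = "company=Acme&charset=UTF-8&noteText=caf%C3%A9";
        System.out.println(parameter(query, "noteText", "ISO-8859-1"));
    }
}
```

In a filter, the value returned by pickCharset() could instead be passed to request.setCharacterEncoding() before any call to getParameter(), which keeps the rest of the application unchanged.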
Tomcat and Unicode parameters in URLs ???
Hello all,

I tried a million ways of making Tomcat 4.0.3 work with Unicode URL parameters, but nothing seems to work. It always corrupts the parameters. Does anybody know a workaround to make Unicode request parameters work with Tomcat?

For instance, I changed the SnoopServlet example given with Tomcat 4 to output the response in Unicode with setContentType("text/html;charset=utf-8"). But when I type Unicode parameters into the URL text area of Internet Explorer as parameters to SnoopServlet, it always corrupts my parameters. Instead of printing them as two-byte Unicode characters, it prints every Unicode character as two one-byte garbage or otherwise ASCII characters.

I also tried making a URL request using the URL class in Java SDK 1.4. That didn't work either. The URLEncoder and URLDecoder classes in the SDK don't seem to do their job right either. Has anyone been able to make use of these classes?

Any workarounds and bug reports will be greatly appreciated.

Thanks,
Mete
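The symptom described above (one character turning into two garbage characters) is what decoding UTF-8 bytes with a single-byte charset produces. The sketch below reproduces it with the two-argument URLEncoder.encode() and URLDecoder.decode() methods available since Java 1.4. It assumes the client sends UTF-8 and the server decodes as ISO-8859-1. The class name is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; it only uses java.net classes from the JDK.
final class ParameterDecodingDemo {

    /** Percent-encodes text the way a client sending UTF-8 would. */
    static String encodeUtf8(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    /** Decodes a percent-encoded value using the named charset. */
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00e9";                    // e with acute accent, one character
        String encoded = encodeUtf8(original);         // two UTF-8 bytes, percent-encoded
        String wrong = decode(encoded, "ISO-8859-1");  // each byte becomes its own character
        String right = decode(encoded, "UTF-8");       // bytes recombined into one character
        System.out.println(encoded);
        System.out.println("ISO-8859-1: " + wrong.length() + " character(s)");
        System.out.println("UTF-8: " + right.length() + " character(s)");
    }
}
```

The one-argument forms of encode() and decode() use the platform default encoding, which may explain inconsistent results between machines. Passing the charset explicitly avoids that dependency.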
Re: Tomcat and Unicode parameters in URLs ???
On Sat, 16 Mar 2002, Mete Kural wrote:

> Date: Sat, 16 Mar 2002 11:05:08 -0800 (PST)
> From: Mete Kural [EMAIL PROTECTED]
> Reply-To: Tomcat Users List [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Tomcat and Unicode parameters in URLs ???
>
> Hello all,
>
> I tried a million ways of making Tomcat 4.0.3 work with Unicode URL parameters, but nothing seems to work. It always corrupts the parameters. Does anybody know a workaround to make Unicode request parameters work with Tomcat?
>
> For instance, I changed the SnoopServlet example given with Tomcat 4 to output the response in Unicode with setContentType("text/html;charset=utf-8"). But when I type Unicode parameters into the URL text area of Internet Explorer as parameters to SnoopServlet, it always corrupts my parameters. Instead of printing them as two-byte Unicode characters, it prints every Unicode character as two one-byte garbage or otherwise ASCII characters.
>
> I also tried making a URL request using the URL class in Java SDK 1.4. That didn't work either. The URLEncoder and URLDecoder classes in the SDK don't seem to do their job right either. Has anyone been able to make use of these classes?
>
> Any workarounds and bug reports will be greatly appreciated.

Setting the content type, as you did above, only affects the *output* of that particular response -- it has nothing to do with how the next *input* request from that browser will be handled.

In order to deal with request parameters in an incoming request, you must tell Tomcat what encoding to use, *before* processing the parameters. This is done by calling the request.setCharacterEncoding() method that was added in Servlet 2.3. As long as you call this before calling methods like request.getParameter(), the proper encoding will be applied.
One way to do this without modifying your application itself is to use a Filter that looks at incoming requests and decides what encoding should be used -- perhaps by looking at the Accept-Language header, or based on attributes you have stored in the current session that indicate what the user will be supplying.

A very simple example of such a filter is included in the /examples webapp shipped with Tomcat 4 -- the source code for this filter is in file SetCharacterEncodingFilter.java in the $CATALINA_HOME/webapps/examples/WEB-INF/classes/filters subdirectory. This example is fairly simpleminded -- you just configure a filter initialization parameter that is used to set the encoding for all requests -- but you can use it as a starting point for more sophisticated processing by subclassing it and overriding the selectEncoding() method.

This filter can be enabled by copying the appropriate class file to your own WEB-INF/classes directory, and adding a filter definition to your web.xml file:

  <filter>
    <filter-name>Character Encoding Filter</filter-name>
    <filter-class>filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>UTF-8</param-value>
    </init-param>
  </filter>

Then, you select which requests this filter applies to with a filter mapping -- the /* pattern says apply it to *all* requests:

  <filter-mapping>
    <filter-name>Character Encoding Filter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

With filter mappings like this, you can be more selective about which URLs it applies to by using a more precise URL pattern, or apply different filters to different URLs -- all without affecting your servlets or JSP pages at all. The syntax for the url-pattern element in a filter mapping is the same as that used for a servlet mapping.

Be sure to put these elements in the correct places in your web.xml file to maintain the element order that is required by the DTD.
> Thanks,
> Mete

Craig
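The suggestion above of choosing an encoding from the Accept-Language header could be sketched roughly as follows. This is an illustration only: the class name, the language-to-encoding table, and its entries are all hypothetical, and a browser's language preference is at best a hint about the encoding it will send. It uses Locale.LanguageRange, so it assumes Java 8 or later. A subclass of the example filter that overrides selectEncoding() could pass the header value to a helper like this.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; the mapping below is an assumption for illustration only.
final class EncodingChooser {

    private static final Map<String, String> BY_LANGUAGE = new HashMap<>();
    static {
        BY_LANGUAGE.put("ja", "Shift_JIS");
        BY_LANGUAGE.put("ru", "KOI8-R");
        BY_LANGUAGE.put("zh", "Big5");
    }

    /** Picks an encoding for the first listed language that has a table entry. */
    static String chooseEncoding(String acceptLanguage, String defaultEncoding) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return defaultEncoding;
        }
        List<Locale.LanguageRange> ranges;
        try {
            ranges = Locale.LanguageRange.parse(acceptLanguage);
        } catch (IllegalArgumentException malformedHeader) {
            return defaultEncoding;
        }
        for (Locale.LanguageRange range : ranges) {
            // getRange() is lower case, e.g. "ja-jp"; keep only the primary language tag.
            String language = range.getRange().split("-")[0];
            String encoding = BY_LANGUAGE.get(language);
            if (encoding != null) {
                return encoding;
            }
        }
        return defaultEncoding;
    }

    public static void main(String[] args) {
        System.out.println(chooseEncoding("ja-JP,en;q=0.8", "UTF-8"));
        System.out.println(chooseEncoding("en-US", "UTF-8"));
    }
}
```

Keeping the table outside the filter makes it easy to replace with session attributes or per-customer settings, which are likely to be more reliable than the header alone.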