RE: Tomcat and Unicode parameters in URLs ???

2002-03-20 Thread Soefara Redzuan

Thanks for pointing this out, Larry. Unfortunately, we use only Tomcat 4,
because it seems quite a bit faster than the Tomcat 3 series. Thank you
though. It looks like I'm going to have to learn how to guess the
character set and language.

Thank you, Soefara.


--
To unsubscribe:   mailto:[EMAIL PROTECTED]
For additional commands: mailto:[EMAIL PROTECTED]
Troubles with the list: mailto:[EMAIL PROTECTED]







RE: Tomcat and Unicode parameters in URLs ???

2002-03-19 Thread Larry Isaacs

If you can live with the Servlet 2.2 spec, Tomcat 3.3
has a workaround for this.  The DecodeInterceptor can
accept a URL like the following to specify the encoding
as part of the URI:

http://localhost:8080/myapp/index.jsp;charset=UTF-8?param=value

For details, see the charsetAttribute attribute of the
DecodeInterceptor:

http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor

You are welcome to give it a try.

Cheers,
Larry






Re: Tomcat and Unicode parameters in URLs ???

2002-03-18 Thread Soefara Redzuan


But what happens if you really do not know what character set to expect? In
our company, the web server is used for B2B messaging with customers, not
just for serving web pages.  For example, we can accept a message with a query
string like this:

http://vpn.ourcompany.com/servlet/incoming?company=CustomerName&referenceId=1234&noteText=

The noteText could be in any of several languages, since we do international
business. We're currently considering adding a language parameter such as
Language=English, but it would be nicer to autodetect the language. Is this
possible?

Thank you, Soefara





Re: Tomcat and Unicode parameters in URLs ???

2002-03-18 Thread Craig R. McClanahan




It would be possible if the HTTP specs defined a way to tell the server
what language the HTTP URL is encoded in, and if browsers actually sent
along that indication.  Neither seems to be the case in general -- even on
a POST transaction (where the browsers really have no excuse for not
including the character encoding in the Content-Type header), many don't.
Thus, you're stuck having to figure it out for yourself.

Note that adding a language parameter to the query string isn't going to
do you much good -- you have to call setCharacterEncoding() *before* you
call request.getParameter(), so you won't have been able to read the
language field first.
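
One possible way around this ordering problem, offered as a sketch under assumptions rather than something proposed in this thread: the raw, still percent-encoded query string (which a servlet can obtain from request.getQueryString()) can be scanned for an ASCII-only hint before any parameter is decoded. The class name QueryCharsetHint and the hint parameter name "charset" are hypothetical, and the code assumes parameter names and the hint value contain only unescaped ASCII.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;

// Hypothetical helper: decodes one parameter from a raw query string,
// using a charset named by another (ASCII-only) parameter in the same string.
final class QueryCharsetHint {

    private QueryCharsetHint() {}

    // Returns the still-encoded value of the named parameter, or null if absent.
    // Assumes parameter names are plain ASCII and not percent-encoded.
    static String rawValue(String rawQuery, String name) {
        if (rawQuery == null) {
            return null;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String key = eq >= 0 ? pair.substring(0, eq) : pair;
            if (key.equals(name)) {
                return eq >= 0 ? pair.substring(eq + 1) : "";
            }
        }
        return null;
    }

    // Uses the hint only if this JVM supports that charset; otherwise the fallback.
    static String chooseCharset(String hint, String fallback) {
        try {
            if (hint != null && Charset.isSupported(hint)) {
                return hint;
            }
        } catch (IllegalArgumentException e) {
            // Illegal charset name: ignore the hint.
        }
        return fallback;
    }

    // Decodes the named parameter with the hinted charset, or with the fallback.
    static String decodeParameter(String rawQuery, String name,
                                  String hintName, String fallback) {
        String raw = rawValue(rawQuery, name);
        if (raw == null) {
            return null;
        }
        String charset = chooseCharset(rawValue(rawQuery, hintName), fallback);
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // Made-up query string; "caf%C3%A9" is "caf\u00e9" percent-encoded as UTF-8.
        String query = "company=CustomerName&referenceId=1234"
                + "&charset=UTF-8&noteText=caf%C3%A9";
        System.out.println(decodeParameter(query, "noteText", "charset", "ISO-8859-1"));
    }
}
```

This only helps when the sender can be asked to include such a hint, and it applies to the query string only, not to POST bodies.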


Craig









Tomcat and Unicode parameters in URLs ???

2002-03-16 Thread Mete Kural

Hello all,

I tried a million ways of making Tomcat 4.0.3 work
with Unicode URL parameters, but nothing seems to
work. It always corrupts the parameters. Does anybody
know a workaround to make Unicode request parameters
work with Tomcat?

For instance, I changed the SnoopServlet example shipped
with Tomcat 4 to output the response in Unicode
with setContentType("text/html;charset=utf-8"). But
when I type Unicode parameters into the address bar
of Internet Explorer as parameters to SnoopServlet, it
always corrupts them. Instead of printing
them as two-byte Unicode characters, it prints every
Unicode character as two one-byte garbage or otherwise
ASCII characters. I also tried making a URL request
using the URL class in Java SDK 1.4. That didn't work
either. The URLEncoder and URLDecoder classes in the
SDK don't seem to do their job right either. Has anyone
been able to make use of these classes? Any
workarounds and bug reports will be greatly
appreciated.

Thanks,
Mete






Re: Tomcat and Unicode parameters in URLs ???

2002-03-16 Thread Craig R. McClanahan




Setting the content type, as you did above, only affects the *output* of
that particular response -- it has nothing to do with how the next *input*
request from that browser will be handled.

In order to deal with request parameters in an incoming request, you must
tell Tomcat what encoding to use, *before* processing the parameters.
This is done by calling the request.setCharacterEncoding() method that was
added in Servlet 2.3.  As long as you call this before calling methods
like request.getParameter(), the proper encoding will be applied.
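
The corruption described in the original question can be reproduced outside Tomcat. The following is a minimal sketch, assuming the client percent-encodes the parameter as UTF-8 and the server falls back to ISO-8859-1 when no request encoding has been set. The class name ParameterDecodingDemo is made up for illustration; only java.net.URLEncoder and java.net.URLDecoder are used.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class: shows how one UTF-8 character turns into two
// garbage characters when its bytes are decoded with the wrong charset.
final class ParameterDecodingDemo {

    private ParameterDecodingDemo() {}

    // Percent-encodes a value the way a UTF-8 client would send it.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decodes a percent-encoded value with an explicitly chosen charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "caf\u00e9";           // "e" with an acute accent at the end
        String onTheWire = encodeUtf8(original); // the accent becomes the two bytes C3 A9
        System.out.println(onTheWire);

        // Wrong charset: each of the two bytes becomes its own character.
        System.out.println(decode(onTheWire, "ISO-8859-1"));

        // Right charset: the two bytes are read back as one character.
        System.out.println(decode(onTheWire, "UTF-8"));
    }
}
```

Calling request.setCharacterEncoding("UTF-8") before the first getParameter() call corresponds to choosing the second decode above instead of the first.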

One way to do this without modifying your application itself is to use a
Filter that looks at incoming requests and decides what encoding should be
used -- perhaps by looking at the Accept-Language header, or
based on attributes you have stored in the current session that indicate
what the user will be supplying.  A very simple example of such a filter
is included in the /examples webapp shipped with Tomcat 4 -- the source
code for this filter is in file SetCharacterEncodingFilter.java in the
$CATALINA_HOME/webapps/examples/WEB-INF/classes/filters subdirectory.
This example is fairly simpleminded -- you just configure a filter
initialization parameter that is used to set the encoding for all requests
-- but you can use it as a starting point for more sophisticated
processing by subclassing it and overriding the selectEncoding() method.
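
As a rough illustration of the "more sophisticated processing" mentioned above, here is a sketch of the decision logic an overriding selectEncoding() might delegate to. It is written as a plain helper so it runs without the servlet API; the class name EncodingSelector and the language-to-charset table are assumptions made up for this example, not part of the Tomcat filter. Note that Accept-Language names a language, not an encoding, so any such mapping is only a guess.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: guesses a request encoding from an Accept-Language value.
final class EncodingSelector {

    // Illustrative table only; a real deployment would choose its own entries.
    private static final Map<String, String> BY_LANGUAGE = new HashMap<>();
    static {
        BY_LANGUAGE.put("ja", "Shift_JIS");
        BY_LANGUAGE.put("ru", "windows-1251");
        BY_LANGUAGE.put("tr", "ISO-8859-9");
    }

    private EncodingSelector() {}

    // Looks only at the first language listed in the header (q-values are
    // ignored) and returns the mapped charset, or the fallback if none matches.
    static String selectEncoding(String acceptLanguage, String fallback) {
        if (acceptLanguage == null) {
            return fallback;
        }
        // "ja-JP,en;q=0.8" -> "ja-JP"
        int comma = acceptLanguage.indexOf(',');
        String first = comma >= 0 ? acceptLanguage.substring(0, comma) : acceptLanguage;
        // "ja-JP;q=0.9" -> "ja-JP"
        int semi = first.indexOf(';');
        if (semi >= 0) {
            first = first.substring(0, semi);
        }
        first = first.trim();
        // "ja-JP" -> "ja"
        int dash = first.indexOf('-');
        String primary = (dash >= 0 ? first.substring(0, dash) : first)
                .toLowerCase(Locale.ROOT);
        String encoding = BY_LANGUAGE.get(primary);
        return encoding != null ? encoding : fallback;
    }

    public static void main(String[] args) {
        System.out.println(selectEncoding("ja-JP,en;q=0.8", "UTF-8"));
        System.out.println(selectEncoding("en-US", "UTF-8"));
    }
}
```

A subclass of the example filter could pass the value of the request's Accept-Language header to this helper, with the filter's configured encoding as the fallback.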

This filter can be enabled by copying the appropriate class file to your
own WEB-INF/classes directory, and adding a filter definition to your
web.xml file:

  <filter>
    <filter-name>Character Encoding Filter</filter-name>
    <filter-class>filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>UTF-8</param-value>
    </init-param>
  </filter>

Then, you select which requests this filter applies to with a filter
mapping -- the /* pattern says apply it to *all* requests:

  <filter-mapping>
    <filter-name>Character Encoding Filter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>

With filter mappings like this, you can be more selective about which URLs
it applies to by using a more precise URL pattern, or apply different
filters to different URLs -- all without affecting your servlets or JSP
pages at all.  The syntax for the <url-pattern> element in a filter
mapping is the same as that used for a servlet mapping.

Be sure to put these elements in the correct places in your web.xml file
to maintain the element order that is required by the DTD.


Craig

