RE: Tomcat and Unicode parameters in URLs ???

2002-03-20 Thread Soefara Redzuan

Thanks for pointing this out Larry. Unfortunately we use Tomcat 4 only 
because it seems quite a bit faster than the Tomcat 3 series. Thank you 
though. It looks like I'm going to have to learn how to "guess" the 
character set and language.

Thank you, Soefara.

>From: Larry Isaacs <[EMAIL PROTECTED]>
>Reply-To: "Tomcat Users List" <[EMAIL PROTECTED]>
>To: 'Tomcat Users List' <[EMAIL PROTECTED]>
>Subject: RE: Tomcat and Unicode parameters in URLs ???
>Date: Tue, 19 Mar 2002 07:51:46 -0500
>
>If you can live with the Servlet 2.2 spec, Tomcat 3.3
>has a work around for this.  The DecodeInterceptor can
>accept a URL like the following to specify the encoding
>as part of the URI:
>
>http://localhost:8080/myapp/index.jsp;charset=UTF-8?param=value
>
>For details, see the charsetAttribute attribute of the
>DecodeInterceptor:
>
><http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor>
>
>You are welcome to give it a try.
>
>Cheers,
>Larry
>




--
To unsubscribe:   <mailto:[EMAIL PROTECTED]>
For additional commands: <mailto:[EMAIL PROTECTED]>
Troubles with the list: <mailto:[EMAIL PROTECTED]>




RE: Tomcat and Unicode parameters in URLs ???

2002-03-19 Thread Larry Isaacs

If you can live with the Servlet 2.2 spec, Tomcat 3.3
has a work around for this.  The DecodeInterceptor can
accept a URL like the following to specify the encoding
as part of the URI:

http://localhost:8080/myapp/index.jsp;charset=UTF-8?param=value

For details, see the charsetAttribute attribute of the
DecodeInterceptor:

<http://jakarta.apache.org/tomcat/tomcat-3.3-doc/serverxml.html#DecodeInterceptor>

You are welcome to give it a try.

Cheers,
Larry





Re: Tomcat and Unicode parameters in URLs ???

2002-03-18 Thread Craig R. McClanahan



On Tue, 19 Mar 2002, Soefara Redzuan wrote:

> Date: Tue, 19 Mar 2002 09:20:47 +0800
> From: Soefara Redzuan <[EMAIL PROTECTED]>
> Reply-To: Tomcat Users List <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Re: Tomcat and Unicode parameters in URLs ???
>
>
> >Setting the content type, as you did above, only affects the *output* of
> >that particular response -- it has nothing to do with how the next *input*
> >request from that browser will be handled.
> >
> >In order to deal with request parameters in an incoming request, you must
> >tell Tomcat what encoding to use, *before* processing the parameters.
> >This is done by calling the request.setCharacterEncoding() method that was
> >added in Servlet 2.3.  As long as you call this before calling methods
> >like request.getParameter(), the proper encoding will be applied.
> >
> >One way to do this without modifying your application itself is to use a
> >Filter that looks at incoming requests and decides what encoding should be
> >used -- perhaps by looking at the Accept-Language header, or
> >based on attributes you have stored in the current session that indicate
> >what the user will be supplying.
>
> But what happens if you really do not know what character set to expect ? In
> our company, the webserver is used for B2B messaging with customers and not
> purely serving web pages.  For example, we can accept a message with a query
> string like this
> 
> http://vpn.ourcompany.com/servlet/incoming?company=CustomerName&referenceId=1234&noteText=
>
> The noteText could be in one of several languages since we do international
> business. We're currently considering adding a language parameter such as
> &Language=English but it would be nicer to autodetect the language. Is this
> possible ?
>

It would be possible if the HTTP specs defined a way to tell the server
what language the HTTP URL is encoded in, and if browsers actually sent
along that indication.  Neither seems to be the case in general -- even on
a POST transaction (where the browsers really have no excuse for not
including the character encoding in the Content-Type header), many don't.
Thus, you're stuck having to figure it out for yourself.
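
One common heuristic for "figuring it out" is to check whether the raw
bytes are well-formed UTF-8 and fall back to a legacy charset otherwise.
The sketch below is only an illustration of that idea: the class and
method names are made up, the fallback charset is an assumption you would
have to choose per customer, and it cannot tell one legacy charset from
another.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Tomcat or the servlet API.
final class CharsetGuess {

    // True when the bytes form a well-formed UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Returns "UTF-8" if the bytes validate as UTF-8, else the caller's fallback.
    static String guess(byte[] bytes, String legacyFallback) {
        return isValidUtf8(bytes) ? "UTF-8" : legacyFallback;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xC3, (byte) 0xA9}; // e-acute encoded as UTF-8
        byte[] latin1 = {(byte) 0xE9};            // e-acute encoded as ISO-8859-1
        System.out.println(guess(utf8, "ISO-8859-1"));
        System.out.println(guess(latin1, "ISO-8859-1"));
    }
}
```

Pure ASCII input also validates as UTF-8, which is harmless because ASCII
decodes identically in both cases.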

Note that adding a language parameter to the query string isn't going to
do you much good -- you have to call setCharacterEncoding() *before* you
call request.getParameter(), so you won't have been able to read the
language field first.
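
A possible way around this ordering problem is to skip getParameter()
and parse the raw query string yourself (HttpServletRequest.getQueryString()
returns it undecoded). The following is a rough sketch under assumptions:
the class name and the "charset" parameter are hypothetical, and the
sender must put an ASCII charset name in that parameter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Tomcat or the servlet API.
final class RawQueryParser {

    // Splits "a=1&b=2" into raw (still percent-encoded) name/value pairs.
    static Map<String, String> splitRaw(String rawQuery) {
        Map<String, String> pairs = new LinkedHashMap<String, String>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return pairs;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            pairs.put(name, value);
        }
        return pairs;
    }

    // Decodes all pairs using the charset named by the assumed "charset"
    // parameter, or the fallback when that parameter is missing.
    static Map<String, String> decode(String rawQuery, String fallback) {
        Map<String, String> raw = splitRaw(rawQuery);
        String charset = raw.containsKey("charset") ? raw.get("charset") : fallback;
        Map<String, String> decoded = new LinkedHashMap<String, String>();
        try {
            for (Map.Entry<String, String> e : raw.entrySet()) {
                decoded.put(URLDecoder.decode(e.getKey(), charset),
                            URLDecoder.decode(e.getValue(), charset));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unknown charset: " + charset, ex);
        }
        return decoded;
    }

    public static void main(String[] args) {
        String raw = "company=CustomerName&charset=ISO-8859-1&noteText=caf%E9";
        System.out.println(decode(raw, "UTF-8").get("noteText"));
    }
}
```

The charset name can be read before any decoding because it is plain
ASCII, which is the same in every encoding likely to be in play here.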

> Thank you, Soefara
>

Craig





Re: Tomcat and Unicode parameters in URLs ???

2002-03-18 Thread Soefara Redzuan


>Setting the content type, as you did above, only affects the *output* of
>that particular response -- it has nothing to do with how the next *input*
>request from that browser will be handled.
>
>In order to deal with request parameters in an incoming request, you must
>tell Tomcat what encoding to use, *before* processing the parameters.
>This is done by calling the request.setCharacterEncoding() method that was
>added in Servlet 2.3.  As long as you call this before calling methods
>like request.getParameter(), the proper encoding will be applied.
>
>One way to do this without modifying your application itself is to use a
>Filter that looks at incoming requests and decides what encoding should be
>used -- perhaps by looking at the Accept-Language header, or
>based on attributes you have stored in the current session that indicate
>what the user will be supplying.

But what happens if you really do not know what character set to expect ? In 
our company, the webserver is used for B2B messaging with customers and not 
purely serving web pages.  For example, we can accept a message with a query 
string like this
http://vpn.ourcompany.com/servlet/incoming?company=CustomerName&referenceId=1234&noteText=

The noteText could be in one of several languages since we do international 
business. We're currently considering adding a language parameter such as 
&Language=English but it would be nicer to autodetect the language. Is this 
possible ?

Thank you, Soefara





Re: Tomcat and Unicode parameters in URLs ???

2002-03-16 Thread Craig R. McClanahan



On Sat, 16 Mar 2002, Mete Kural wrote:

> Date: Sat, 16 Mar 2002 11:05:08 -0800 (PST)
> From: Mete Kural <[EMAIL PROTECTED]>
> Reply-To: Tomcat Users List <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]
> Subject: Tomcat and Unicode parameters in URLs ???
>
> Hello all,
>
> I tried a million ways of making Tomcat 4.0.3 work
> with Unicode URL parameters, but nothing seems to
> work. It always corrupts the parameters. Does anybody
> know a workaround to make Unicode request parameters
> work with Tomcat?
>
> For instance, I changed the SnoopServlet example given
> with Tomcat 4 to output the response in Unicode
> with setContentType("text/html;charset=utf-8"). But
> when I write Unicode parameters in the URL text area
> of Internet Explorer as parameters to SnoopServlet, it
> always corrupts my parameters. Instead of printing
> them as two-byte Unicode characters, it prints every
> Unicode character as two one-byte garbage or otherwise
> ASCII characters. I also tried making a URL request
> using the URL class in Java SDK 1.4. That didn't work
> either. The URLEncoder and URLDecoder classes in the
> SDK don't seem to do their job right either. Has anyone
> been able to make use of these classes? Any
> workarounds and bug reports will be greatly
> appreciated.
>

Setting the content type, as you did above, only affects the *output* of
that particular response -- it has nothing to do with how the next *input*
request from that browser will be handled.

In order to deal with request parameters in an incoming request, you must
tell Tomcat what encoding to use, *before* processing the parameters.
This is done by calling the request.setCharacterEncoding() method that was
added in Servlet 2.3.  As long as you call this before calling methods
like request.getParameter(), the proper encoding will be applied.
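
To see why the wrong encoding produces two garbage characters per
character, here is a minimal standalone sketch using the two-argument
URLEncoder.encode() and URLDecoder.decode() methods from the SDK. The
class and method names are made up for the illustration; it does not
involve Tomcat at all.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Illustrative only: class and method names are invented for this sketch.
final class DecodeMismatchDemo {

    // Percent-encode a value the way a UTF-8 client would put it in a URL.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Decode a percent-encoded value using the named charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u00e9";                      // e-acute, two bytes in UTF-8
        String onTheWire = encodeUtf8(original);         // percent-encoded UTF-8 bytes
        String wrong = decode(onTheWire, "ISO-8859-1");  // each byte becomes one char
        String right = decode(onTheWire, "UTF-8");       // bytes recombined correctly
        System.out.println(onTheWire + " wrong length=" + wrong.length()
                + " right length=" + right.length());
    }
}
```

The server side faces the same choice as the decode() call here, which is
why the encoding has to be set before the parameters are parsed.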

One way to do this without modifying your application itself is to use a
Filter that looks at incoming requests and decides what encoding should be
used -- perhaps by looking at the Accept-Language header, or
based on attributes you have stored in the current session that indicate
what the user will be supplying.  A very simple example of such a filter
is included in the "/examples" webapp shipped with Tomcat 4 -- the source
code for this filter is in file SetCharacterEncodingFilter.java in the
$CATALINA_HOME/webapps/examples/WEB-INF/classes/filters subdirectory.
This example is fairly simpleminded -- you just configure a filter
initialization parameter that is used to set the encoding for all requests
-- but you can use it as a starting point for more sophisticated
processing by subclassing it and overriding the selectEncoding() method.
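
As a rough idea of what a more selective selectEncoding() might do, the
sketch below picks the charset out of a Content-Type header value and
falls back to a configured default. It is plain Java with no servlet
classes; EncodingChooser and its method signature are hypothetical and
are not the code shipped in the examples webapp. In a real filter the
result would be passed to request.setCharacterEncoding() before any
parameter is read.

```java
import java.util.Locale;

// Hypothetical helper for illustration; not part of Tomcat or the servlet API.
final class EncodingChooser {

    // Returns the charset named in a Content-Type header value such as
    // "application/x-www-form-urlencoded; charset=Shift_JIS", or the
    // fallback when the header is absent or names no charset.
    static String selectEncoding(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; the remaining parts are parameters.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(selectEncoding("text/plain; charset=Shift_JIS", "UTF-8"));
        System.out.println(selectEncoding("application/x-www-form-urlencoded", "UTF-8"));
    }
}
```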

This filter can be enabled by copying the appropriate class file to your
own WEB-INF/classes directory, and adding a filter definition to your
web.xml file:

  
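  <filter>
    <filter-name>Character Encoding Filter</filter-name>
    <filter-class>filters.SetCharacterEncodingFilter</filter-class>
    <init-param>
      <param-name>encoding</param-name>
      <param-value>UTF-8</param-value>
    </init-param>
  </filter>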
  

Then, you select which requests this filter applies to with a filter
mapping -- the "/*" pattern says apply it to *all* requests:

  
  <filter-mapping>
    <filter-name>Character Encoding Filter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>
  

With filter mappings like this, you can be more selective about which URLs
it applies to by using a more precise URL pattern, or apply different
filters to different URLs -- all without affecting your servlets or JSP
pages at all.  The syntax for the <url-pattern> element in a filter
mapping is the same as that used for a servlet mapping.

Be sure to put these elements in the correct places in your web.xml file
to maintain the element order that is required by the DTD.

> Thanks,
> Mete
>

Craig

