works in Tomcat 6.0.20
http://localhost:8007/sampleweb/params?test=%D8

log output
Servlet init IN doFilter....UTF-8 Ø

U+00D8 Ø c3 98 LATIN CAPITAL LETTER O WITH STROKE
http://www.utf8-chartable.de/unicode-utf8-table.pl

?
Martin
______________________________________________
do not alter or disrupt this transmission.

Date: Tue, 3 Aug 2010 04:18:36 -0700
From: arunbha...@yahoo.com
Subject: Re: UTF-8 encoding in Tomcat 6.0
To: users@tomcat.apache.org

Hello Mark

I have Tomcat version apache-tomcat-6.0.29, which I downloaded from:
http://tomcat.apache.org/
As per my understanding, this version does not come bundled with any
other components, reverse proxies etc. Am I correct?

I wrote a sample application - the source is in SAMPLEWEB-src.zip:

1. Unzip sampleweb.zip into the webapps folder of Tomcat. The
   application is basically just a sample servlet that prints out what
   it gets. There is a filter also attached.

2. Invoke it with a GET:
   http://localhost:8080/sampleweb/params?test=%D8
   The param shows up as ? in the Tomcat console.
   Now send it as %25 and we see %D8.

3. Invoke it with a POST (TestInvoke.html) - entering %D8 in the text
   field, and we see it as %D8 only.

The result is the same if I use the java.net APIs (URLEncoder used to
encode and decode the characters).
The behavior is the same with and without filters. I can see the %xx
character in the Tomcat console without the filter for POST but not for
GET.

Am I sending some parameter wrongly?

Thanks and Regards
Arun

--- On Sun, 8/1/10, Mark Thomas <ma...@apache.org> wrote:

> From: Mark Thomas <ma...@apache.org>
> Subject: Re: UTF-8 encoding in Tomcat 6.0
> To: "Tomcat Users List" <users@tomcat.apache.org>
> Date: Sunday, August 1, 2010, 5:05 AM
>
> On 31/07/2010 17:34, arun kumar wrote:
> >
> > I ran my example webapp on a standalone Tomcat and the behavior was
> > the same:
> > When the param is being sent using GET, I need to send the param as
> > %25xx for it to be read correctly
> > When the method is PUT, %xx works fine.
>
> Then something in your setup is badly broken, evidenced by the fact you
> have to encode the % as %25 to get things to work.
>
> > I believe this is a known issue with Tomcat: I remember reading this
> > on many forums. I believe this is the same behavior that Erik
> > reports.
>
> This is absolutely *not* a Tomcat problem. Tomcat does not behave the
> way you describe. A clean Tomcat install with no other components
> (reverse proxy etc) using the test encoding JSP from the wiki [1] works
> correctly with POST and GET (if URIEncoding="UTF-8" is used).
>
> > Sorry Mark - I did not get what you said. Could you please
> > elaborate?
>
> Decoding is happening twice, i.e.:
> %25xx -> %xx
> %xx -> whatever character
>
> Tomcat absolutely, 100% does not do this. Either your test application
> is doing it or there is another component - such as a reverse proxy - in
> the mix that is doing a second decoding.
>
> This represents a significant security risk. Issues caused by double
> decoding in the past include:
> - XSS
> - source code disclosure
> - authentication bypass
> - directory traversal
>
> It does not mean that these issues will be present, but double decoding
> has been the cause of all of these - and probably more - at various
> points in the past. The details will depend on system configuration but
> seeing an issue like this is certainly indicative that there may well be
> a problem.
>
> Mark
>
> [1] http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q4
>
> > Regards
> > Arun
> >
> > --- On Sat, 7/31/10, Mark Thomas <ma...@apache.org> wrote:
> >
> >> From: Mark Thomas <ma...@apache.org>
> >> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >> To: "Tomcat Users List" <users@tomcat.apache.org>
> >> Date: Saturday, July 31, 2010, 12:18 PM
> >>
> >> On 31/07/2010 15:40, arun kumar wrote:
> >>> Hi Erik
> >>> Thanks very much for your responses.
> >>> I can assure you that I'm interested in this topic even now :).
> >>>
> >>> My scenario is this:
> >>>
> >>> 1. I use a web application that runs in JBOSS.
> >>>
> >>> 2. JBOSS uses a Tomcat web container from what I can see.
> >>>
> >>> 3. To my application, if I pass a UTF-8 encoded value in hex, e.g.:
> >>> http://<server>:<port>/<servlet>/param=%xx...
> >>>
> >>> Then %xx is not decoded properly. I initially used to send the
> >>> request with a Mozilla browser but later started sending it with a
> >>> Java program as well, with the same results.
> >>>
> >>> I tried setting the URI Encoding parameters in the Tomcat
> >>> server.xml - with no success.
> >>> I then set a filter to specifically set the encoding to utf-8 -
> >>> again with no luck - behavior was exactly the same.
> >>>
> >>> But when I sent the param as %25xx (%25 = hex value of the %
> >>> character), it worked fine, but I suspect that the string gets
> >>> stored in ISO 8859 format - like you say: it gets mangled...
> >>
> >> That smells of double-decoding which as well as breaking your app is
> >> also a security risk. I have seen this when a reverse proxy is in
> >> the mix.
> >>
> >> Tomcat will *not* do this on its own.
> >>
> >> Mark
> >>
> >>> I wrote a standalone web application that showed the same behavior.
> >>> I haven't tried with a standalone Tomcat.
> >>>
> >>> I know that we need to take care of the encodings at various points,
> >>> but how can I rule out a problem with my web container
> >>> configuration settings? Or can it be a problem coming from the web
> >>> container itself?
> >>>
> >>> Thanks and regards
> >>> Arun
> >>>
> >>> --- On Fri, 7/30/10, Erik Bunn <e...@memecry.net> wrote:
> >>>
> >>>> From: Erik Bunn <e...@memecry.net>
> >>>> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >>>> To: "Tomcat Users List" <users@tomcat.apache.org>
> >>>> Date: Friday, July 30, 2010, 1:55 PM
> >>>>
> >>>> On 7/30/10 6:33 PM, Christopher Schultz wrote:
> >>>>
> >>>>> If all you want to do is set the character encoding, you can
> >>>>> easily call setCharacterEncoding and be done with it: subclassing
> >>>>> and overriding should not be necessary at all, otherwise nobody
> >>>>> would have written one of these:
> >>>>
> >>>> No, I have other reasons to mess there. Nevertheless, adding a
> >>>> filter is probably less iffy, thanks for pointing that out. TC7
> >>>> provides a suitable example:
> >>>> .../webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
> >>>>
> >>>>> Tomcat versions before 7.x had an option in the <Connector> which
> >>>>> could be used to set the request URI encoding to that of the
> >>>>> Content-Type of the request (useBodyEncodingForURI) and another
> >>>>> option for explicitly and unconditionally setting the encoding to
> >>>>> be used for URI decoding (URIEncoding). I haven't read up on
> >>>>> Tomcat 7 behavior.
> >>>>
> >>>> 7.x Connector has the exact same options. I'll restate, though,
> >>>> that setting the Connector URIEncoding in TC7.x won't currently
> >>>> help when decoding GET parameters in a no-content-type case -
> >>>> without the filter, they will be mangled as ISO-8859-1. If this is
> >>>> different from previous behaviour, maybe I should report a bug.
> >>>>
> >>>> Thanks,
> >>>> //e
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> >>>> For additional commands, e-mail: users-h...@tomcat.apache.org
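
For anyone reading this thread later, here is a rough sketch of the decoding steps discussed above, using only java.net.URLDecoder. It is not taken from anyone's application in the thread: the class and method names (DecodeDemo, decodeOnce, decodeTwice, codePoints) are made up for illustration, and decodeTwice only simulates a hypothetical second decoder (such as a reverse proxy); it is not how Tomcat parses parameters. The point it tries to show is that %D8 is the ISO-8859-1 byte for Ø, while the UTF-8 form is %C3%98 (the "c3 98" in the chart row quoted at the top), and that %25D8 only turns into Ø if something decodes the value a second time.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DecodeDemo {

    // Percent-decode a value exactly once using the named charset.
    static String decodeOnce(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Assumption for illustration: stands in for a second decoder
    // somewhere in the request path (e.g. a proxy or application code).
    static String decodeTwice(String value, String charset) {
        return decodeOnce(decodeOnce(value, charset), charset);
    }

    // Render a string as code points so console encoding cannot hide the result.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // UTF-8 encodes U+00D8 as the two bytes C3 98.
        System.out.println(codePoints(decodeOnce("%C3%98", "UTF-8")));
        // The single byte D8 is U+00D8 in ISO-8859-1 ...
        System.out.println(codePoints(decodeOnce("%D8", "ISO-8859-1")));
        // ... but it is an incomplete sequence in UTF-8, so it cannot decode to U+00D8.
        System.out.println(codePoints(decodeOnce("%D8", "UTF-8")));
        // A single decode of %25D8 yields the literal text %D8.
        System.out.println(decodeOnce("%25D8", "UTF-8"));
        // Only a second decode turns that text into a character.
        System.out.println(codePoints(decodeTwice("%25D8", "ISO-8859-1")));
    }
}
```

If a request sent as ?test=%C3%98 with URIEncoding="UTF-8" on the Connector still arrives mangled, that would point at something other than the client-side encoding.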