RE: XSLT and international characters in parameters
After a lot of searching I found that all the information was already on the mailing list, but widely scattered. I think the subject 'International characters in Tomcat/Cocoon' would be worth describing in the FAQ, since it is a very common question. Maybe the content below can be a good starting point.

How to set up Tomcat/Cocoon/Java to fully support Unicode (UTF-8)?

1. Problem: Unicode characters in XSP etc. code.
   Add file.encoding to your JVM options (for IBM: -Dfile.encoding=UTF-8).

2. Problem: request parameters (for XSP, XSL, etc.):
   a) always use
      <xsp-request:get-parameter name="yourparamname" form-encoding="UTF-8" container-encoding="UTF-8"/>
   b) add a filter to web.xml; below is part of the mail

RE: Tomcat and Unicode parameters in URLs ???
From: Larry Isaacs

In order to deal with request parameters in an incoming request, you must tell Tomcat what encoding to use, *before* processing the parameters. This is done by calling the request.setCharacterEncoding() method that was added in Servlet 2.3. As long as you call this before calling methods like request.getParameter(), the proper encoding will be applied.

One way to do this without modifying your application itself is to use a Filter that looks at incoming requests and decides what encoding should be used -- perhaps by looking at the Accept-Language header, or based on attributes you have stored in the current session that indicate what the user will be supplying.

A very simple example of such a filter is included in the /examples webapp shipped with Tomcat 4 -- the source code for this filter is in file SetCharacterEncodingFilter.java in the $CATALINA_HOME/webapps/examples/WEB-INF/classes/filters subdirectory. This example is fairly simpleminded -- you just configure a filter initialization parameter that is used to set the encoding for all requests -- but you can use it as a starting point for more sophisticated processing by subclassing it and overriding the selectEncoding() method.
This filter can be enabled by copying the appropriate class file to your own WEB-INF/classes directory, and adding a filter definition to your web.xml file:

    <filter>
      <filter-name>Character Encoding Filter</filter-name>
      <filter-class>filters.SetCharacterEncodingFilter</filter-class>
      <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
      </init-param>
    </filter>

Then, you select which requests this filter applies to with a filter mapping -- the /* pattern says apply it to *all* requests:

    <filter-mapping>
      <filter-name>Character Encoding Filter</filter-name>
      <url-pattern>/*</url-pattern>
    </filter-mapping>

With filter mappings like this, you can be more selective about which URLs it applies to by using a more precise URL pattern, or apply different filters to different URLs -- all without affecting your servlets or JSP pages at all. The syntax for the url-pattern element in a filter mapping is the same as that used for a servlet mapping.

Be sure to put these elements in the correct places in your web.xml file to maintain the element order that is required by the DTD.

I'm sure that it can be done in a simpler or better way. There are probably still some other problems which I didn't find. Please post comments and ideas.

Tomasz

-
Please check that your question has not already been answered in the FAQ before posting.
http://xml.apache.org/cocoon/faqs.html
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
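Where installing a filter is not an option, a parameter that the container has already decoded with the wrong charset can usually be repaired afterwards. The following is only a minimal Java sketch of that idea, assuming the container decoded UTF-8 bytes as ISO-8859-1; the class name ParameterRecoding and the method recode() are hypothetical and do not come from Tomcat or Cocoon. It is presumably similar in spirit to what the form-encoding and container-encoding attributes do, but the thread does not confirm that.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat or Cocoon.
class ParameterRecoding {

    /**
     * Turns the wrongly decoded string back into the bytes the browser sent
     * (using the charset the container applied) and decodes those bytes again
     * with the charset the form was really submitted in.
     */
    static String recode(String value, Charset containerEncoding, Charset formEncoding) {
        if (value == null) {
            return null;
        }
        byte[] raw = value.getBytes(containerEncoding);
        return new String(raw, formEncoding);
    }

    public static void main(String[] args) {
        // The Polish letter U+0142 as a browser would send it from a UTF-8 form.
        byte[] sent = "\u0142".getBytes(StandardCharsets.UTF_8);

        // What the application sees if the container assumes ISO-8859-1.
        String misread = new String(sent, StandardCharsets.ISO_8859_1);

        String repaired = recode(misread, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(repaired.equals("\u0142")); // prints "true"
    }
}
```

The round trip only works because ISO-8859-1 maps every byte to exactly one character, so no information is lost in the wrong decoding step. Calling request.setCharacterEncoding() before the first getParameter() call, as the quoted mail describes, avoids the wrong decoding in the first place.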
RE: XSLT and international characters in parameters
From: Tomasz Bech [mailto:[EMAIL PROTECTED]]

> Hi all,
> I found some requests about this problem, but no answer.
> The problem is: when a form is in UTF-8, the parameters are encoded as UTF-8, but the server always wants to treat them as iso-something. In XSP the remedy is:
>   <xsp-request:get-parameter name="searchtext" form-encoding="UTF-8"/>

^^^ Yup.

> But the problem appears when I try to use the parameters in XSL (as <xsl:param name="searchtext"/>). Any idea how to set the encoding to the proper one?

Try a custom action.

> More questions:
> 2. Why doesn't the server understand the encoding of request parameters?

Sounds like a question for your server. Is it Tomcat? Try the Tomcat docs or user list.

> 3. How to set the encoding of requests globally (for all and forever)?

You can change the encoding of the JVM if you either change the locale of the OS or provide a -D parameter at JVM startup (see the JDK docs).

Vadim

> Thanks,
> Tomasz
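Before changing the OS locale or adding a -D option, it can help to see which encoding the JVM is actually using. A small Java sketch for that check follows; the class name DefaultEncodingCheck is hypothetical, while System.getProperty("file.encoding") and Charset.defaultCharset() are standard JDK calls.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class for inspecting the JVM's default encoding.
class DefaultEncodingCheck {

    /** Reports the file.encoding system property and the effective default charset. */
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once normally and once with java -Dfile.encoding=UTF-8 DefaultEncodingCheck shows whether the startup option took effect in a given JVM. Note that this default concerns the JVM as a whole; whether the servlet container uses it for decoding request parameters is a separate question that the thread leaves to the Tomcat documentation.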
Re: XSLT and international characters.
Hello Marcin,

On Tue, Dec 04, 2001 at 12:33:00PM +0100, Marcin Kłos wrote:
> Hi!
> For several days I have been trying to solve the following problem. I'm sure that somebody has already faced it, but I could not find any answer on the net.
> I use Cocoon 2 for XSLT transformations (Tomcat 4.0.1 glued to Apache using mod_webapp). These are very simple transformations producing HTML output. I pass some parameters in the QUERY STRING to the XSL. Those parameters contain international (Polish) characters, and somewhere during the transformation those characters are converted into rubbish.
> For example, the Polish character %C5%82 is transformed into &Aring;&#130;, i.e., each byte of the two-byte value is treated separately.
> How to overcome this problem?

Use the desired encoding for HTML serialization (in your case iso-8859-2 or utf-8). All files containing national characters should be encoded in UTF-8. IE supports UTF-8, as does Mozilla. You can look at my http://office.ferienwelt.com.pl:8080/cocoon/xml_tests/ where I have used national characters.

Best wishes,
Hubert.
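The symptom Marcin reports can be reproduced outside Cocoon. The Java sketch below decodes the same query-string value, %C5%82, once as ISO-8859-1 and once as UTF-8 using the standard java.net.URLDecoder; the class QueryDecodingDemo and its helper methods are hypothetical names used only for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; only URLDecoder is a real JDK API here.
class QueryDecodingDemo {

    /** Percent-decodes a value, interpreting the resulting bytes in the given charset. */
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    /** Lists the UTF-16 code units of a string, e.g. "U+00C5 U+0082". */
    static String codePoints(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String raw = "%C5%82";
        // Each byte becomes its own character: A-ring (U+00C5) and U+0082 (decimal 130).
        System.out.println(codePoints(decode(raw, "ISO-8859-1"))); // prints "U+00C5 U+0082"
        // Both bytes together form one character, the Polish l with stroke.
        System.out.println(codePoints(decode(raw, "UTF-8")));      // prints "U+0142"
    }
}
```

U+00C5 and U+0082 are exactly the &Aring; and &#130; from the report, which suggests the parameter was decoded as ISO-8859-1 before it ever reached the stylesheet.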
RE: XSLT and international characters.
Marcin,

you may try setting the encoding of the HTML serializer in your sitemap to the one that fits you (in the example below I used, quite understandably, an Italian encoding):

    <map:serializer name="html" mime-type="text/html" src="org.apache.cocoon.serialization.HTMLSerializer">
      <encoding>iso-8859-1</encoding>
    </map:serializer>

Best regards,

Luca Morandini
GIS Consultant
[EMAIL PROTECTED]
http://utenti.tripod.it/lmorandini/index.html

-----Original Message-----
From: Marcin Kłos [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, December 04, 2001 12:33 PM
To: [EMAIL PROTECTED]
Subject: XSLT and international characters.

Hi!
For several days I have been trying to solve the following problem. I'm sure that somebody has already faced it, but I could not find any answer on the net.
I use Cocoon 2 for XSLT transformations (Tomcat 4.0.1 glued to Apache using mod_webapp). These are very simple transformations producing HTML output. I pass some parameters in the QUERY STRING to the XSL. Those parameters contain international (Polish) characters, and somewhere during the transformation those characters are converted into rubbish.
For example, the Polish character %C5%82 is transformed into &Aring;&#130;, i.e., each byte of the two-byte value is treated separately.
How to overcome this problem?

--
Pozdrawiam
Marcin 'Quosoo' Kłos
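When picking the serializer encoding, it is worth checking that the chosen charset can represent the characters in question at all. The Java sketch below does this with the standard CharsetEncoder.canEncode() call; the class name SerializerEncodingCheck and the sample word are assumptions made for illustration, and the sketch says nothing about how Cocoon's HTMLSerializer itself treats characters it cannot encode.

```java
import java.nio.charset.Charset;

// Hypothetical helper for comparing candidate serializer encodings.
class SerializerEncodingCheck {

    /** Returns true if every character of the text exists in the named charset. */
    static boolean canRepresent(String text, String charsetName) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The Polish city name "Lodz" with its diacritics, written with escapes
        // so the example does not depend on the source file encoding.
        String polish = "\u0141\u00f3d\u017a";

        System.out.println(canRepresent(polish, "ISO-8859-1")); // prints "false"
        System.out.println(canRepresent(polish, "ISO-8859-2")); // prints "true"
        System.out.println(canRepresent(polish, "UTF-8"));      // prints "true"
    }
}
```

For Polish text this points to iso-8859-2 or utf-8, as suggested earlier in the thread, rather than the iso-8859-1 value shown in the Italian example.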