Hi Laurie,

Thanks for your quick answer! Here are my comments.

I tried that first, changing the default encoding (in struts.xml) to utf-8. That works fine in Java and in our web application. The problem is our Sybase database, which is configured for ISO-8859-1. Since our JDBC driver (jconn2) does not convert from utf-8 to iso-8859-1, it throws an exception when trying to update or insert characters it does not understand.
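To illustrate which characters are affected: a value can only be stored as ISO-8859-1 if every character in it has a byte in that charset. The following is a minimal sketch using the standard CharsetEncoder. It says nothing about how jconn2 itself behaves, and the class name Latin1Check and its method fitsLatin1 are made up for this example.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the application discussed in this thread.
final class Latin1Check {

    private Latin1Check() {
    }

    /** Returns true if every character of s can be encoded as ISO-8859-1. */
    public static boolean fitsLatin1(String s) {
        // Encoders are not thread-safe, so create a fresh one per call.
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(s);
    }

    public static void main(String[] args) {
        // \u00e6\u00f8\u00e5 are the Norwegian letters ae, o-slash, a-ring; \u20ac is the Euro sign.
        System.out.println(fitsLatin1("\u00e6\u00f8\u00e5"));
        System.out.println(fitsLatin1("\u20ac"));
    }
}
```

The Norwegian letters are part of ISO-8859-1, while the Euro sign is not, which is why it needs the special handling described next.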
So I had to convert them myself. There is also a special case for the Euro (€) character. It did not exist when iso-8859-1 was created; it was added as part of iso-8859-15. But our Sybase database still only understands iso-8859-1, so a conversion needs to take place. What I did was first convert from utf-8 to iso-8859-15, then from iso-8859-15 to iso-8859-1. Here is the code:

    byte[] characters = charsBeforeConvert.getBytes("iso-8859-15");
    for (int i = 0; i < characters.length; i++) {
        if (characters[i] == (byte) 0xa4) {
            // 0x80 is a control character and has no symbol in iso-8859-1.
            // It is used for € in windows-1252.
            characters[i] = (byte) 0x80;
        }
    }
    return new String(characters, "iso-8859-1");

Kind of a hassle, but it works.

Overriding the setCharacterEncoding method was a good idea. It would let me move my conversion logic from the filter to an interceptor. But then another problem occurs: if I do the conversion in an interceptor, I need to know exactly which parameters have to be converted. We are working on a solution for maintaining CVs. I would then have to do something like this (pseudocode):

- String firstName = request.getParameter("firstName");
- get CV object from the value stack
- firstName = performConversion(firstName)
- cv.setFirstName(firstName)
- put cv back on the value stack

In some cases this would work fine, but I have so many parameters to retrieve and convert that it is not a proper solution. My filter takes care of all request parameters without my having to specify which ones.

To improve my code, I will move the conversion logic to a utility class, so the filter can stay as thin as possible. I will post the entire code if you would like to take a look at it. Any comments would be appreciated!
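As a rough sketch of such a utility class, the conversion could be pulled out of the filter like this. The names CharsetFixer and fixAll are hypothetical and not taken from the actual code base. Unlike the filter code, this version copies the value arrays instead of changing them in place, which is a design choice of the sketch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical utility class; names are illustrative only.
final class CharsetFixer {

    private static final Charset ISO_8859_15 = Charset.forName("ISO-8859-15");
    private static final byte ISO_8859_15_EURO = (byte) 0xa4;
    // 0x80 is unused in ISO-8859-1 and is the Euro sign in windows-1252.
    private static final byte CP_1252_EURO = (byte) 0x80;

    private CharsetFixer() {
    }

    /** Same conversion as in the filter: map the Euro byte 0xa4 to 0x80. */
    public static String fixCharset(String charsBeforeConvert) {
        // Characters with no ISO-8859-15 byte are replaced by '?' here.
        byte[] characters = charsBeforeConvert.getBytes(ISO_8859_15);
        for (int i = 0; i < characters.length; i++) {
            if (characters[i] == ISO_8859_15_EURO) {
                characters[i] = CP_1252_EURO;
            }
        }
        return new String(characters, StandardCharsets.ISO_8859_1);
    }

    /** Converts every value of every parameter, leaving the input map untouched. */
    public static Map<String, String[]> fixAll(Map<String, String[]> params) {
        Map<String, String[]> fixed = new LinkedHashMap<>();
        for (Map.Entry<String, String[]> entry : params.entrySet()) {
            String[] values = entry.getValue();
            String[] converted = new String[values.length];
            for (int i = 0; i < values.length; i++) {
                converted[i] = fixCharset(values[i]);
            }
            fixed.put(entry.getKey(), converted);
        }
        return fixed;
    }
}
```

With a helper like this, getParameterMap in the request wrapper would only need to call fixAll once and cache the result. Because the values are copied, getParameter and getParameterValues would also have to read from the converted map rather than delegate to the wrapped request.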
Thanks

    import com.google.common.collect.Maps;

    import javax.servlet.*;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletRequestWrapper;
    import java.io.IOException;
    import java.io.UnsupportedEncodingException;
    import java.util.Map;

    /**
     * Filter to fix utf-8 to iso-8859-1 conversion
     *
     * @author Asgaut Mjolne
     * @version $Revision: 1.6 $, 05.feb.2008, modified by: $Author: fiasmjol
     */
    public class CharsetEncodingFilter implements Filter {

        @Override
        public void init(FilterConfig filterConfig) throws ServletException {
        }

        @Override
        public void doFilter(ServletRequest servletRequest, ServletResponse servletResponse,
                             FilterChain filterChain) throws IOException, ServletException {
            HttpServletRequest req = (HttpServletRequest) servletRequest;
            if ("utf-8".equalsIgnoreCase(req.getCharacterEncoding())) {
                req = new CharsetRequestWrapper(req);
                req.getParameter("foo"); //Needed to fill params. Do not remove
            }
            filterChain.doFilter(req, servletResponse);
        }

        @Override
        public void destroy() {
        }

        static class CharsetRequestWrapper extends HttpServletRequestWrapper {

            private static final byte ISO_8859_15_EURO_CODE_POINT = (byte) 0xa4;

            /**
             * Not in use in ISO-8859-1
             */
            private static final byte CP_1252_EURO_CODE_POINT = (byte) 0x80;

            public CharsetRequestWrapper(HttpServletRequest httpServletRequest) {
                super(httpServletRequest);
            }

            @Override
            public String getParameter(String s) {
                return super.getParameter(s);
            }

            Map<String, String[]> iso88591EncodedParams = null;

            /**
             * Looping through all parameters on the request, checking for special characters.
             * If any found, convert them with the fixCharset method
             */
            @Override
            public Map<String, String[]> getParameterMap() {
                if (iso88591EncodedParams == null) {
                    iso88591EncodedParams = Maps.newHashMap();
                    Map<String, String[]> params = super.getParameterMap();
                    for (String key : params.keySet()) {
                        String[] values = params.get(key);
                        for (int j = 0; j < values.length; j++) {
                            values[j] = fixCharset(values[j]);
                        }
                        iso88591EncodedParams.put(key, values);
                    }
                }
                return iso88591EncodedParams;
            }

            /**
             * Converting special chars from utf-8 to iso-8859-1
             * Add more conversions here when needed
             */
            static String fixCharset(String charsBeforeConvert) {
                try {
                    byte[] characters = charsBeforeConvert.getBytes("iso-8859-15");
                    for (int i = 0; i < characters.length; i++) {
                        if (characters[i] == ISO_8859_15_EURO_CODE_POINT) {
                            characters[i] = CP_1252_EURO_CODE_POINT;
                        }
                    }
                    return new String(characters, "iso-8859-1");
                } catch (UnsupportedEncodingException e) {
                    return charsBeforeConvert;
                }
            }

            @Override
            public String[] getParameterValues(String s) {
                return super.getParameterValues(s);
            }
        }
    }

Laurie Harper wrote:
>
> Asgaut wrote:
>> I have recently been struggling with a utf-8 to ISO-8859-1 problem with Ajax
>> and Struts2.
>>
>> The problem is basically that our application requires iso-8859-1 characters
>> and Ajax is configured to only post utf-8 (ajax is utf-8 either way, can not
>> be changed). So some kind of conversion has to take place at some level.
>>
>> My problem can be divided into two parts:
>> 1. Make Struts2 understand that there is an incoming utf-8 POST, even though
>> struts.xml (which sets the struts2 default encoding) is configured to use
>> iso-8859-1
>> 2. Convert the characters from utf-8 to iso-8859-1
>
> 3. Change your default encoding to utf-8, which should have no effect on
> any of your code but will allow greater flexibility in the range of
> characters you can display and read. Is there any reason you must use
> iso-8859-1?
>
>> [...]
>>
>> If you take a look at this piece of code, you can see that it overrides the
>> encoding if it is set as defaultEncoding (from struts.xml). This is OK, the
>> problem is this check:
>>     if (encoding != null) {
>>         try {
>>             request.setCharacterEncoding(encoding);
>>         } catch (Exception e) {
>>             LOG.error("Error setting character encoding to '" + encoding
>>                     + "' - ignoring.", e);
>>         }
>>     }
>>
>> I think the correct thing would be to also check whether
>> request.getCharacterEncoding() was already set. It should look like this:
>>     if (encoding != null && request.getCharacterEncoding() == null) {
>>         try {
>>             request.setCharacterEncoding(encoding);
>>         } catch (Exception e) {
>>             LOG.error("Error setting character encoding to '" + encoding
>>                     + "' - ignoring.", e);
>>         }
>>     }
>> With this change utf-8 would be kept as the request character encoding and I
>> could do my conversion in my interceptor.
>> This would solve my problem number 1. Am I correct when I say this is a bug?
>
> I don't know if I'd call that a bug, but it does seem like a reasonable
> enhancement. It would probably require some testing with different
> browsers to make sure getCharacterEncoding() really is returning null in
> the 'normal' cases, but assuming that's true you could open a ticket in
> JIRA and attach a patch.
>
>> The way I went around it was to create a filter which is executed before
>> FilterDispatcher in struts2. In this filter I check if it is a utf-8 post
>> and if it is, I wrap the HttpServletRequest into my own
>> CharsetRequestWrapper. In my wrapper I override getParameterMap, which
>> converts my characters, puts them back into the map and returns them. I also
>> run a req.getParameter("foo"); after my wrapping to populate the parameters
>> on the request.
>>
>> It works, but it took me a couple of days to work it out.
>>
>> Any comments on this?
>
> It might be simpler for your filter to call
> setCharacterEncoding("utf-8") and use a trivial request wrapper that
> delegates all calls to the wrapped request *except*
> setCharacterEncoding(), making that a no-op. It would make it clearer
> what the filter was actually doing with less code :-) Otherwise, seems
> like a reasonable work-around.
>
> L.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

--
View this message in context: http://www.nabble.com/CharacterEncoding-bug-in-Struts2--tp15408328p15497775.html
Sent from the Struts - User mailing list archive at Nabble.com.