On Fri, Feb 25, 2011 at 5:12 PM, Felix Meschberger <[email protected]> wrote:
> Hi,
>
> The problem is that browsers tend to not tell the character encoding
> used when posting data ... Don't ask me why ;-)
>
> So we have to do guessing, something I really do not like.
>
> But it looks like browsers send POST data in the same encoding as the
> form was received as. So if the form is received as UTF-8 encoded,
> browsers send back encoded in UTF-8.
>
> Now, how does Sling know what encoding has been used to send the form ?
> Short answer: It cannot know.
>
> Hence the _charset_ request parameter.
>
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
>
>   * We stick with the _charset_ parameter. Whatever that parameter
>     conveys is used to decode parameters.
>   * If the parameter does not exist, we support a new configuration
>     option defining the default encoding to be used.
>   * If the configuration option is also missing, we default to the
>     same value as we do today; which is ISO-8859-1
>
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
>
> Would that help your case ?
That would be perfect!

> On Wednesday, 20.10.2010, at 14:05 -0400, sam lee wrote:
>> according to:
>> http://download.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getCharacterEncoding%28%29
>> request.getCharacterEncoding() should return "the name of the character
>> encoding used in the body of this request.".
>>
>> But request.getCharacterEncoding() always seems to return ISO-8859-1.
>> For example, my html.jsp looks like:
>> <%@ page language="java" contentType="text/html; charset=UTF-8"
>>     pageEncoding="UTF-8"%>
>> ...
>> <form method="POST" action="/some/path"
>>     accept-charset="utf-8"
>>     enctype="application/x-www-form-urlencoded; charset=utf-8">
>>     <input type="hidden" name="_charset_" value="UTF-8" />
>>     <input type="submit" value="Save" />
>> ...
>>
>> Then I would expect request.getCharacterEncoding() (from POST.jsp) to
>> return "UTF-8". But it still returns "ISO-8859-1".
>>
>> Is this intended?
>>
>> From the Sling documentation:
>> http://sling.apache.org/site/request-parameters.html#RequestParameters-CharacterEncoding
>> I don't get this part: "This identity transformation happens to generate
>> strings as the original data was generated with ISO-8859-1 encoding."
>>
>> As long as I set _charset_ to the encoding of the rendered page (with
>> <form>), I don't have a problem. But, I was wondering if
>> .getCharacterEncoding() should be set to whatever the request body was
>> encoded as, not what Sling used to perform the "identity transform" with.
>>
>> Also, wouldn't it be better if, when _charset_ is missing from the
>> request, it's automatically set to the request body encoding? Or do
>> browsers not send request body encoding information?
>>
>> Thanks.
>> Sam
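
For reference, the three-step fallback proposed in this thread (the
_charset_ parameter first, then a configured default, then ISO-8859-1)
could be sketched roughly as below. This is only an illustration, not
Sling's actual code: the class name, the method names and the idea of
passing the configured default in as a plain string are all assumptions.
Treating an unknown charset name the same as a missing one is also an
assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Sling.
class ParameterEncodingResolver {

    /**
     * Picks the charset used to decode request parameters.
     *
     * @param charsetParam      value of the _charset_ request parameter, or null
     * @param configuredDefault value of the (proposed) configuration option, or null
     */
    static Charset resolve(String charsetParam, String configuredDefault) {
        // 1. The _charset_ parameter wins if present and usable.
        Charset fromRequest = lookup(charsetParam);
        if (fromRequest != null) {
            return fromRequest;
        }
        // 2. Otherwise use the configured default, if one is set.
        Charset fromConfig = lookup(configuredDefault);
        if (fromConfig != null) {
            return fromConfig;
        }
        // 3. Otherwise keep today's behaviour.
        return StandardCharsets.ISO_8859_1;
    }

    // Assumption: blank or unknown names are treated as "not set".
    private static Charset lookup(String name) {
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException and UnsupportedCharsetException
            // are both subclasses of IllegalArgumentException.
            return null;
        }
    }
}
```

With this sketch, resolve("UTF-8", null) yields UTF-8, resolve(null,
"UTF-8") yields UTF-8 as well, and resolve(null, null) stays at
ISO-8859-1, which matches the backwards-compatible default described
above.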

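
On the "identity transformation" sentence from the Sling documentation
quoted above: ISO-8859-1 maps every byte value 0..255 to the character
with the same code point, so decoding raw bytes with it loses nothing.
The original bytes can be recovered later and decoded again with the
real encoding once it is known (for example from _charset_). The sketch
below only demonstrates that round trip in plain Java; the class and
method names are made up, and it does not claim to mirror how Sling
implements this internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Sling.
class IdentityTransformDemo {

    /**
     * Takes a string that was decoded as ISO-8859-1, recovers the
     * original bytes, and decodes them with the actual charset.
     */
    static String redecode(String decodedAsLatin1, Charset actual) {
        byte[] raw = decodedAsLatin1.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe"; // "Gruesse" written with u-umlaut and sharp s

        // What a browser would send for a UTF-8 form.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // What the server sees if it decodes those bytes as ISO-8859-1.
        String asLatin1 = new String(wire, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.equals(original)); // false

        // Re-decoding with the real charset restores the text.
        String fixed = redecode(asLatin1, StandardCharsets.UTF_8);
        System.out.println(fixed.equals(original)); // true
    }
}
```

This is also why a missing or wrong _charset_ shows up as mangled
non-ASCII characters rather than as an error: the bytes are intact, they
are just interpreted with the wrong encoding.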