On Fri, Feb 25, 2011 at 5:12 PM, Felix Meschberger <[email protected]> wrote:
> Hi,
>
> The problem is that browsers tend to not tell the character encoding
> used when posting data ... Don't ask me why ;-)
>
> So we have to do guessing, something I really do not like.
>
> But it looks like browsers send POST data in the same encoding as the
> page containing the form was served in. So if the form is served as
> UTF-8, browsers send the data back encoded in UTF-8.
>
> Now, how does Sling know what encoding was used to submit the form?
> Short answer: It cannot know.
>
> Hence the _charset_ request parameter.
>
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
>
>  * We stick with the _charset_ parameter. Whatever that parameter
>    conveys is used to decode parameters.
>  * If the parameter does not exist, we support a new configuration
>    option defining the default encoding to be used.
>  * If the configuration option is also missing, we default to the
>    same value as we do today, which is ISO-8859-1.
>
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
>
> Would that help your case?

That would be perfect!
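
Just to check that I read the proposed order correctly, here is a minimal
sketch of the fallback in plain Java. The class and method names
(FormEncodingResolver, resolve) are made up for illustration and are not
Sling API. Treating a blank or unknown charset name as "missing" is my own
assumption; the proposal does not say what should happen in that case.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Sling: picks the charset used to decode form parameters.
class FormEncodingResolver {

    // Today's behaviour when nothing else is known.
    static final Charset LEGACY_DEFAULT = StandardCharsets.ISO_8859_1;

    // charsetParam: value of the _charset_ request parameter, or null if absent.
    // configuredDefault: value of the proposed configuration option, or null if unset.
    static Charset resolve(String charsetParam, String configuredDefault) {
        Charset fromRequest = lookup(charsetParam);
        if (fromRequest != null) {
            return fromRequest;          // 1. _charset_ wins when present
        }
        Charset fromConfig = lookup(configuredDefault);
        if (fromConfig != null) {
            return fromConfig;           // 2. otherwise the configured default
        }
        return LEGACY_DEFAULT;           // 3. otherwise ISO-8859-1, as today
    }

    // Assumption: blank, malformed, or unsupported names count as "missing".
    private static Charset lookup(String name) {
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            // Covers IllegalCharsetNameException and UnsupportedCharsetException.
            return null;
        }
    }
}
```

With the option unset, resolve(null, null) stays at ISO-8859-1, so existing
installations would behave as before.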



> On Wednesday, 20.10.2010, 14:05 -0400, sam lee wrote:
>> according to:
>> http://download.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getCharacterEncoding%28%29
>> request.getCharacterEncoding() should return "the name of the character
>> encoding used in the body of this request."
>>
>> But request.getCharacterEncoding() always seems to return ISO-8859-1.
>> For example, my html.jsp looks like:
>> <%@ page language="java" contentType="text/html; charset=UTF-8"
>>     pageEncoding="UTF-8"%>
>> ...
>> <form method="POST" action="/some/path"
>>     accept-charset="utf-8"
>>     enctype="application/x-www-form-urlencoded; charset=utf-8">
>>     <input type="hidden" name="_charset_" value="UTF-8" />
>>     <input type="submit" value="Save" />
>> ...
>>
>> Then I would expect request.getCharacterEncoding() (from POST.jsp) to
>> return "UTF-8". But it still returns "ISO-8859-1".
>>
>> Is this intended?
>>
>> From the Sling documentation:
>> http://sling.apache.org/site/request-parameters.html#RequestParameters-CharacterEncoding
>> I don't get this part: "This identity transformation happens to generate
>> strings as the original data was generated with ISO-8859-1 encoding."
>>
>> As long as I set _charset_ to the encoding of the rendered page (with
>> <form>), I don't have a problem. But, I was wondering if
>> .getCharacterEncoding() should be set to whatever request body was encoded
>> as, not what sling used to perform "identity transform" with.
>>
>> Also, if _charset_ is missing from the request, wouldn't it be better to
>> set it automatically to the request body encoding? Or do browsers not send
>> request body encoding information?
>>
>> Thanks.
>> Sam
>
>
>
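
On the "identity transformation" sentence quoted above: as far as I
understand it, decoding bytes as ISO-8859-1 maps every byte to exactly one
char, so the raw bytes can be recovered later and decoded again once the
real encoding is known. Below is a small sketch of that idea in plain Java.
Latin1RoundTrip and redecode are names I made up; this is not Sling's code
and may not match its implementation in detail.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of recovering text that was first decoded as ISO-8859-1.
class Latin1RoundTrip {

    // latin1Decoded: a parameter value that was decoded as ISO-8859-1.
    // actual: the encoding the browser really used (e.g. taken from _charset_).
    static String redecode(String latin1Decoded, Charset actual) {
        // ISO-8859-1 maps bytes 0x00..0xFF to U+0000..U+00FF one to one,
        // so this step gives back the original request bytes unchanged.
        byte[] raw = latin1Decoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode the same bytes again, this time with the real encoding.
        return new String(raw, actual);
    }
}
```

For example, "é" sent as UTF-8 is the two bytes 0xC3 0xA9. Read as
ISO-8859-1 they show up as "Ã©", and redecode("\u00C3\u00A9",
StandardCharsets.UTF_8) turns that back into "é".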
