request.getCharacterEncoding() always returns ISO-8859-1

2010-10-20 Thread sam lee
according to:
http://download.oracle.com/javaee/6/api/javax/servlet/ServletRequest.html#getCharacterEncoding%28%29
request.getCharacterEncoding() should return "the name of the character
encoding used in the body of this request".

But request.getCharacterEncoding() always seems to return ISO-8859-1.
For example, my html.jsp looks like:
<%@ page language="java" contentType="text/html; charset=UTF-8"
pageEncoding="UTF-8"%>
...



...

Then I would expect request.getCharacterEncoding()  (from POST.jsp) to
return "UTF-8". But it still returns "ISO-8859-1".

Is this intended?

From the Sling documentation:
http://sling.apache.org/site/request-parameters.html#RequestParameters-CharacterEncoding
I don't get this part: "This identity transformation happens to generate
strings as the original data was generated with ISO-8859-1 encoding."

As long as I set _charset_ to the encoding of the rendered page (with
a form input), I don't have a problem. But I was wondering whether
request.getCharacterEncoding() should return whatever the request body was
encoded as, not what Sling used to perform the "identity transformation".
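
To illustrate the "identity transformation" quoted above: ISO-8859-1 maps every byte to exactly one character, so a string decoded that way still carries the original bytes and can be decoded again once the real encoding is known. A minimal sketch using only the JDK; the class and method names are made up for illustration and are not Sling API:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class IdentityTransformDemo {

    // Takes a string that was decoded as ISO-8859-1, recovers the raw bytes
    // (lossless, because ISO-8859-1 maps bytes 0x00-0xFF one-to-one to chars)
    // and decodes them again with the encoding that was actually used.
    static String redecode(String latin1Decoded, Charset actual) {
        byte[] raw = latin1Decoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe";                        // "Grüße"
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);    // what a UTF-8 page would submit
        String asLatin1 = new String(wire, StandardCharsets.ISO_8859_1);
        String recovered = redecode(asLatin1, StandardCharsets.UTF_8);
        System.out.println(recovered.equals(original));
    }
}
```

The intermediate string looks garbled when printed, but no information is lost, which is why the decision about the real encoding can be deferred until _charset_ has been read.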

Also, if _charset_ is missing from the request, wouldn't it be better to
set it automatically to the request body encoding? Or, browsers don't send
request body encoding information?

Thanks.
Sam


Re: request.getCharacterEncoding() always returns ISO-8859-1

2010-10-21 Thread Alexander Klimetschek
On Wed, Oct 20, 2010 at 20:05, sam lee  wrote:
>     accept-charset="utf-8"
>    enctype="application/x-www-form-urlencoded; charset=utf-8">
>    
>    
> ...
>
> Then I would expect request.getCharacterEncoding()  (from POST.jsp) to
> return "UTF-8". But it still returns "ISO-8859-1".

Have you validated what is actually sent in the HTTP request?

> Or, browsers don't send request body encoding information?

Browsers don't send it (at least not reliably). Simply always use
_charset_; it is the most stable solution you can find.

Regards,
Alex

-- 
Alexander Klimetschek
alexander.klimetsc...@day.com


Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-11 Thread Peter Dotchev

Hi,

Recently I stumbled over this issue too:
http://dotev.blogspot.com/2011/02/posting-non-ascii-characters-in-web.html

I don't want to add _charset_ input to all the forms.
Is there a way to set the request encoding to UTF-8?
IMHO it would be better if the request encoding were configurable, as is
done in Wicket.

The Tomcat FAQ (http://wiki.apache.org/tomcat/FAQ/CharacterEncoding) suggests
using a filter.
How can I do that in Sling? 
I start Sling with launchpad standalone.

Best regards,
Peter



Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-11 Thread Vidar Ramdal
On Thu, Feb 10, 2011 at 11:56 PM, Peter Dotchev  wrote:
>
> Hi,
>
> Recently
> http://dotev.blogspot.com/2011/02/posting-non-ascii-characters-in-web.html I
> stumbled  over this issue too.
>
> I don't want to add _charset_ input to all the forms.
> Is there a way to set the request encoding to UTF-8?
> IMHO it would be better if the request encoding is configurable like it is
> done in Wicket.
>
> http://wiki.apache.org/tomcat/FAQ/CharacterEncoding Tomcat FAQ  suggests
> using a filter.
> How can I do that in Sling?

Hi, you can implement the javax.servlet.Filter interface and register
your implementation as a Filter service:

@Component(immediate = true)
@Properties({
    @Property(name = "filter.scope", value = "request", propertyPrivate = true),
    @Property(name = "filter.order", value = "-9", propertyPrivate = true)
})
@Services({@Service(javax.servlet.Filter.class)})
public class YourFilter implements javax.servlet.Filter {
    ...
}



-- 
Vidar S. Ramdal  - http://www.idium.no
Sommerrogata 13-15, N-0255 Oslo, Norway
+ 47 22 00 84 00
Quando omni flunkus moritatus!


Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-13 Thread Peter Dotchev

Hi Vidar,

Thank you for the suggestion. I will try it.
I have now found that request filters are described here:
http://sling.apache.org/site/filters.html

Best regards,
Peter



Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-24 Thread Markus Joschko
Hi Vidar and all,
I don't think this approach works.
I have exactly the same use case: we don't want to put the _charset_
parameter into every form. I therefore tried to add a _charset_ request
parameter to the request automatically in a filter.
However, that parameter never gets picked up, because ParameterSupport is
created before the filter gets called (and then uses the reference to
the original servlet request, not the wrapped one).
I could theoretically reinstantiate ParameterSupport, but that requires
knowing the attribute key under which ParameterSupport is
stored, and that is a private variable in ParameterSupport.

Is there any other way to avoid having the _charset_ parameter in every
POST request sent to the system?

Thanks,
 Markus



Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-24 Thread Julian Sedding
Hello Markus

You can register servlet filters in Sling, but you can also register
servlet filters in the OSGi HttpService using the whiteboard[0]. As
the SlingMainServlet runs within the OSGi HttpService, you may have
more success registering a filter there. I don't know where in the
stack ParameterSupport is added, so YMMV. Let us know whether this
does the trick.

Regards
Julian

[0] 
http://felix.apache.org/site/apache-felix-http-service.html#ApacheFelixHTTPService-UsingtheWhiteboard





Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Ian Boston
Markus,
Side question, but related.
I have tried in the past to register a Filter before the SlingMain Servlet,
which also has Filter registration functionality (presumably from a time when
the OSGi HttpService didn't support filters). However, I was always stuck
locating the HttpContext (the Pax HttpService impl needs a real
HttpContext class).

Is the Context ID mentioned in [0], the service ID of the service that 
implements the HttpContext? (which would be SlingMain as that implements 
HttpContext and registers against itself)

and

Does registering against a null context make a Filter active on all contexts?

TIA
Ian






Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Markus Joschko
Hi Julian,

> You can register servlet filters in Sling, but you can also register
> servlet filters in the OSGi HttpService using the whiteboard[0]. As
> the SlingMainServlet runs within the OSGi HttpService, you may have
> more success registering a filter there. I don't know where in the
> stack ParameterSupport is added, so YMMV. Let us know whether this
> does the trick.

Thanks. Using that filter chain does the trick as the filter is added
before any sling processing kicks in.

Nevertheless, I wonder why it is necessary to include a mandatory
parameter that always has the same value.
Can't this be done by Sling?

Regards,
 Markus



Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Markus Joschko
Hi Ian,
> Side question, but related.
> I have tried in the past to register a Filter before the SlingMain Servlet 
> which also has Filter registration functionality (presumably from a time when 
> the OSGi HttpService didnt support filters), however I was always stuck 
> locating the HttpContext. (the Pax HttpService impl needs the a real 
> HttpContext class)
>
> Is the Context ID mentioned in [0], the service ID of the service that 
> implements the HttpContext? (which would be SlingMain as that implements 
> HttpContext and registers against itself)

I would guess so. The only ServletContext service I can quickly
identify in the console is the one from Sling with the PID
org.apache.sling.engine.impl.helper.SlingServletContext.
Using that as a contextId does not change anything. The ServletContext
I get is of type
org.apache.felix.http.base.internal.context.ServletContextImpl.

>
> and
>
> Does registering against a null context make a Filter active on all contexts?

Admittedly, I am not sure where to look to verify that.
I see no difference between using the above-mentioned service PID as
contextId and using no contextId.

Regards,
 Markus





Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Alexander Klimetschek
On 25.02.11 11:43, "Markus Joschko"  wrote:
>Nevertheless I wonder why it is necessary to include a mandatory
>parameter with always the same value.
>Can't this be done by sling?

Yes, but it must match the encoding of the response to the previous request
(i.e. the HTML containing the form), because that is the character encoding
browsers will use to encode the submitted form data. The only problem is
that they don't state it explicitly in the request.

So it depends on which character encoding the servlet or JSP used to build
the HTML in the first place. If you make the encoding "global" on the input
side, you could run into issues, because you can't easily make UTF-8 output
a global setting for all servlets/JSPs. That's why the typical approach
settled on explicitly including the _charset_ parameter: it is written by
the same code that sets the response output encoding.
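
A small sketch of why the two sides have to agree, assuming a form field containing "é" and using only the JDK's URLEncoder/URLDecoder. The class and method names are hypothetical and not part of Sling; real browsers do more than this, but the byte-level effect is the same:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

final class FormEncodingDemo {

    // Roughly what a browser does on submit: percent-encode the value
    // using the charset of the page that contained the form.
    static String submit(String value, String pageCharset) {
        try {
            return URLEncoder.encode(value, pageCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // What the server does: decode the body with the charset it assumes.
    static String parse(String encoded, String assumedCharset) {
        try {
            return URLDecoder.decode(encoded, assumedCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String value = "\u00e9";                              // "é"
        String fromUtf8Page = submit(value, "UTF-8");         // %C3%A9
        String fromLatin1Page = submit(value, "ISO-8859-1");  // %E9
        System.out.println(fromUtf8Page + " vs " + fromLatin1Page);

        // Same bytes on the wire, different result depending on the assumption.
        System.out.println(parse(fromUtf8Page, "UTF-8").equals(value));
        System.out.println(parse(fromUtf8Page, "ISO-8859-1").equals(value));
    }
}
```

Nothing in the encoded body says which charset produced it, which is the gap the _charset_ parameter fills.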

Regards,
Alex

-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel






Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Markus Joschko
On Fri, Feb 25, 2011 at 3:37 PM, Alexander Klimetschek
 wrote:
> On 25.02.11 11:43, "Markus Joschko"  wrote:
>>Nevertheless I wonder why it is necessary to include a mandatory
>>parameter with always the same value.
>>Can't this be done by sling?
>
> Yes, but it must match the value of the response of the previous request
> (i.e. the html containing the form). Because this is the character
> encoding that all browsers will use to construct the form - the problem is
> only that they don't explicitly mention that in the request.
>
> So it depends what character encoding the servlet or jsp is using to build
> the html in the first place, hence by making it "global" on the input side
> (you can't easily make output in utf-8 a global setting for all
> servlets/jsps), you could run into issues. That's why the typical approach
> settled on explicitly including the _charset_ parameter - because that one
> is written in the code that is also setting the response output encoding.

I'm not sure I understand this correctly. I (the application developer)
know which encoding I chose for the HTML pages (otherwise I could not
write the _charset_ field anyway).

If I choose the same encoding for all my forms, I could easily set a
default/fallback encoding which Sling
can use when no _charset_ field is explicitly provided.
That seems much more convenient than always writing the field or
adding a filter that does that.


Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread James Stansell
On Fri, Feb 25, 2011 at 9:09 AM, Markus Joschko wrote:

>
> Not sure if I get this correctly. ...
> If I choose the same encoding for all my forms I could easily set
> default/fallback encoding which sling
> can use when no _charset_ field is explicitly provided.
>

This phrase comes to mind: convention over configuration

These days UTF-8 seems like a reasonable convention to take advantage of.

Sincerely,

-james.


Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Alexander Klimetschek
On 25.02.11 16:41, "James Stansell"  wrote:
>These days UTF-8 seems like a reasonable convention to take advantage of.

Yes, but AFAIU, from a standards perspective ISO-8859-1 is the default
"fallback", especially if you consider requests coming from other clients
that were not provided by your web application itself as HTML pages.

Regards,
Alex

-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel






Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Felix Meschberger
Hi,

On Friday, 25.02.2011, at 08:52 +, Ian Boston wrote:
> Markus,
> Side question, but related.
> I have tried in the past to register a Filter before the SlingMain Servlet 
> which also has Filter registration functionality (presumably from a time when 
> the OSGi HttpService didnt support filters), however I was always stuck 
> locating the HttpContext. (the Pax HttpService impl needs the a real 
> HttpContext class)
> 
> Is the Context ID mentioned in [0], the service ID of the service that 
> implements the HttpContext? (which would be SlingMain as that implements 
> HttpContext and registers against itself)
> 
> and
> 
> Does registering against a null context make a Filter active on all contexts?

The Felix Http Service has whiteboard pattern support for servlet Filter
registration. So you just register your filter as a javax.servlet.Filter
service with a pattern service property and you should be done.

Regards
Felix





Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Felix Meschberger
Hi,

The problem is that browsers tend to not tell the character encoding
used when posting data ... Don't ask me why ;-)

So we have to guess, which is something I really do not like.

But it looks like browsers send POST data in the same encoding as the
form was received as. So if the form is received as UTF-8 encoded,
browsers send back encoded in UTF-8.

Now, how does Sling know what encoding has been used to send the form ?
Short answer: It cannot know.

Hence the _charset_ request parameter.

But listening to our clients and users and understanding that most of
the time UTF-8 is used anyway, how about this solution:

  * We stick with the _charset_ parameter. Whatever that parameter
    conveys is used to decode parameters.
  * If the parameter does not exist, we support a new configuration
    option defining the default encoding to be used.
  * If the configuration option is also missing, we default to the
    same value as we do today, which is ISO-8859-1.

Of course the configuration option would not be set by default (for
backwards compatibility reasons).
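
The lookup order proposed above could be sketched roughly like this. This is only an illustration using the JDK: the class name, method names and parameters are hypothetical, this is not Sling's implementation, and treating an unknown charset name as "missing" is an assumption of the sketch:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetFallback {

    // Order: _charset_ request parameter, then configured default,
    // then ISO-8859-1 (today's behaviour).
    static Charset resolve(String charsetParam, String configuredDefault) {
        Charset fromParam = lookup(charsetParam);
        if (fromParam != null) {
            return fromParam;
        }
        Charset fromConfig = lookup(configuredDefault);
        if (fromConfig != null) {
            return fromConfig;
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Returns null for missing, blank, illegal or unsupported names
    // (assumption: such values are treated as if they were not set).
    private static Charset lookup(String name) {
        if (name == null || name.trim().isEmpty()) {
            return null;
        }
        try {
            return Charset.forName(name.trim());
        } catch (IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve("UTF-8", null));   // parameter wins
        System.out.println(resolve(null, "UTF-8"));   // configured default
        System.out.println(resolve(null, null));      // backwards-compatible fallback
    }
}
```

With the configuration option left unset, the last branch is always taken when _charset_ is absent, so existing installations would behave as before.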

Would that help your case ?

Regards
Felix





Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Alexander Klimetschek
On 25.02.11 17:12, "Felix Meschberger"  wrote:
>But listening to our clients and users and understanding that most of
>the time UTF-8 is used anyway, how about this solution:
>
>  * We stick with the _charset_ parameter. Whatever that parameter
>conveys is used to decode parameters.
>  * If the parameter does not exist, we support a new configuration
>option defining the default encoding to be used.
>  * If the configuration option is also missing, we default to the
>same value as we do today; which is ISO-8859-1
>
>Of course the configuration option would not be set by default (for
>backwards compatibility reasons).

Sounds good!

Regards,
Alex

-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel






Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Markus Joschko
On Fri, Feb 25, 2011 at 5:12 PM, Felix Meschberger wrote:
> Hi,
>
> The problem is that browsers tend to not tell the character encoding
> used when posting data ... Don't ask me why ;-)
>
> So we have to do guessing, something I really do not like.
>
> But it looks like browsers send POST data in the same encoding as the
> form was received as. So if the form is received as UTF-8 encoded,
> browsers send back encoded in UTF-8.
>
> Now, how does Sling know what encoding has been used to send the form ?
> Short answer: It cannot know.
>
> Hence the _charset_ request parameter.
>
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
>
>  * We stick with the _charset_ parameter. Whatever that parameter
>    conveys is used to decode parameters.
>  * If the parameter does not exist, we support a new configuration
>    option defining the default encoding to be used.
>  * If the configuration option is also missing, we default to the
>    same value as we do today; which is ISO-8859-1
>
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
>
> Would that help your case ?

That would be perfect!





Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-25 Thread Ian Boston

On 25 Feb 2011, at 16:06, Felix Meschberger wrote:

> The Felix Http Service has whiteboard pattern support for servlet Filter
> registration. So you just register your filter as a javax.servlet.Filter
> service with an alias service property and you should be done.
> 
> Regards
> Felix


Cool, thanks, that makes sense.
I think it wasn't working because we were still using the Pax Web HTTP
service implementation with the Felix whiteboard; we are now using the
Felix HttpService.
Ian

Re: request.getCharacterEncoding() always returns ISO-8859-1

2011-02-28 Thread Felix Meschberger
Hi,

I have implemented this support in trunk (see SLING-1998 [1]) and
described it on the Request Parameter Handling page [2].

Regards
Felix

[1] https://issues.apache.org/jira/browse/SLING-1998
[2] http://sling.apache.org/site/request-parameters.html
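
For readers puzzled by the "identity transformation" wording raised earlier
in the thread, the following sketch shows the general technique in plain
Java. It is an illustration only, not code taken from Sling; the class name
IdentityRedecode and the method redecode() are made up for this example.
Decoding bytes as ISO-8859-1 maps every byte to the character with the same
numeric value, so the raw bytes can be recovered and decoded again once the
real encoding is known.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Sling.
class IdentityRedecode {

    // Takes a string that was decoded as ISO-8859-1 and decodes the
    // underlying bytes again with the encoding the client actually used.
    static String redecode(String isoDecoded, Charset actual) {
        // ISO-8859-1 is a one-to-one byte/char mapping, so this
        // recovers exactly the bytes that came over the wire.
        byte[] raw = isoDecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, actual);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe";

        // What a browser sends when the form page was served as UTF-8.
        byte[] wire = original.getBytes(StandardCharsets.UTF_8);

        // What the server sees when it decodes with the ISO-8859-1 default.
        String garbled = new String(wire, StandardCharsets.ISO_8859_1);

        // Once _charset_ (or a configured default) says UTF-8, re-decode.
        String repaired = redecode(garbled, StandardCharsets.UTF_8);

        System.out.println(repaired.equals(original));
    }
}
```

This is why decoding with ISO-8859-1 first loses nothing: the original
bytes stay recoverable until the correct encoding is known.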

On Friday, 25.02.2011, at 16:12, Felix Meschberger wrote:
> Hi,
> 
> The problem is that browsers tend to not tell the character encoding
> used when posting data ... Don't ask me why ;-)
> 
> So we have to do guessing, something I really do not like.
> 
> But it looks like browsers send POST data in the same encoding as the
> form was received as. So if the form is received as UTF-8 encoded,
> browsers send back encoded in UTF-8.
> 
> Now, how does Sling know what encoding has been used to send the form ?
> Short answer: It cannot know.
> 
> Hence the _charset_ request parameter.
> 
> But listening to our clients and users and understanding that most of
> the time UTF-8 is used anyway, how about this solution:
> 
>   * We stick with the _charset_ parameter. Whatever that parameter
> conveys is used to decode parameters.
>   * If the parameter does not exist, we support a new configuration
> option defining the default encoding to be used.
>   * If the configuration option is also missing, we default to the
> same value as we do today; which is ISO-8859-1
> 
> Of course the configuration option would not be set by default (for
> backwards compatibility reasons).
> 
> Would that help your case ?
> 
> Regards
> Felix