On 2/1/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:

: > should we add:
: >  request.setCharacterEncoding ("utf-8")
: > to GET requests in StandardRequestParser?
:
: Perhaps.  I wonder if there's any performance impact, and if it fixes
: Tomcat's default of latin1 too.

see my comments in the related thread about POST...

http://www.nabble.com/charset-in-POST-from-browser-tf3153057.html#a8744560

...my reading of the servlet spec was that request.setCharacterEncoding
only impacted request *body* data, not the URL.

Yeah, hence I wouldn't do it if it only fixed Resin, but if it fixed
Tomcat too, it would save a lot of people headaches.

According to the javadocs for it, using it also means that if the client
is well behaved and *does* set a charset in the Content-Type it will be
ignored.

Content-Type for a GET?

Solr users should be able to pick their encoding as much as possible -- so
we definitely shouldn't do anything that overrides the charset specified
in the request (if there is one).

Sure.

but we also shouldn't hardcode UTF-8
anywhere if possible ... the default charset should come from some config
-- either the solrconfig or the servlet container's config.

The problem is that one needs to be an expert to figure all this crap out.

Defaulting to UTF-8 in a URL-encoded POST (where browsers refuse to
add charset) seems like a good default, and one that will increase
interop and prevent people from getting backed into a corner later.
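
A minimal sketch of that fallback, assuming a hypothetical helper class
(CharsetFallback, chooseCharset, and decodeParam are illustrative names, not
part of Solr or the servlet API). It honors a charset the client declared
and only falls back to a configured default, such as UTF-8, when none was
sent. It uses only java.net.URLDecoder, so it shows the decoding choice
rather than how any particular container behaves.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper for illustration only; not Solr or servlet API code.
class CharsetFallback {

    // Use the charset the client declared, if any; otherwise the configured default.
    static String chooseCharset(String declared, String configuredDefault) {
        if (declared == null || declared.trim().isEmpty()) {
            return configuredDefault;
        }
        return declared.trim();
    }

    // Decode one URL-encoded parameter value with the chosen charset.
    static String decodeParam(String raw, String declared, String configuredDefault) {
        String charset = chooseCharset(declared, configuredDefault);
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // "caf%C3%A9" is the UTF-8 percent-encoding of "cafe" with an acute e.
        String raw = "caf%C3%A9";

        // No charset declared: the configured default (assumed UTF-8 here) is used.
        System.out.println(decodeParam(raw, null, "UTF-8").length());

        // The same bytes read as latin1 decode to one extra character (mojibake).
        System.out.println(decodeParam(raw, "ISO-8859-1", "UTF-8").length());
    }
}
```

The default is passed in as a parameter rather than hardcoded, so it could
come from solrconfig or from the container's configuration.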

-Yonik
