On 2/1/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: > should we add:
: >   request.setCharacterEncoding("utf-8")
: > to GET requests in StandardRequestParser?
:
: Perhaps.  I wonder if there's any performance impact, and if it fixes
: Tomcat's default of latin1 too.

see my comments in the related thread about POST...

http://www.nabble.com/charset-in-POST-from-browser-tf3153057.html#a8744560

...my reading of the servlet spec was that request.setCharacterEncoding
only impacted request *body* data, not the URL.
Yeah, hence I wouldn't do it if it only fixed Resin, but if it fixed Tomcat too, it would save a lot of people headaches.
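To make the headache concrete, here is a minimal sketch of what happens when the same percent-encoded query value is decoded with the two charsets in question. It uses only java.net.URLDecoder from the JDK; the class and method names are made up for illustration, and this is not how any particular container decodes URLs internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Solr or of any servlet container.
final class QueryDecodingSketch {

    // Decode one percent-encoded query value using the named charset.
    static String decode(String rawValue, String charsetName) {
        try {
            return URLDecoder.decode(rawValue, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // The word "cafe" with an accented e, percent-encoded as UTF-8 bytes.
        String raw = "caf%C3%A9";
        // Decoded as UTF-8, the two bytes C3 A9 become one character.
        System.out.println(decode(raw, "UTF-8"));
        // Decoded as latin1, each byte becomes its own character (mojibake).
        System.out.println(decode(raw, "ISO-8859-1"));
    }
}
```

A client that sends UTF-8 to a container that assumes latin1 gets the second result, which is why the container default matters for GET parameters.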
According to the javadocs for it, using it also means that if the client is well behaved and *does* set a charset in the Content-Type it will be ignored.
Content-Type for a GET?
Solr users should be able to pick their encoding as much as possible -- so we definitely shouldn't do anything that overrides the charset specified in the request (if there is one).
Sure.
but we also shouldn't hardcode UTF-8 anywhere if possible ... the default charset should come from some config -- either the solrconfig or the servlet container's config.
The problem is that one needs to be an expert to figure all this crap out. Defaulting to UTF-8 in a url-encoded POST (where browsers refuse to add charset) seems like a good default, and one that will increase interop and prevent people from getting backed into a corner later.

-Yonik
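A rough sketch of the compromise discussed above: honor a charset if the client put one in the Content-Type header, and otherwise fall back to a default that comes from configuration. The class name, method name, and the idea of passing the default in as a parameter are all assumptions for illustration; nothing here is existing Solr code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper; not part of Solr or the servlet API.
final class RequestCharsetSketch {

    // Return the charset named in the Content-Type header if there is a
    // usable one, otherwise the configured default (never overriding the client).
    static Charset pickCharset(String contentType, Charset configuredDefault) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (!p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    continue;
                }
                String name = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                if (name.isEmpty()) {
                    continue;
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: fall through to the default.
                }
            }
        }
        return configuredDefault;
    }

    public static void main(String[] args) {
        // A browser form POST typically carries no charset parameter.
        System.out.println(pickCharset("application/x-www-form-urlencoded",
                StandardCharsets.UTF_8));
        // A well-behaved client that does declare a charset keeps it.
        System.out.println(pickCharset("application/x-www-form-urlencoded; charset=ISO-8859-1",
                StandardCharsets.UTF_8));
    }
}
```

In a servlet, the equivalent would presumably be to call request.setCharacterEncoding(...) only when request.getCharacterEncoding() returns null, and before any parameter is read; whether that also affects the query string is container-dependent, as noted earlier in the thread.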