[ https://issues.apache.org/jira/browse/SOLR-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12606886#action_12606886 ]
Yonik Seeley commented on SOLR-443:
-----------------------------------

You're right Lars, setting the URIEncoding didn't work for Tomcat.

I checked in a test program: solr/example/exampledocs/test_utf8.sh
It seems that using
{code}Content-Type: application/x-www-form-urlencoded; charset=UTF-8
{code}
works for Jetty, Tomcat (I tested 5.5), and Resin (I tested 3.1).

On a related note, I checked in a fix for distributed faceting refinement to ignore facet.query values that it doesn't know about. It's unfortunate that this will hide the problem (that's why I made the UTF-8 test script), but it seems like the correct thing to do, since another component may add additional request parts.

> POST queries don't declare its charset
> --------------------------------------
>
>                 Key: SOLR-443
>                 URL: https://issues.apache.org/jira/browse/SOLR-443
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 1.2
>         Environment: Tomcat 6.0.14
>            Reporter: Andrew Schurman
>            Priority: Minor
>         Attachments: SOLR-443-multipart.patch, solr-443.patch, solr-443.patch, SolrDispatchFilter.patch
>
>
> When sending a query via POST, the content-type is not set. The content
> charset for the POST parameters is set, but this only appears to be used for
> creating the Content-Length header in the commons library. Since a query is
> encoded in UTF-8, the HTTP headers should also specify the content type charset.
> On Tomcat, this causes problems when the query string contains non-ASCII
> characters (characters with accents and such), as it tries to parse the POST
> body in its default ISO-8859-1. There appears to be no way to set/change the
> default encoding for a message body on Tomcat.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
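
For illustration only, here is a minimal Python sketch of the charset mismatch described in this issue. It is not the Solr client code or the checked-in test script. The helper name `build_post` and the sample query `café` are hypothetical. The sketch builds a form-encoded POST body from UTF-8 text together with the Content-Type header quoted above, then decodes the same body twice: once as UTF-8, and once as ISO-8859-1 to mimic a container falling back to its default encoding.

```python
from urllib.parse import urlencode, parse_qs


def build_post(params):
    """Return (headers, body) for a form POST that declares its charset.

    Hypothetical helper for illustration; not part of Solr or SolrJ.
    """
    # Percent-encode the UTF-8 bytes of each value; the result is pure ASCII.
    body = urlencode(params, encoding="utf-8").encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_post({"q": "café"})

# Decoding with the declared charset recovers the original query.
as_utf8 = parse_qs(body.decode("ascii"), encoding="utf-8")["q"][0]

# Decoding with an ISO-8859-1 default (the failure mode reported above)
# turns each UTF-8 byte into a separate character.
as_latin1 = parse_qs(body.decode("ascii"), encoding="iso-8859-1")["q"][0]

print(body)
print(repr(as_utf8), repr(as_latin1))
```

The body itself is identical in both cases; only the charset the server assumes changes the parsed value, which is why declaring it in the Content-Type header matters.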