[
https://issues.apache.org/jira/browse/SOLR-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12554211
]
Andrew Schurman commented on SOLR-443:
--------------------------------------
Hmm... I just tested the latest patch on a different machine with Tomcat 6.0.14
and it does appear to work (I must have some sort of caching problem on my
other machine).
As for standards, I'm not sure it's still current, but I found the HTML
Internationalization RFC: http://www.ietf.org/rfc/rfc2070.txt. On page 16, it
says that when a charset is set on a content-type of
{{application/x-www-form-urlencoded}}, it should be understood that the "URL encoding
of [RFC1738] is applied on top of the specified character encoding, as a kind
of implicit Content-Transfer-Encoding". So it does seem valid to set the
charset on the POST.
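
To illustrate what "URL encoding applied on top of the specified character
encoding" means in practice, here is a minimal sketch using only the JDK's
java.net.URLEncoder. The class name, helper names and the sample value are
made up for illustration and are not taken from the patch. The same string
yields different POST bodies depending on the charset, which is why the
receiver needs the charset declared in the Content-Type header.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of the SOLR-443 patch.
class FormCharsetDemo {
    // Percent-encode a form value on top of the given character encoding.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Content-Type header value that declares the charset used for the body.
    static String contentType(String charset) {
        return "application/x-www-form-urlencoded; charset=" + charset;
    }

    public static void main(String[] args) {
        String query = "caf\u00e9"; // "cafe" with an acute accent on the e
        System.out.println(encode(query, "UTF-8"));      // caf%C3%A9
        System.out.println(encode(query, "ISO-8859-1")); // caf%E9
        System.out.println(contentType("UTF-8"));
    }
}
```

The two encoded forms differ only in the bytes behind the percent escapes, so
nothing in the body itself tells the server which charset was used.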
> POST queries don't declare its charset
> --------------------------------------
>
> Key: SOLR-443
> URL: https://issues.apache.org/jira/browse/SOLR-443
> Project: Solr
> Issue Type: Bug
> Components: clients - java
> Affects Versions: 1.2
> Environment: Tomcat 6.0.14
> Reporter: Andrew Schurman
> Priority: Minor
> Attachments: solr-443.patch, solr-443.patch
>
>
> When sending a query via POST, the content-type is not set. The content
> charset for the POST parameters is set, but this only appears to be used for
> creating the Content-Length header in the commons library. Since a query is
> encoded in UTF-8, the HTTP headers should also specify the content-type charset.
> On Tomcat, this causes problems when the query string contains non-ASCII
> characters (characters with accents and such) as it tries to parse the POST
> body in its default ISO-8859-1. There appears to be no way to set/change the
> default encoding for a message body on Tomcat.
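
The failure described in the issue can be reproduced without a server. The
sketch below is an assumption-laden illustration, not Tomcat's actual parsing
code: it encodes a value as UTF-8, as the client does, and then decodes it
with a different charset, as a server falling back to ISO-8859-1 would. The
class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; it only simulates a client/server charset mismatch.
class CharsetMismatchDemo {
    // Encode with the client's charset, then decode with the server's charset.
    static String roundTrip(String value, String clientCharset, String serverCharset) {
        try {
            String body = URLEncoder.encode(value, clientCharset);
            return URLDecoder.decode(body, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset", e);
        }
    }

    public static void main(String[] args) {
        String query = "caf\u00e9";
        // Charsets agree: the original value comes back unchanged.
        System.out.println(roundTrip(query, "UTF-8", "UTF-8"));
        // Charsets disagree: each UTF-8 byte is read as its own character,
        // so the accented letter turns into two unrelated characters.
        System.out.println(roundTrip(query, "UTF-8", "ISO-8859-1"));
    }
}
```

When the charsets disagree, the two UTF-8 bytes of the accented character are
decoded as two separate ISO-8859-1 characters, which matches the garbled
queries seen with non-ASCII input.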