[ 
https://issues.apache.org/jira/browse/SOLR-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12607133#action_12607133
 ] 

Lars Kotthoff commented on SOLR-443:
------------------------------------

I've just done some tests with curl and a servlet that does nothing but parse 
the request parameters on Tomcat 5.5. POSTing a 48KB file as a single part 
takes about 13ms and generates about 50KB of traffic. Almost all of that time 
is spent processing at the client, i.e. executing curl and assembling the 
request. POSTing the same file as a multi-part request with 1 part per line 
(6318 parts total) takes about 80ms and generates about 650KB of traffic. About 
half of that time is spent at the client assembling the request.
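
As a rough sanity check on those traffic figures, the per-part framing cost of a multipart/form-data body can be estimated with the sketch below. It is only an illustration: the field name "q" and the 40-character boundary are assumptions (the boundary curl actually generated in the test is not known), and KB is taken as 1024 bytes.

```python
def multipart_body(values, field="q", boundary="x" * 40):
    """Build a multipart/form-data body with one part per value.

    The field name and the boundary are hypothetical placeholders.
    """
    parts = []
    for value in values:
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{field}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    # Closing delimiter that terminates the whole body.
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode("utf-8")


def per_part_overhead(field="q", boundary="x" * 40):
    """Framing bytes added to every part, independent of the payload."""
    closing = len(f"--{boundary}--\r\n")
    return len(multipart_body([""], field, boundary)) - closing


# Figures from the comment: about 600KB of extra traffic over 6318 parts.
reported_overhead = (650 - 50) * 1024 / 6318

print(per_part_overhead(), round(reported_overhead, 1))
```

Under these assumptions the framing alone accounts for most of the extra bytes per part, which is at least consistent with the roughly 13x growth in traffic reported above.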

The time was measured at the client and is the total time required for 
everything -- curl assembles the request, sends it to the server, the servlet 
parses the parameters, generates a dummy page, and sends it back. Client and 
server are connected over Gigabit Ethernet.

In conclusion, yes, the overhead is significant, but even with large requests 
it's nowhere near being a bottleneck. Processing more than 6000 queries is 
going to take significantly longer than 80ms ;)

But YMMV of course.

> POST queries don't declare its charset
> --------------------------------------
>
>                 Key: SOLR-443
>                 URL: https://issues.apache.org/jira/browse/SOLR-443
>             Project: Solr
>          Issue Type: Bug
>          Components: clients - java
>    Affects Versions: 1.2
>         Environment: Tomcat 6.0.14
>            Reporter: Andrew Schurman
>            Priority: Minor
>         Attachments: SOLR-443-multipart.patch, solr-443.patch, 
> solr-443.patch, SolrDispatchFilter.patch
>
>
> When sending a query via POST, the content-type is not set. The content 
> charset for the POST parameters is set, but this only appears to be used for 
> creating the Content-Length header in the commons library. Since a query is 
> encoded in UTF-8, the HTTP headers should also specify the content type charset.
> On Tomcat, this causes problems when the query string contains non-ASCII 
> characters (characters with accents and such) as it tries to parse the POST 
> body in its default ISO-8859-1. There appears to be no way to set/change the 
> default encoding for a message body on Tomcat.
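
The mismatch described in the issue can be reproduced outside of Solr and Tomcat. The following is a minimal sketch, not the client's actual code: it form-encodes a query as UTF-8 and then decodes the body both ways. The parameter name "q", the sample query and the helper names are assumptions chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode

# Header value a client could send so the server knows how to decode the body.
CONTENT_TYPE = "application/x-www-form-urlencoded; charset=UTF-8"


def encode_form(params):
    """Percent-encode form parameters using UTF-8 bytes."""
    return urlencode(params, encoding="utf-8")


def decode_form(body, charset):
    """Decode a form body the way a server would with the given charset."""
    return parse_qs(body, encoding=charset)


body = encode_form({"q": "caf\u00e9"})        # the query "café"
as_utf8 = decode_form(body, "utf-8")          # what the client intended
as_latin1 = decode_form(body, "iso-8859-1")   # what a Latin-1 default yields

print(body)
print(as_utf8["q"][0], as_latin1["q"][0])
```

Decoding with ISO-8859-1 turns each two-byte UTF-8 sequence into two separate characters, which is why accented characters come out garbled when the charset is not declared.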

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
