[
https://issues.apache.org/jira/browse/SOLR-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler resolved SOLR-4283.
---------------------------------
Resolution: Fixed
Committed to trunk and 4.x.
A next step would be to make the encoding of the GET URLs configurable (using
the de facto standard "&ie=charset" URL parameter, as used by most REST
web services of major search engines).
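
As a rough illustration of that next step (not part of the committed patch), an "ie" parameter could be mapped to a java.nio Charset with UTF-8 as the fallback. The class and method names below are hypothetical, and the sketch assumes the raw "ie" value has already been extracted from the URL.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

/** Hypothetical helper; the name and behavior are assumptions, not Solr API. */
public class InputEncodingSketch {

    /** Returns the charset named by the "ie" parameter, or UTF-8 if it is absent. */
    public static Charset resolve(String ie) {
        if (ie == null || ie.isEmpty()) {
            return StandardCharsets.UTF_8; // assumed default when no "ie" is given
        }
        try {
            return Charset.forName(ie);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Report a bad charset name instead of silently guessing one.
            throw new IllegalArgumentException("Unsupported ie charset: " + ie, e);
        }
    }
}
```
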
> Improve URL decoding (followup of SOLR-4265)
> --------------------------------------------
>
> Key: SOLR-4283
> URL: https://issues.apache.org/jira/browse/SOLR-4283
> Project: Solr
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 4.1, 5.0
>
> Attachments: index.jsp, request.http, SOLR-4283.patch,
> SOLR-4283.patch, SOLR-4283.patch, SOLR-4283.patch, SOLR-4283.patch
>
>
> Followup of SOLR-4265:
> SOLR-4265 has 2 problems:
> - it reads the whole InputStream into a String, and this one can be big. This
> wastes memory, especially when your query string from the POSTed form data is
> near the 2 megabyte limit. The String is then split and packed into a
> big Map.
> - it does not report corrupt UTF-8
> The attached patch will do 2 things:
> - The decoding of the POSTed form data is done on the ServletInputStream,
> directly parsing the bytes (not chars). Key/Value pairs are extracted and
> %-decoded to byte[] on the fly. URL-parameters from getQueryString() are
> parsed with the same code using ByteArrayInputStream on the original String,
> interpreted as UTF-8 (this is a hack, because the Servlet API does not give back
> the original bytes from the HTTP request). To conform to the standards, the query
> String should be interpreted as US-ASCII, but with this approach, UTF-8 in the
> HTTP request that is not fully escaped survives.
> - the byte[] key/value pairs are converted to Strings using CharsetDecoder
> This will be memory efficient and will report incorrectly escaped form data, so
> people will no longer complain if searches hit no results or similar.
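
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patch: the class name FormDataSketch and its parse method are hypothetical, and it assumes UTF-8 form data. Bytes are read directly from the stream, "%XX" and "+" are decoded into byte buffers on the fly, and each key/value pair is turned into a String with a CharsetDecoder configured to report malformed input rather than replace it.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; names are assumptions, not the classes in the patch. */
public class FormDataSketch {

    /** Parses URL-encoded key/value pairs from raw bytes without building one big String. */
    public static Map<String, List<String>> parse(InputStream in) throws IOException {
        Map<String, List<String>> params = new LinkedHashMap<>();
        ByteArrayOutputStream key = new ByteArrayOutputStream();
        ByteArrayOutputStream value = new ByteArrayOutputStream();
        ByteArrayOutputStream current = key;
        int b;
        while ((b = in.read()) != -1) {
            switch (b) {
                case '&': // end of one pair
                    add(params, key, value);
                    key.reset();
                    value.reset();
                    current = key;
                    break;
                case '=': // first '=' switches from key to value
                    if (current == key) {
                        current = value;
                    } else {
                        current.write(b);
                    }
                    break;
                case '+': // form encoding for a space
                    current.write(' ');
                    break;
                case '%': // two hex digits follow
                    current.write((hex(in.read()) << 4) | hex(in.read()));
                    break;
                default:
                    current.write(b);
            }
        }
        add(params, key, value);
        return params;
    }

    private static int hex(int b) throws IOException {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        throw new IOException("Invalid %-escape in form data");
    }

    private static void add(Map<String, List<String>> params,
                            ByteArrayOutputStream key,
                            ByteArrayOutputStream value) throws CharacterCodingException {
        if (key.size() == 0 && value.size() == 0) {
            return; // skip empty pairs such as "a=1&&b=2"
        }
        params.computeIfAbsent(decode(key), k -> new ArrayList<>()).add(decode(value));
    }

    /** Strict decoding: corrupt UTF-8 raises CharacterCodingException. */
    private static String decode(ByteArrayOutputStream bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes.toByteArray())).toString();
    }
}
```

Under the same assumptions, a query string obtained from getQueryString() could be fed through the same method by wrapping queryString.getBytes(StandardCharsets.UTF_8) in a ByteArrayInputStream.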