[
https://issues.apache.org/jira/browse/CONNECTORS-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Karl Wright updated CONNECTORS-1434:
------------------------------------
Attachment: CONNECTORS-1434.patch
Tentative patch that escapes the file name, following a hint found on Stack
Overflow.
> Bad characters in file name can cause Solr 500 errors
> -----------------------------------------------------
>
> Key: CONNECTORS-1434
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1434
> Project: ManifoldCF
> Issue Type: Bug
> Components: Lucene/SOLR connector
> Affects Versions: ManifoldCF 2.7
> Reporter: Karl Wright
> Assignee: Karl Wright
> Fix For: ManifoldCF 2.8
>
> Attachments: CONNECTORS-1434.patch
>
>
> There are reports that quotes or spaces in a file name can break the Solr
> indexing of the document and cause Solr to return a 500 error.
> The code in question (from ModifiedHttpSolrClient) is the following:
> {code}
> String name = content.getName();
> if (name == null) {
>   name = "";
> }
> parts.add(new FormBodyPart(name,
>   new InputStreamBody(
>     content.getStream(),
>     contentType,
>     content.getName())));
> {code}
> ... where content.getName() may return a name containing illegal
> characters. The question is what httpclient does with this name, and
> whether it should be escaping it in some way.
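
For illustration only, here is a minimal sketch of one way such a name could be
escaped before it is handed to httpclient. It is not the attached patch. It
assumes the name ends up between double quotes in the part's
Content-Disposition header, so it backslash-escapes quotes and backslashes and
replaces control characters. The class FilenameEscaper and its escape() method
are hypothetical names introduced here. Spaces are left alone, since they are
legal inside a quoted string.

```java
// Hypothetical helper, not the attached patch: makes a file name safe to
// place inside a double-quoted Content-Disposition parameter.
final class FilenameEscaper {
    private FilenameEscaper() {}

    static String escape(String name) {
        // Mirror the null handling of the snippet in the issue description.
        if (name == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(name.length() + 8);
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            if (c == '"' || c == '\\') {
                // Quoted-pair escaping: prefix the character with a backslash.
                sb.append('\\').append(c);
            } else if (c < 0x20 || c == 0x7f) {
                // Control characters (CR, LF, ...) must not reach a header value.
                sb.append('_');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Under these assumptions, the snippet above would pass
FilenameEscaper.escape(content.getName()) as the last InputStreamBody argument
in place of content.getName(). Whether Solr accepts the backslash form, or
needs a different scheme such as percent-encoding, would still need checking
against a real server.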
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)