Olivier Tavard created CONNECTORS-1610:
------------------------------------------
Summary: handle error 500 in WindowsShare repository connector
Key: CONNECTORS-1610
URL: https://issues.apache.org/jira/browse/CONNECTORS-1610
Project: ManifoldCF
Issue Type: Bug
Reporter: Olivier Tavard
Hi,
I have a question regarding error 500 in the WindowsShare repository connector.
I recently noticed a problem with a particular file whose metadata contains
non-ASCII characters. My pipeline in MCF basically consists of the embedded
Tika, and the data is then sent to Solr.
For this particular file (an AutoCAD file, by the way), Solr returns an error
500. This happens after the embedded Tika in MCF has extracted the content and
metadata and sent them to Solr.
The job does not stop, and the file is sent to Solr many times; Solr responds
with the same error each time.
The detail of the error in Solr is:
null:org.apache.commons.fileupload.FileUploadException: Header section has more than 10240 bytes (maybe it is not properly terminated)
at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)
at org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)
In the MCF simple history, I can see that the same file is retried endlessly
(see below) and the job is still running.
Would it be possible to change this behavior so that the file is skipped in
this case, or at least so that the job stops after a certain number of retries?
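To make the request concrete, here is a minimal sketch of a bounded-retry decision. It is not ManifoldCF code: the class `BoundedRetrySketch`, its `Action` values, the method `onResponse`, and the limit of 3 attempts are all hypothetical names and values chosen for illustration. It only shows the idea of counting consecutive 500 responses per document and giving up once a limit is reached.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, not part of ManifoldCF: caps how many times a single
 * document is resubmitted after the output server answers with HTTP 500.
 */
class BoundedRetrySketch {
    /** PROCEED: nothing to retry here; RETRY: send again; SKIP: give up on this document. */
    enum Action { PROCEED, RETRY, SKIP }

    private final int maxAttempts; // assumed limit on attempts per document
    private final Map<String, Integer> failures = new HashMap<>();

    BoundedRetrySketch(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
    }

    /** Records the outcome of one ingestion attempt and returns what to do next. */
    Action onResponse(String documentUri, int httpStatus) {
        if (httpStatus != 500) {
            // Any other status is outside this sketch; just reset the counter.
            failures.remove(documentUri);
            return Action.PROCEED;
        }
        // merge() stores and returns the updated failure count for this document.
        int count = failures.merge(documentUri, 1, Integer::sum);
        if (count >= maxAttempts) {
            failures.remove(documentUri);
            return Action.SKIP;
        }
        return Action.RETRY;
    }

    public static void main(String[] args) {
        BoundedRetrySketch policy = new BoundedRetrySketch(3);
        String uri = "file://///x.x.x.x/example.dwg"; // placeholder identifier
        for (int attempt = 1; attempt <= 3; attempt++) {
            System.out.println("attempt " + attempt + ": " + policy.onResponse(uri, 500));
        }
    }
}
```

With a limit of 3, the first two 500 responses yield RETRY and the third yields SKIP, so the job would move on instead of resubmitting the same file indefinitely. Returning an "abort the job" action instead of SKIP would cover the other option mentioned above.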
PS: I sent an email to the dev mailing list twice, but the messages never
showed up, which is why I created this issue directly.
Thanks,
Olivier
{code:java}
27/05/19 14:24:48 document ingest (DatafariSolrNoTika)
file://///x.x.x.x/testfiler0...
.dwg
500 34 369 Error from server at http://127.0.0.1:8983/solr/FileShare: Expected
mime type application/octet-stream but got application/json. { "error":{
"msg":"Header section has more than 10240 bytes (maybe it is not properly
terminated)", "trace":"org.apache.commons.fileupload.FileUploadException:
Header section has more than 10240 bytes (maybe it is not properly
terminated)\n\tat
org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)\n\tat
org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)\n\tat
org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:602)\n\tat
org.apache.solr.servlet.SolrRequestParsers$StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:784)\n\tat
org.apache.solr.servlet.So
27/05/19 14:24:47 extract [Tika]
file://///x.x.x.x/testfiler0...
.dwg
OK 34 74
27/05/19 14:23:45 document ingest (DatafariSolrNoTika)
file://///x.x.x.x/testfiler0...
.dwg
500 34 393 Error from server at http://127.0.0.1:8983/solr/FileShare: Expected
mime type application/octet-stream but got application/json. { "error":{
"msg":"Header section has more than 10240 bytes (maybe it is not properly
terminated)", "trace":"org.apache.commons.fileupload.FileUploadException:
Header section has more than 10240 bytes (maybe it is not properly
terminated)\n\tat
org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:362)\n\tat
org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:115)\n\tat
org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:602)\n\tat
org.apache.solr.servlet
{code}