Hi Lalit,

This looks like a Solr bug, but until I have a wire log from the ManifoldCF side to look at, I can't prove it.

Thanks,
Karl

Sent from my Windows Phone
------------------------------
From: lalit jangra
Sent: 8/23/2014 2:37 PM
To: [email protected]
Subject: Re: Getting java.io.IOException: missing CR Error

Thanks Karl,

I did some more investigation and found an "Invalid chunk header" error in the Solr logs. I don't think a firewall or proxy setting is the issue, as I am also indexing SharePoint with the same Solr instance and that is going well. I have also set Solr's multipartUploadLimitInKB to a high value of 204800000 KB.

974159 [http-bio-8080-exec-82] ERROR org.apache.solr.servlet.SolrDispatchFilter – null:org.apache.commons.fileupload.FileUploadBase$IOFileUploadException: Processing of multipart/form-data request failed. Invalid chunk header
    at org.apache.commons.fileupload.FileUploadBase.parseRequest(FileUploadBase.java:367)
    at org.apache.commons.fileupload.servlet.ServletFileUpload.parseRequest(ServletFileUpload.java:126)
    at org.apache.solr.servlet.SolrRequestParsers$MultipartRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:547)
    at org.apache.solr.servlet.SolrRequestParsers$StandardRequestParser.parseParamsAndFillStreams(SolrRequestParsers.java:681)
    at org.apache.solr.servlet.SolrRequestParsers.parse(SolrRequestParsers.java:150)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:393)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:197)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:98)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:315)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.IOException: Invalid chunk header
    at org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:172)
    at org.apache.coyote.http11.AbstractInputBuffer.doRead(AbstractInputBuffer.java:346)
    at org.apache.coyote.Request.doRead(Request.java:422)
    at org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:290)
    at org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:449)
    at org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:315)
    at org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:200)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at org.apache.commons.fileupload.util.LimitedInputStream.read(LimitedInputStream.java:125)
    at org.apache.commons.fileupload.MultipartStream$ItemInputStream.makeAvailable(MultipartStream.java:977)
    at org.apache.commons.fileupload.MultipartStream$ItemInputStream.read(MultipartStream.java:887)
    at java.io.InputStream.read(InputStream.java:101)

Please suggest.

On Fri, Aug 22, 2014 at 5:25 PM, Karl Wright <[email protected]> wrote:

> Hi,
>
> The error you are getting is because "Solr"'s response is not valid HTTP.
> It is therefore likely that it is not Solr itself that is the problem but
> rather some firewall or proxy that is failing to allow stuff to be posted
> through. Either that, or you have triggered some Solr error condition
> (maybe due to files being too large?) and Solr is erroneously responding
> with a non-HTTP response.
>
> The way to debug this is to turn on httpclient wire logging, and then you
> will see the back-and-forth with Solr that is the problem. You do this in
> the ManifoldCF logging.ini file. Here is a description of httpcomponents
> wire logging:
>
> https://hc.apache.org/httpcomponents-client-4.3.x/logging.html
>
> Karl
>
> On Fri, Aug 22, 2014 at 6:49 AM, lalit jangra <[email protected]>
> wrote:
>
>> Thanks,
>>
>> I checked everything, including replacing Solr with a new instance, but
>> the error still appears. With the same Solr, I am also indexing SharePoint
>> sites and that is working fine (so Solr seems to be fine).
>>
>> I am using CMIS 1.0 to connect to Alfresco in MCF on a Linux box. With the
>> same configuration on Windows, I do not get any error for Alfresco.
>>
>> Regards.
>>
>> On Thu, Aug 21, 2014 at 5:24 PM, Karl Wright <[email protected]> wrote:
>>
>>> Hi Lalit,
>>>
>>> Check your Solr instance. Something is going wrong talking to it.
>>>
>>> Karl
>>>
>>> On Thu, Aug 21, 2014 at 7:52 AM, lalit jangra <[email protected]>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am using MCF 1.5.1 and indexing Alfresco 4.2 using CMIS. It was
>>>> working fine until now, but suddenly the Alfresco job broke and I can
>>>> see the error below in manifoldcf.log.
>>>>
>>>> WARN 2014-08-21 12:28:27,030 (Worker thread '184') - Service
>>>> interruption reported for job 1408620030828 connection 'Alfresco': IO
>>>> exception during indexing
>>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/EPA-EPA%20Mission.pdf?id=37d1cc7e-e284-4466-ac2f-3d81dcaeb8a3%3B1.0:
>>>> missing CR
>>>>
>>>> WARN 2014-08-21 12:28:28,202 (Worker thread '76') - IO exception during
>>>> indexing
>>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/TCurran-Strategies%20for%20Domestic%20Wastewater%20Treatment.pdf?id=64e0e3c1-7b9e-451a-b4d1-876fbe5a0b8e%3B1.0:
>>>> missing CR
>>>>
>>>> java.io.IOException: missing CR
>>>>     at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:405)
>>>>     at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:572)
>>>>     at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
>>>>     at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
>>>>     at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>>>     at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3052)
>>>>     at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3046)
>>>>     at org.apache.http.entity.mime.content.InputStreamBody.writeTo(InputStreamBody.java:69)
>>>>     at org.apache.manifoldcf.agents.output.solr.ModifiedHttpMultipart.doWriteTo(ModifiedHttpMultipart.java:211)
>>>>     at org.apache.manifoldcf.agents.output.solr.ModifiedHttpMultipart.writeTo(ModifiedHttpMultipart.java:229)
>>>>     at org.apache.manifoldcf.agents.output.solr.ModifiedMultipartEntity.writeTo(ModifiedMultipartEntity.java:186)
>>>>     at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>>>>     at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>>>>     at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>>>>     at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>>>>     at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>>>>     at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>>>>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>>>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
>>>>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>>>>     at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>>>>     at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>>>     at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:923)
>>>>
>>>> WARN 2014-08-21 12:28:28,227 (Worker thread '76') - Service
>>>> interruption reported for job 1408620030828 connection 'Alfresco': IO
>>>> exception during indexing
>>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/TCurran-Strategies%20for%20Domestic%20Wastewater%20Treatment.pdf?id=64e0e3c1-7b9e-451a-b4d1-876fbe5a0b8e%3B1.0:
>>>> missing CR
>>>>
>>>> Please help.
>>>>
>>>> Regards,
>>>> Lalit.
>>
>> --
>> Regards,
>> Lalit.

--
Regards,
Lalit.
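
Note on the wire logging suggested earlier in this thread: the fragment below is a minimal sketch of what could be added to the ManifoldCF logging.ini. It assumes logging.ini is read as a log4j 1.x properties file; the two logger names are the ones given on the httpcomponents logging page linked above, and everything else in the file (appenders, root logger) is assumed to stay as it is.

```properties
# Assumption: logging.ini uses log4j 1.x properties syntax.
# Append these lines; do not remove the existing appender or rootLogger entries.

# Context logging for HttpClient (connection handling, request execution)
log4j.logger.org.apache.http=DEBUG

# Wire logging: everything HttpClient sends to and receives from the server
log4j.logger.org.apache.http.wire=DEBUG
```

Wire logging records full request and response bodies, so with PDF uploads the log is likely to grow very quickly. It is probably best to enable it only long enough to reproduce the "missing CR" failure on one document and then switch it off again.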
