Hi,

The error you are getting occurs because the response from "Solr" is not valid HTTP. It is therefore likely that Solr itself is not the problem, but rather some firewall or proxy that is failing to pass the posted data through. Either that, or you have triggered some Solr error condition (maybe because the files are too large?) and Solr is erroneously answering with a non-HTTP response.
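For context, the "missing CR" text in the quoted trace is thrown from the JDK's sun.net.www.http.ChunkedInputStream while it reads a chunked HTTP body. In chunked transfer encoding, each chunk's data must be followed by CR LF, and a reader complains when it finds some other byte there. The following is only a minimal sketch of that framing check; the class and method names are hypothetical and this is not the JDK's actual code (it also ignores chunk extensions and trailers).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the JDK's ChunkedInputStream.
class ChunkedFramingSketch {

    // Decodes a chunked body of the form "<hex size>CRLF<data>CRLF ... 0CRLFCRLF".
    static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while (true) {
            int size = readChunkSize(in);
            if (size == 0) {
                return out.toByteArray(); // last chunk; trailers are ignored in this sketch
            }
            for (int i = 0; i < size; i++) {
                int b = in.read();
                if (b < 0) throw new IOException("premature EOF");
                out.write(b);
            }
            // The data of every chunk must be terminated by CR LF.
            if (in.read() != '\r') throw new IOException("missing CR");
            if (in.read() != '\n') throw new IOException("missing LF");
        }
    }

    // Reads the hexadecimal chunk-size line, up to and including its CR LF.
    private static int readChunkSize(InputStream in) throws IOException {
        StringBuilder hex = new StringBuilder();
        int b;
        while ((b = in.read()) != '\r') {
            if (b < 0) throw new IOException("premature EOF");
            hex.append((char) b);
        }
        if (in.read() != '\n') throw new IOException("missing LF");
        return Integer.parseInt(hex.toString().trim(), 16);
    }

    public static void main(String[] args) throws IOException {
        // Well-formed body: one 5-byte chunk, then the terminating zero-size chunk.
        byte[] good = "5\r\nhello\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(new String(decode(new ByteArrayInputStream(good)), StandardCharsets.US_ASCII));

        // Malformed body: the chunk data is not followed by CR LF.
        byte[] bad = "5\r\nhelloXX0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        try {
            decode(new ByteArrayInputStream(bad));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running the sketch prints "hello" for the well-formed body and "missing CR" for the malformed one, which is the kind of framing violation a truncated or rewritten response would produce.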
The way to debug this is to turn on HttpClient wire logging; you will then see the exchange with Solr that is causing the problem. You do this in the ManifoldCF logging.ini file. Here is a description of HttpComponents wire logging:

https://hc.apache.org/httpcomponents-client-4.3.x/logging.html

Karl

On Fri, Aug 22, 2014 at 6:49 AM, lalit jangra <[email protected]> wrote:

> Thanks,
>
> I checked everything, including replacing with new solr instance but still
> this error appears. Next with same solr, i am indexing SharePoint sites as
> well and its working fine (solr seems to be fine).
>
> I am using CMIS 1.0 to connect to alfresco in MCF on linux box . For same
> set of configurations in windows , i am not getting any error for alfresco.
>
> Regards.
>
>
> On Thu, Aug 21, 2014 at 5:24 PM, Karl Wright <[email protected]> wrote:
>
>> Hi Lalit,
>>
>> Check your Solr instance. Something is going wrong talking to it.
>>
>> Karl
>>
>>
>>
>> On Thu, Aug 21, 2014 at 7:52 AM, lalit jangra <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am using MCF 1.5.1 & indexing Alfresco 4.2 using CMIS. It was working
>>> fine till now but suddenly Alfresco job broke & i could see below error in
>>> manifoldcf.log.
>>>
>>> WARN 2014-08-21 12:28:27,030 (Worker thread '184') - Service
>>> interruption reported for job 1408620030828 connection 'Alfresco': IO
>>> exception during indexing
>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/EPA-EPA%20Mission.pdf?id=37d1cc7e-e284-4466-ac2f-3d81dcaeb8a3%3B1.0:
>>> missing CR
>>>
>>> WARN 2014-08-21 12:28:28,202 (Worker thread '76') - IO exception during
>>> indexing
>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/TCurran-Strategies%20for%20Domestic%20Wastewater%20Treatment.pdf?id=64e0e3c1-7b9e-451a-b4d1-876fbe5a0b8e%3B1.0:
>>> missing CR
>>>
>>> java.io.IOException: missing CR
>>> at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:405)
>>> at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:572)
>>> at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
>>> at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
>>> at java.io.FilterInputStream.read(FilterInputStream.java:133)
>>> at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3052)
>>> at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3046)
>>> at org.apache.http.entity.mime.content.InputStreamBody.writeTo(InputStreamBody.java:69)
>>> at org.apache.manifoldcf.agents.output.solr.ModifiedHttpMultipart.doWriteTo(ModifiedHttpMultipart.java:211)
>>> at org.apache.manifoldcf.agents.output.solr.ModifiedHttpMultipart.writeTo(ModifiedHttpMultipart.java:229)
>>> at org.apache.manifoldcf.agents.output.solr.ModifiedMultipartEntity.writeTo(ModifiedMultipartEntity.java:186)
>>> at org.apache.http.entity.HttpEntityWrapper.writeTo(HttpEntityWrapper.java:98)
>>> at org.apache.http.impl.client.EntityEnclosingRequestWrapper$EntityWrapper.writeTo(EntityEnclosingRequestWrapper.java:108)
>>> at org.apache.http.impl.entity.EntitySerializer.serialize(EntitySerializer.java:122)
>>> at org.apache.http.impl.AbstractHttpClientConnection.sendRequestEntity(AbstractHttpClientConnection.java:271)
>>> at org.apache.http.impl.conn.ManagedClientConnectionImpl.sendRequestEntity(ManagedClientConnectionImpl.java:197)
>>> at org.apache.http.protocol.HttpRequestExecutor.doSendRequest(HttpRequestExecutor.java:257)
>>> at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>> at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:715)
>>> at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:520)
>>> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
>>> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>>> at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>>> at org.apache.manifoldcf.agents.output.solr.ModifiedHttpSolrServer.request(ModifiedHttpSolrServer.java:291)
>>> at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
>>> at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>> at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:923)
>>>
>>> WARN 2014-08-21 12:28:28,227 (Worker thread '76') - Service interruption
>>> reported for job 1408620030828 connection 'Alfresco': IO exception during
>>> indexing
>>> http://iwdc2devbld02:8080/alfresco/api/-default-/public/cmis/versions/1.0/atom/content/TCurran-Strategies%20for%20Domestic%20Wastewater%20Treatment.pdf?id=64e0e3c1-7b9e-451a-b4d1-876fbe5a0b8e%3B1.0:
>>> missing CR
>>>
>>> Please help.
>>>
>>> Regards,
>>> Lalit.
>>>
>>
>>
>
>
> --
> Regards,
> Lalit.
>
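
P.S. A minimal sketch of the wire-logging settings mentioned at the top of this message. It assumes that logging.ini uses log4j 1.x properties syntax, as the log4j examples on the linked HttpComponents page do, and that the existing appender configuration in the file is left unchanged. The two logger names are the HttpClient 4.x wire and header categories; whether they capture the part of the exchange that fails here is something to verify on your own setup.

```properties
# Assumption: added to the existing ManifoldCF logging.ini, which already defines the root logger and appender.
# Log every byte HttpClient sends and receives (very verbose; turn off again after debugging).
log4j.logger.org.apache.http.wire=DEBUG
# Log request and response headers only.
log4j.logger.org.apache.http.headers=DEBUG
```

Restart ManifoldCF after editing the file if the change is not picked up, then rerun the job and look in manifoldcf.log for the request and response around the failing document.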
