Thanks, Karl!
I will first try to apply the exact authentication restrictions from our prod server to our test server. If I get the same errors on the test server after changing the security settings, we can rule out some other possibilities.
Then it might be a good idea to turn off the retries. I have played around with HttpClient before and enabled this, so I think I know how to proceed. I will notify you.
Erlend

On 07.03.13 14.00, Karl Wright wrote:
FWIW, to clarify, I think you are going to be best served by first trying to turn off the retries (however that can be done, since the current code is apparently insufficient), and then posting what the real underlying problem seems to be. Alternatively, it is possible that there's already another exception dumped into the log that you didn't include which would be helpful.

If you need to figure out why the retries are still happening you may wind up needing to build the httpclient jar yourself, after adding appropriate diagnostics around the retry logic. I'd be happy to work with you on this but probably not until this evening Boston time.

Karl

On Thu, Mar 7, 2013 at 7:43 AM, Karl Wright <[email protected]> wrote:

Hi Erlend,

What is happening is the following.
(1) Your indexing is failing
(2) Httpclient by default retries 3 times on failure
(3) Between each retry, it resets the input stream, but this is not a resettable input stream, so that can't work.

Because of (3), the Solr Connector explicitly disables retries, using this code:

    // No retries
    localClient.setHttpRequestRetryHandler(new HttpRequestRetryHandler()
    {
      public boolean retryRequest(
        IOException exception,
        int executionCount,
        HttpContext context)
      {
        return false;
      }
    });

I don't know why that isn't working - it certainly used to. Perhaps you could research it.

Fundamentally, though, you have a problem upstream of that - you need to figure out why the indexing request is failing in the first place. It's likely to be a socket timeout or connection timeout underneath it all.

Karl

On Thu, Mar 7, 2013 at 7:34 AM, Erlend Garåsen <[email protected]> wrote:

Hello list,

I'm getting the following error when the web crawler is trying to post documents to Solr 4: IO exception during indexing: null.
This happens for all indexing attempts and just ends in the following:

--8<--
WARN 2013-03-01 19:59:51,360 (Worker thread '0') - Service interruption reported for job 1362070726596 connection 'Web crawler': IO exception during indexing: null
ERROR 2013-03-01 19:59:51,378 (Worker thread '0') - Exception tossed: Repeated service interruptions - failure processing document: null
org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service interruptions - failure processing document: null
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:604)
Caused by: org.apache.http.client.ClientProtocolException
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
        at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
        at org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:833)
Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot retry request with a non-repeatable request entity.
        at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:695)
        at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:522)
--8<--

I'm running version 1.1.1 of MCF deployed on Resin. This does not happen on our test server, which is configured identically to our prod server except for some security restrictions. Basic auth is configured for both reading and writing on the Solr server.
I *did* get the same error the first time I deployed version 1.1.1 of MCF on our test server, but it went away after I added the Solr core name in the core/collection name field. On our production server I *do* have the core name configured, so now I need help in order to figure out what's going on.

The NonRepeatableRequestException is perhaps caused by a misconfiguration of HttpClient 4, but I'm not sure this is the root of the problem I'm facing here. It might be due to the basic auth restriction which is configured. Anyway, this was not a problem for previous versions of MCF.

Erlend

--
Erlend Garåsen
Center for Information Technology Services
University of Oslo
P.O. Box 1086 Blindern, N-0317 OSLO, Norway
Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
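
To illustrate point (3) in Karl's explanation, here is a minimal sketch in plain Java of why a streamed request body cannot simply be sent again, and why a body held in memory can. It deliberately uses only java.io and does not call HttpClient, SolrJ or ManifoldCF. The class RepeatableBodySketch and its methods oneShot, buffer and canReplay are hypothetical names invented for this example, and it says nothing about how the Solr Connector actually builds its requests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of ManifoldCF, SolrJ or HttpClient.
class RepeatableBodySketch {

    // Wraps bytes in a stream that can be read only once: it inherits
    // InputStream's defaults, so markSupported() is false and reset() throws.
    static InputStream oneShot(byte[] data) {
        final InputStream source = new ByteArrayInputStream(data);
        return new InputStream() {
            @Override
            public int read() throws IOException {
                return source.read();
            }
        };
    }

    // Reads the stream to its end and returns everything that was read.
    static byte[] buffer(InputStream body) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = body.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reports whether the body can be rewound for another attempt.
    static boolean canReplay(InputStream body) {
        try {
            body.reset();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] doc = "<add><doc/></add>".getBytes(StandardCharsets.UTF_8);

        // First attempt consumes the streamed body; it cannot be rewound afterwards.
        InputStream streamed = oneShot(doc);
        byte[] copy = buffer(streamed);
        System.out.println("streamed body replayable: " + canReplay(streamed));

        // A body backed by a byte array can be rewound and read again.
        InputStream buffered = new ByteArrayInputStream(copy);
        buffer(buffered);
        System.out.println("buffered body replayable: " + canReplay(buffered));
    }
}
```

The trade-off is memory: buffering makes a body replayable only by holding the whole document in a byte array, which may be one reason to disable retries for large streamed documents instead, as the connector code quoted above does.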
