To clarify: I think your best first step is to turn off the retries
(however that can be done, since the current code is apparently
insufficient) and then post what the real underlying problem turns out
to be.  Alternatively, there may already be another exception in the
log that you didn't include, which would be helpful.  If you need to
figure out why the retries are still happening, you may wind up needing
to build the httpclient jar yourself after adding appropriate
diagnostics around the retry logic.  I'd be happy to work with you on
this, but probably not until this evening Boston time.
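
Since the log line only says "IO exception during indexing: null", it
may help to print every exception in the cause chain rather than just
the outermost message. Here is a minimal sketch using only the JDK; the
class name CauseChain and the stand-in exceptions in main are
hypothetical and are not part of ManifoldCF or HttpClient:

```java
import java.io.IOException;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;

class CauseChain {
    /** Returns one "ClassName: message" entry per throwable, outermost first. */
    static List<String> describe(Throwable t) {
        List<String> out = new ArrayList<>();
        // The depth limit guards against a cyclic cause chain.
        for (int depth = 0; t != null && depth < 32; depth++) {
            out.add(t.getClass().getName() + ": " + t.getMessage());
            t = t.getCause();
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-ins only: an outer exception with a null message wrapping
        // an assumed timeout, to mimic a "...: null" log line.
        Exception root = new SocketTimeoutException("Read timed out");
        Exception wrapper = new IOException(null, root);
        for (String line : describe(wrapper)) {
            System.out.println(line);
        }
    }
}
```

Calling something like describe(e) wherever the indexing exception is
caught and logged should show whether a timeout or some other failure
sits underneath the wrapper.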

Karl

On Thu, Mar 7, 2013 at 7:43 AM, Karl Wright <[email protected]> wrote:
> Hi Erlend,
>
> What is happening is the following.
>
> (1) Your indexing is failing
> (2) Httpclient by default retries 3 times on failure
> (3) Between retries, it resets the input stream, but this is not a
> resettable input stream, so the retry can't work.
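
To illustrate point (3): a stream that does not support mark/reset
cannot be rewound once the first attempt has consumed it, so there is
nothing left to send on a retry. The sketch below uses only the JDK;
NonResettableDemo, oneShot and canReplay are hypothetical names, and the
wrapper merely imitates a one-shot stream rather than reproducing what
HttpClient does internally:

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

class NonResettableDemo {
    /** Wraps a stream and hides mark/reset, imitating a one-shot stream. */
    static InputStream oneShot(InputStream in) {
        return new FilterInputStream(in) {
            @Override public boolean markSupported() { return false; }
            @Override public void mark(int readlimit) { /* ignored */ }
            @Override public void reset() throws IOException {
                throw new IOException("mark/reset not supported");
            }
        };
    }

    /** Consumes the stream once, then reports whether it can be read again. */
    static boolean canReplay(InputStream in) {
        try {
            if (in.markSupported()) {
                in.mark(Integer.MAX_VALUE); // remember the start position
            }
            while (in.read() != -1) {
                // the "first attempt" drains the whole body
            }
            in.reset();                     // throws if rewinding is impossible
            return in.read() != -1;         // data available again after rewind?
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] body = {1, 2, 3};
        System.out.println(canReplay(new ByteArrayInputStream(body)));
        System.out.println(canReplay(oneShot(new ByteArrayInputStream(body))));
    }
}
```

The in-memory stream can be replayed and the wrapped one cannot, which
is the situation a retry runs into with a non-repeatable request entity.
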
>
> Because of (3), the Solr Connector explicitly disables retries, using this 
> code:
>
>     // No retries: always decline, since the request body can't be replayed
>     localClient.setHttpRequestRetryHandler(new HttpRequestRetryHandler()
>       {
>         @Override
>         public boolean retryRequest(
>           IOException exception,
>           int executionCount,
>           HttpContext context)
>         {
>           return false;
>         }
>       });
>
>
> I don't know why that isn't working - it certainly used to.  Perhaps
> you could research it.
>
> Fundamentally, though, you have a problem upstream of that - you need
> to figure out why the indexing request is failing in the first place.
> It's likely to be a socket timeout or connection timeout underneath it
> all.
>
> Karl
>
> On Thu, Mar 7, 2013 at 7:34 AM, Erlend Garåsen <[email protected]> 
> wrote:
>>
>> Hello list,
>>
>> I'm getting the following error when the web crawler tries to post
>> documents to Solr 4: "IO exception during indexing: null". This happens for
>> all indexing attempts and just ends in the following:
>>
>> --8<--
>>  WARN 2013-03-01 19:59:51,360 (Worker thread '0') - Service interruption
>> reported for job 1362070726596 connection 'Web crawler': IO exception during
>> indexing: null
>> ERROR 2013-03-01 19:59:51,378 (Worker thread '0') - Exception tossed:
>> Repeated service interruptions - failure processing document: null
>> org.apache.manifoldcf.core.interfaces.ManifoldCFException: Repeated service
>> interruptions - failure processing document: null
>>         at
>> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:604)
>> Caused by: org.apache.http.client.ClientProtocolException
>>         at
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:909)
>>         at
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
>>         at
>> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:353)
>>         at
>> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>         at
>> org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
>>         at
>> org.apache.manifoldcf.agents.output.solr.HttpPoster$IngestThread.run(HttpPoster.java:833)
>> Caused by: org.apache.http.client.NonRepeatableRequestException: Cannot
>> retry request with a non-repeatable request entity.
>>         at
>> org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:695)
>>         at
>> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:522)
>> --8<--
>>
>> I'm running version 1.1.1 of MCF deployed on Resin. This does not happen on
>> our test server, which is configured identically to our prod server except
>> for some security restrictions. Basic auth is configured for both reading
>> and writing on the Solr server.
>>
>> I *did* get the same error the first time I deployed version 1.1.1 of MCF on
>> our test server, but it went away after I added the Solr core name in the
>> core/collection name field. On our production server I *do* have the core
>> name configured, so now I need help figuring out what's going on.
>>
>> The NonRepeatableRequestException is perhaps caused by a misconfiguration of
>> HttpClient 4, but I'm not sure this is the root of the problem I'm facing
>> here. It might be due to the basic auth restriction that is configured.
>> Anyway, this was not a problem in previous versions of MCF.
>>
>> Erlend
>>
>> --
>> Erlend Garåsen
>> Center for Information Technology Services
>> University of Oslo
>> P.O. Box 1086 Blindern, N-0317 OSLO, Norway
>> Ph: (+47) 22840193, Fax: (+47) 22852970, Mobile: (+47) 91380968, VIP: 31050
