[ 
https://issues.apache.org/jira/browse/CONNECTORS-854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866523#comment-13866523
 ] 

Karl Wright commented on CONNECTORS-854:
----------------------------------------

I've verified that 100-expect-continue had nothing to do with the stale 
connection check being added.
The branch branches/CONNECTORS-120 does not have this parameter at all; it 
was added later as part of CONNECTORS-120, and I can find no indication as to 
why.

What I propose is that we change it *everywhere* that it is set:

- WebConnector
- RSS
- SharePoint
- Meridio
- Jira
- GoogleDocs
- Dropbox
- Livelink

Then, run all the tests.  If they pass, I think it is OK to commit.
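
For anyone unfamiliar with what the check buys us, here is a minimal sketch of 
the idea behind a stale connection check, using only java.net. It is an 
illustration, not ManifoldCF or HttpClient code: the class StaleCheckSketch and 
its methods isStale and demo are hypothetical names, and the 1 ms probe is an 
assumption about how such a check is typically done.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; not taken from ManifoldCF or HttpClient.
class StaleCheckSketch {

    // Returns true if the peer has closed the connection or it is unusable.
    // Assumes no unread response data is pending on the socket: a byte that
    // arrived during the probe would be consumed and lost by this sketch.
    static boolean isStale(Socket socket) {
        try {
            int oldTimeout = socket.getSoTimeout();
            try {
                socket.setSoTimeout(1);            // probe for about 1 ms
                InputStream in = socket.getInputStream();
                return in.read() < 0;              // -1 means the peer sent EOF
            } finally {
                socket.setSoTimeout(oldTimeout);   // restore the caller's timeout
            }
        } catch (SocketTimeoutException e) {
            return false;                          // nothing to read, still open
        } catch (IOException e) {
            return true;                           // reset or otherwise broken
        }
    }

    // Opens a loopback connection; the "server" side either keeps it open
    // or closes it, which simulates a server-side keep-alive timeout.
    static boolean demo(boolean serverCloses) throws Exception {
        try (ServerSocket server =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client =
                 new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            if (serverCloses) {
                accepted.close();
                Thread.sleep(100);                 // let the FIN reach the client
            }
            return isStale(client);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("open connection stale?   " + demo(false));
        System.out.println("closed connection stale? " + demo(true));
    }
}
```

With the check off, a pooled connection that the server has already closed is 
reused as-is, and the failure only shows up when the request gets no response. 
The probe costs a short read per reuse, which is the trade-off to keep in mind 
when enabling it everywhere.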


> Enable STALE_CONNECTION_CHECK
> -----------------------------
>
>                 Key: CONNECTORS-854
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-854
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: Web connector
>    Affects Versions: ManifoldCF 1.4.1
>            Reporter: Shinichiro Abe
>            Priority: Minor
>             Fix For: ManifoldCF 1.5
>
>
> When crawling some sites (fewer than 1000 docs), manifoldcf.log sometimes 
> shows the following "The target server failed to respond" messages. It seems 
> that NoHttpResponseException is thrown in ThrottledFetcher.
> {noformat}
>  WARN 2014-01-09 12:39:16,701 (Worker thread '10') - Pre-ingest service 
> interruption reported for job 1389238470356 connection '1': Timed out waiting 
> for response for 'http://www.rondhuit.com/?p=1890': The target server failed 
> to respond
>  WARN 2014-01-09 12:39:55,509 (Worker thread '7') - Pre-ingest service 
> interruption reported for job 1389238470356 connection '1': Timed out waiting 
> for response for 'http://www.rondhuit.com/?p=675': The target server failed 
> to respond
> {noformat}
> Fetching the same page again after the retry time (15 minutes) had passed 
> succeeded.
> I tried changing the httpclient configuration and confirmed that the message 
> was no longer shown.
> {noformat}
> +++ connectors/webcrawler/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/webcrawler/ThrottledFetcher.java
> @@ -463,7 +463,7 @@
>          BasicHttpParams params = new BasicHttpParams();
>          params.setParameter(ClientPNames.DEFAULT_HOST,fetchHost);
>          params.setBooleanParameter(CoreConnectionPNames.TCP_NODELAY,true);
> -        params.setBooleanParameter(CoreConnectionPNames.STALE_CONNECTION_CHECK,false);
> +        params.setBooleanParameter(CoreConnectionPNames.STALE_CONNECTION_CHECK,true);
>          params.setBooleanParameter(ClientPNames.ALLOW_CIRCULAR_REDIRECTS,true);
> {noformat}
> I know two users who are hitting this issue and have resolved it by turning 
> on the stale connection check.
> The crawling job also finishes more quickly than when the check is false, 
> because there are no retry fetches.
> May I switch false to true in stale connection check as well as 
> SolrConnector's httpclient configuration?



