[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karl Wright resolved CONNECTORS-1612.
-------------------------------------
       Resolution: Fixed
    Fix Version/s: ManifoldCF 2.14

r1861582


> Postpone files in SMBException
> ------------------------------
>
>                 Key: CONNECTORS-1612
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
>             Project: ManifoldCF
>          Issue Type: Improvement
>          Components: JCIFS connector
>    Affects Versions: ManifoldCF 2.12
>            Reporter: Julien Massiera
>            Assignee: Karl Wright
>            Priority: Critical
>             Fix For: ManifoldCF 2.14
>
>
> When crawling with the JCIFS connector, some unexpected errors raise an 
> exception of class "SMBException", which MCF catches.
> The current behavior is for the job to abort after a few retries.
> Although SMBException is a generic error class, we think it is worthwhile, 
> before aborting the job, to postpone the problematic files and first try the 
> documents already in the pipeline. This way, the job can move on before 
> developers have to study the particular problems. More precisely, the 
> algorithm could look like the following:
> Whenever a job encounters an error that is not clearly identified:
> 1. It immediately retries once.
> 2. If the retry succeeds, the crawl moves on as usual.
> 3. If the retry fails, the job moves the document to the current end of the 
> processing pipeline and crawls the remaining documents. It sets the attempt 
> counter for this document to 2.
> 4. When the job encounters the document again, it tries again. If the attempt 
> succeeds, the crawl moves on as usual. If it fails, the job moves the document 
> to the current end of the processing pipeline, increments the attempt counter 
> by 1, and doubles the delay between two attempts.
> 5. This repeats until the maximum number of attempts for the problematic 
> document has been reached. If the last attempt fails, the crawl is aborted. 
> With this behavior, a job is still aborted on critical errors, but it will at 
> least have crawled as many non-problematic documents as possible before the 
> failure.
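
The steps proposed in the description can be sketched in plain Java as follows. This is only an illustration of the postpone-and-retry idea, not the change committed in r1861582 and not the ManifoldCF or JCIFS API: the class PostponeSketch, the crawl method, the fetch predicate (true means the document was fetched), and the pause hook are all hypothetical names. It also assumes the delay is applied before each re-attempt of a postponed document, which the description leaves open.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Predicate;

// Hypothetical sketch of the proposed algorithm; none of these names exist in ManifoldCF.
public class PostponeSketch {

    /** Outcome of a crawl: documents fetched, and the document that caused an abort (null if none). */
    public static final class Result {
        public final List<String> crawled = new ArrayList<>();
        public String abortedOn = null;
    }

    /** Queue entry: a document with its attempt counter and current retry delay. */
    private static final class Entry {
        final String doc;
        int attempts = 0;
        long delayMs;
        Entry(String doc, long delayMs) { this.doc = doc; this.delayMs = delayMs; }
    }

    public static Result crawl(List<String> docs, Predicate<String> fetch,
                               int maxAttempts, long initialDelayMs, LongConsumer pause) {
        Result result = new Result();
        Deque<Entry> queue = new ArrayDeque<>();
        for (String d : docs) {
            queue.addLast(new Entry(d, initialDelayMs));
        }
        while (!queue.isEmpty()) {
            Entry e = queue.pollFirst();
            boolean ok;
            if (e.attempts == 0) {
                // First encounter: try, and on failure retry once immediately (step 1).
                e.attempts = 1;
                ok = fetch.test(e.doc);
                if (!ok) {
                    e.attempts = 2; // step 3: counter is now 2
                    ok = fetch.test(e.doc);
                }
            } else {
                // Postponed document (step 4): wait, then try again.
                // Assumption: a real scheduler would reschedule instead of blocking here.
                pause.accept(e.delayMs);
                e.attempts++;
                ok = fetch.test(e.doc);
                if (!ok) {
                    e.delayMs *= 2; // double the delay for the next attempt
                }
            }
            if (ok) {
                result.crawled.add(e.doc); // step 2: move on as usual
            } else if (e.attempts >= maxAttempts) {
                result.abortedOn = e.doc;  // step 5: give up and abort the crawl
                return result;
            } else {
                queue.addLast(e);          // postpone to the current end of the pipeline
            }
        }
        return result;
    }
}
```

With a document that always fails, the other queued documents are still crawled before the abort, which is the behavior the request asks for. The pause hook is passed in so that a caller can sleep, reschedule, or (in a test) just record the delays.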



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
