[jira] [Commented] (CONNECTORS-1612) Postpone files in SMBException

2019-06-19 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867466#comment-16867466
 ] 

Karl Wright commented on CONNECTORS-1612:
-

I do not want to add yet more configuration to an already extremely complex 
connector.  If the use case you are describing (long, automatic crawls) is 
where this really is seen, then I think we're good.


> Postpone files in SMBException
> --
>
> Key: CONNECTORS-1612
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1612
> Project: ManifoldCF
> Issue Type: Improvement
> Components: JCIFS connector
> Affects Versions: ManifoldCF 2.12
> Reporter: Julien Massiera
> Assignee: Karl Wright
> Priority: Critical
> Fix For: ManifoldCF 2.14
>
>
> When crawling with the JCIFS connector, some unexpected errors may raise an 
> exception of class "SMBException", which is caught by MCF.
> The current behavior is for the job to abort after a few retries.
> Although SMBException is a generic class, we consider it worthwhile, before 
> aborting the job, to postpone the problematic files and try the ones already 
> in the pipe. This way, the job can move on before developers have to study 
> the particular problems. More precisely, the algorithm could look like the 
> following:
> Whenever a job encounters an error that is not clearly identified:
> 1. It immediately retries one time; 
> 2. If it succeeds, the crawl moves on as usual; 
> 3. If it fails, the job moves this document to the current end of the 
> processing pipeline and crawls the remaining documents. It sets the 
> attempt counter for this document to 2.
> 4. When encountering this document again, the job tries again. If it 
> succeeds, the crawl moves on as usual. If it fails, it moves this document to 
> the current end of the processing pipeline, increments the counter by 1, and 
> doubles the delay between two attempts.
> 5. We iterate until the maximum number of attempts for the problematic 
> document has been reached. If it still fails, the crawl is aborted. With 
> this behavior, a job is still aborted on critical errors, but at least we 
> are able to crawl as many non-problematic documents as possible before the 
> failure.
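
The five steps proposed in the description can be sketched as a small standalone simulation. This is only an illustration of the proposal, not ManifoldCF code: the names PostponeSketch, crawl, tryFetch, maxAttempts and initialDelayMillis are all hypothetical, the pipeline is modeled as a simple in-memory deque, and the doubled delay is recorded rather than actually waited for.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical simulation of the proposed "postpone and retry" algorithm.
class PostponeSketch {
    static final class Result {
        final List<String> crawled = new ArrayList<>();
        // Last delay assigned to each postponed document (recorded, not slept).
        final Map<String, Long> delayMillis = new HashMap<>();
        // Document that exhausted its attempts, or null if the crawl finished.
        String abortedOn;
    }

    static Result crawl(List<String> docs, Predicate<String> tryFetch,
                        int maxAttempts, long initialDelayMillis) {
        Deque<String> pipeline = new ArrayDeque<>(docs);
        Map<String, Integer> attempts = new HashMap<>();
        Result result = new Result();
        while (!pipeline.isEmpty()) {
            String doc = pipeline.pollFirst();
            int count = attempts.getOrDefault(doc, 0);
            boolean ok = tryFetch.test(doc);
            count++;
            if (!ok && count == 1) {
                // Step 1: the very first failure gets one immediate retry.
                ok = tryFetch.test(doc);
                count++;
            }
            if (ok) {
                // Step 2: success, the crawl moves on as usual.
                result.crawled.add(doc);
                continue;
            }
            attempts.put(doc, count);
            if (count >= maxAttempts) {
                // Step 5: attempts exhausted, abort the crawl.
                result.abortedOn = doc;
                return result;
            }
            // Steps 3 and 4: postpone to the end of the pipeline; the delay
            // starts at the initial value and doubles on each later failure.
            Long previous = result.delayMillis.get(doc);
            long next = (previous == null) ? initialDelayMillis : previous * 2;
            result.delayMillis.put(doc, next);
            pipeline.addLast(doc);
        }
        return result;
    }
}
```

In this sketch the healthy documents are all crawled before the problematic one exhausts its attempts, which is the behavior the description asks for.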



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CONNECTORS-1612) Postpone files in SMBException

2019-06-19 Thread Julien Massiera (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16867429#comment-16867429
 ] 

Julien Massiera commented on CONNECTORS-1612:
-

Thanks [~kwri...@metacarta.com] for the fix. Could we make this retry time a 
configurable parameter? It would be valuable, since some jobs may last longer 
than others. And concerning my suggestion to increase the time at each retry, 
could we apply a formula at each retry, such as next retry = retry time * retry 
number? That would give more flexibility than a fixed value.
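
To make the suggested formula concrete, here is a minimal sketch of next retry = retry time * retry number. The class and method names (LinearRetryDelay, delayMillis) and the one-hour base value are assumptions for illustration only; they are not part of ManifoldCF or the JCIFS connector.

```java
// Hypothetical helper illustrating a linearly growing retry delay.
class LinearRetryDelay {
    // Returns baseDelayMillis * retryNumber, where retryNumber starts at 1.
    static long delayMillis(long baseDelayMillis, int retryNumber) {
        if (baseDelayMillis < 0 || retryNumber < 1) {
            throw new IllegalArgumentException(
                "base delay must be >= 0 and retry number must be >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(baseDelayMillis, (long) retryNumber);
    }

    public static void main(String[] args) {
        long oneHourMillis = 60L * 60L * 1000L; // illustrative base value
        for (int retry = 1; retry <= 4; retry++) {
            System.out.println("retry " + retry + " -> "
                + delayMillis(oneHourMillis, retry) + " ms");
        }
    }
}
```

With a one-hour base, the delays grow as 1 h, 2 h, 3 h, 4 h instead of staying at a single fixed value.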

[jira] [Commented] (CONNECTORS-1612) Postpone files in SMBException

2019-06-18 Thread Karl Wright (JIRA)


[ 
https://issues.apache.org/jira/browse/CONNECTORS-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16866714#comment-16866714
 ] 

Karl Wright commented on CONNECTORS-1612:
-

{quote}
3. If it fails, the job moves this document to the current end of the 
processing pipeline and crawls the remaining documents. It sets the 
attempt counter for this document to 2.
4. When encountering this document again, the job tries again. If it succeeds, 
the crawl moves on as usual. If it fails, it moves this document to the current 
end of the processing pipeline, increments the counter by 1, and doubles the 
delay between two attempts.
{quote}

This logic is impossible to implement with the current architecture, given the 
way documents are queued and processed.  You will have to make do with the 
standard retry backoff mechanism that is already in place in the framework for 
documents that have retry-able errors.  These are not put at the "back of the 
queue" but are instead given a specific time that they are retried, and will 
not be looked at again until that time occurs.  For the SMB exceptions, we can 
make this time be something on the order of six hours or so; that should cover 
any intermittent problems with infrastructure.
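
The difference between "back of the queue" and "a specific retry time" can be illustrated roughly as follows. This is a hypothetical sketch and not ManifoldCF's actual queueing code: RetryTimeQueue, deferAfterError and pollDue are invented names, and the six-hour constant simply mirrors the order of magnitude mentioned above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: failed documents carry a retry time and are ignored until it passes.
class RetryTimeQueue {
    static final long SIX_HOURS_MILLIS = 6L * 60L * 60L * 1000L;

    // A document paired with the earliest time (epoch millis) it may be retried.
    private static final class Entry {
        final String documentId;
        final long retryAtMillis;

        Entry(String documentId, long retryAtMillis) {
            this.documentId = documentId;
            this.retryAtMillis = retryAtMillis;
        }
    }

    // Ordered by retry time, so the next document to become due is at the head.
    private final PriorityQueue<Entry> queue =
        new PriorityQueue<>(Comparator.comparingLong((Entry e) -> e.retryAtMillis));

    // Schedule a document for retry at nowMillis + delayMillis.
    void deferAfterError(String documentId, long nowMillis, long delayMillis) {
        queue.add(new Entry(documentId, nowMillis + delayMillis));
    }

    // Remove and return only the documents whose retry time has been reached.
    List<String> pollDue(long nowMillis) {
        List<String> due = new ArrayList<>();
        while (!queue.isEmpty() && queue.peek().retryAtMillis <= nowMillis) {
            due.add(queue.poll().documentId);
        }
        return due;
    }
}
```

A document deferred with SIX_HOURS_MILLIS is not returned by pollDue until six hours have elapsed, no matter how many other documents are processed in the meantime.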