[
https://issues.apache.org/jira/browse/CONNECTORS-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nguyen Huu Nhat updated CONNECTORS-1731:
----------------------------------------
Description:
Hi there,
I would like to suggest the following retry-related improvements:
h3. +*1. Connector name*+
Generic Repository Connector
h3. +*2. Reasons for improvement*+
While ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread
are running, the call to HttpClient.execute can fail because the connection
is interrupted or a connection error occurs (HTTP status code <> 200).
When the HTTP status code is <> 200, a ManifoldCFException is thrown.
However, nothing handles this exception, so the job is aborted.
Because connection-related errors (HTTP status code <> 200) can resolve
themselves without intervention, a retry should be added for such cases.
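The difference between aborting on the first failure and retrying can be illustrated with a minimal, self-contained sketch. It does not use any ManifoldCF classes: FatalException, checkStatus, runWithoutRetry and runWithRetry are hypothetical names chosen for illustration, and the list of status codes simulates a server that recovers on the third request.

```java
import java.util.List;

class TransientFailureSketch {
    /** Hypothetical stand-in for an exception that aborts the job when left unhandled. */
    static class FatalException extends Exception {
        FatalException(String message) {
            super(message);
        }
    }

    /** Mirrors the check described above: any status other than 200 raises an exception. */
    static void checkStatus(int status) throws FatalException {
        if (status != 200) {
            throw new FatalException("Unexpected HTTP status: " + status);
        }
    }

    /** No retry: the outcome of the first response decides the whole job. */
    static boolean runWithoutRetry(List<Integer> responses) {
        try {
            checkStatus(responses.get(0));
            return true;
        } catch (FatalException e) {
            return false; // job aborted
        }
    }

    /** With retry: returns the attempt number that succeeded, or -1 after giving up. */
    static int runWithRetry(List<Integer> responses, int maxAttempts) {
        int attempts = 0;
        for (int status : responses) {
            if (attempts >= maxAttempts) {
                break;
            }
            attempts++;
            try {
                checkStatus(status);
                return attempts;
            } catch (FatalException e) {
                // Treated as transient: fall through to the next attempt.
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Integer> responses = List.of(503, 503, 200); // simulated server responses
        System.out.println("without retry succeeded: " + runWithoutRetry(responses));
        System.out.println("with retry succeeded on attempt: " + runWithRetry(responses, 5));
    }
}
```

In the real connector the retry itself would be scheduled by the framework rather than by a loop like this one; the sketch only shows why a transient non-200 response need not end the job.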
h3. +*3. Improvements*+
The improvement consists of the following:
* Add a method that handles retries for ManifoldCFException
* Call this method to handle any ManifoldCFException thrown while executing
the threads ExecuteSeedingThread, DocumentVersionThread and
ExecuteProcessThread
h3. +*4. Suggested source code (based on release 2.22.1)*+
* Add a method that handles retries for ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
/**
 * Handles a ManifoldCFException caused by a connection error.
 * In case of a connection error, a ServiceInterruption is thrown so that
 * the operation is retried.
 *
 * @param e ManifoldCFException
 * @throws ServiceInterruption
 */
protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
  long currentTime = System.currentTimeMillis();
  throw new ServiceInterruption("Connection error: " + e.getMessage(), e,
      currentTime + 300000L,          // retry time: 5 minutes from now
      currentTime + 3 * 60 * 60000L,  // fail time: 3 hours from now
      -1, false);
}
{code}
* Call the method to handle any ManifoldCFException thrown while executing
the threads ExecuteSeedingThread, DocumentVersionThread and
ExecuteProcessThread
** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
  } catch (InterruptedException e) {
    t.interrupt();
    throw new ManifoldCFException("Interrupted: " + e.getMessage(), e,
        ManifoldCFException.INTERRUPTED);
+ } catch (ManifoldCFException e) {
+   handleManifoldCFException(e);
  }
  return new Long(seedTime).toString();
{code}
** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
  try {
    versions = versioningThread.finishUp();
  } catch (IOException ex) {
    handleIOException((IOException)ex);
+ } catch (ManifoldCFException ex) {
+   handleManifoldCFException(ex);
  } catch (InterruptedException ex) {
    throw new ManifoldCFException(ex.getMessage(), ex,
        ManifoldCFException.INTERRUPTED);
  }
{code}
** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
  } catch (InterruptedIOException e) {
    t.interrupt();
    throw new ManifoldCFException("Interrupted: " + e.getMessage(),
        e, ManifoldCFException.INTERRUPTED);
  } catch (IOException e) {
    handleIOException(e);
+ } catch (ManifoldCFException e) {
+   handleManifoldCFException(e);
  }
{code}
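The catch-and-convert pattern used at the three call sites, together with the retry window it sets up, can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the connector's code: RetryableInterruption is a hypothetical stand-in for ServiceInterruption that keeps only the two timestamps, and toInterruption, finishUp and the simulated "HTTP 503" failure are invented for the example. The delays are the same constants as in the suggested method.

```java
class RetryWindowSketch {
    static final long RETRY_DELAY_MS = 300000L;         // 5 minutes
    static final long FAIL_AFTER_MS = 3 * 60 * 60000L;  // 3 hours

    /** Hypothetical stand-in for ServiceInterruption: carries only the retry schedule. */
    static class RetryableInterruption extends Exception {
        final long retryTime; // earliest time of the next attempt
        final long failTime;  // time after which retrying stops

        RetryableInterruption(String message, Throwable cause, long retryTime, long failTime) {
            super(message, cause);
            this.retryTime = retryTime;
            this.failTime = failTime;
        }
    }

    /** Converts a failure into a retryable interruption relative to the given time. */
    static RetryableInterruption toInterruption(Exception cause, long now) {
        return new RetryableInterruption("Connection error: " + cause.getMessage(), cause,
                now + RETRY_DELAY_MS, now + FAIL_AFTER_MS);
    }

    /** Mimics a call site: the failure is caught and rethrown as a retryable interruption. */
    static void finishUp(boolean fail, long now) throws RetryableInterruption {
        try {
            if (fail) {
                throw new IllegalStateException("HTTP 503"); // simulated connection error
            }
        } catch (IllegalStateException e) {
            throw toInterruption(e, now);
        }
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        try {
            finishUp(true, now);
        } catch (RetryableInterruption r) {
            System.out.println(r.getMessage());
            System.out.println("retry in minutes: " + (r.retryTime - now) / 60000L);
            System.out.println("give up in minutes: " + (r.failTime - now) / 60000L);
        }
    }
}
```

Keeping the conversion in one helper means every call site schedules the same retry window, which is the main reason for adding a single handler method instead of repeating the logic in each catch block.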
was:
Hi there,
I would like to suggest the following retry-related improvements:
h3. +*1. Connector name*+
Generic Repository Connector
h3. +*2. Reasons for improvement*+
While ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread
are running, the call to HttpClient.execute can fail because the connection
is interrupted or a connection error occurs (HTTP status code <> 200).
When the HTTP status code is <> 200, a ManifoldCFException is thrown.
However, nothing handles this exception, so the job is aborted.
Because connection-related errors (HTTP status code <> 200) can resolve
themselves without intervention, a retry should be added for such cases.
h3. +*3. Improvements*+
The improvement consists of the following:
* Add a method that handles retries for ManifoldCFException
* Call this method to handle any ManifoldCFException thrown while executing
the threads ExecuteSeedingThread, DocumentVersionThread and
ExecuteProcessThread
h3. +*4. Suggested source code (based on release 2.22.1)*+
* Add a method that handles retries for ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
/**
 * Function for handling ManifoldCFException exception caused by connection
 * error.
 * In case of connection error, ServiceInterruption exception is thrown to
 * perform retry.
 *
 * @param e ManifoldCFException
 * @throws ServiceInterruption
 */
protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
  long currentTime = System.currentTimeMillis();
  throw new ServiceInterruption("Connection error: " + e.getMessage(), e,
      currentTime + 300000L,
      currentTime + timeToFail * 3 * 60 * 60000L, -1, false);
}
{code}
* Call the method to handle any ManifoldCFException thrown while executing
the threads ExecuteSeedingThread, DocumentVersionThread and
ExecuteProcessThread
** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
  } catch (InterruptedException e) {
    t.interrupt();
    throw new ManifoldCFException("Interrupted: " + e.getMessage(), e,
        ManifoldCFException.INTERRUPTED);
+ } catch (ManifoldCFException e) {
+   handleManifoldCFException(e);
  }
  return new Long(seedTime).toString();
{code}
** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
  try {
    versions = versioningThread.finishUp();
  } catch (IOException ex) {
    handleIOException((IOException)ex);
+ } catch (ManifoldCFException ex) {
+   handleManifoldCFException(ex);
  } catch (InterruptedException ex) {
    throw new ManifoldCFException(ex.getMessage(), ex,
        ManifoldCFException.INTERRUPTED);
  }
{code}
** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
  } catch (InterruptedIOException e) {
    t.interrupt();
    throw new ManifoldCFException("Interrupted: " + e.getMessage(),
        e, ManifoldCFException.INTERRUPTED);
  } catch (IOException e) {
    handleIOException(e);
+ } catch (ManifoldCFException e) {
+   handleManifoldCFException(e);
  }
{code}
> Suggestion for adding handling process for ManifoldCFException in Generic
> Repository Connector
> ----------------------------------------------------------------------------------------------
>
> Key: CONNECTORS-1731
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1731
> Project: ManifoldCF
> Issue Type: Improvement
> Reporter: Nguyen Huu Nhat
> Assignee: Karl Wright
> Priority: Major
> Attachments: patch.txt, patch_update.txt
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)