[ https://issues.apache.org/jira/browse/CONNECTORS-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nguyen Huu Nhat updated CONNECTORS-1731:
----------------------------------------
    Description:
Hi there,

I would like to suggest the following retry-related improvements:

h3. +*1. Connector name*+
Generic Repository Connector

h3. +*2. Reasons for improvement*+
While ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread are running, the connection can be interrupted when HttpClient.execute is called, resulting in a connection error (HTTP status code <> 200).
When the HTTP status code is not 200, a ManifoldCFException is thrown. However, nothing handles this exception, so the job is aborted.
Because connection-related errors (HTTP status code <> 200) can be resolved automatically, a retry should be added for such cases.

h3. +*3. Improvements*+
The improvement consists of the following:
* Adding a method to handle retry for exceptions of type ManifoldCFException
* Calling this method to handle the ManifoldCFException thrown while executing the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread

h3. +*4. Suggested source code (based on release 2.22.1)*+
* Adding a method to handle retry for exceptions of type ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
  /**
   * Function for handling ManifoldCFException exception caused by connection error.
   * In case of connection error, ServiceInterruption exception is thrown to perform retry.
   *
   * @param e ManifoldCFException
   * @throws ServiceInterruption
   */
  protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
    long currentTime = System.currentTimeMillis();
    throw new ServiceInterruption("Connection error: " + e.getMessage(), e, currentTime + 300000L,
      currentTime + 3 * 60 * 60000L, -1, false);
  }
{code}
* Calling this method to handle the ManifoldCFException thrown while executing the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread
** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
    } catch (InterruptedException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e, ManifoldCFException.INTERRUPTED);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
    return new Long(seedTime).toString();
{code}
** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
    try {
      versions = versioningThread.finishUp();
    } catch (IOException ex) {
      handleIOException((IOException)ex);
+   } catch (ManifoldCFException ex) {
+     handleManifoldCFException(ex);
    } catch (InterruptedException ex) {
      throw new ManifoldCFException(ex.getMessage(), ex, ManifoldCFException.INTERRUPTED);
    }
{code}
** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
    } catch (InterruptedIOException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e, ManifoldCFException.INTERRUPTED);
    } catch (IOException e) {
      handleIOException(e);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
{code}

  was:
Hi there,

I would like to suggest the following retry-related improvements:

h3. +*1. Connector name*+
Generic Repository Connector

h3. +*2. Reasons for improvement*+
While ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread are running, the connection can be interrupted when HttpClient.execute is called, resulting in a connection error (HTTP status code <> 200).
When the HTTP status code is not 200, a ManifoldCFException is thrown. However, nothing handles this exception, so the job is aborted.
Because connection-related errors (HTTP status code <> 200) can be resolved automatically, a retry should be added for such cases.

h3. +*3. Improvements*+
The improvement consists of the following:
* Adding a method to handle retry for exceptions of type ManifoldCFException
* Calling this method to handle the ManifoldCFException thrown while executing the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread

h3. +*4. Suggested source code (based on release 2.22.1)*+
* Adding a method to handle retry for exceptions of type ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
  /**
   * Function for handling ManifoldCFException exception caused by connection error.
   * In case of connection error, ServiceInterruption exception is thrown to perform retry.
   *
   * @param e ManifoldCFException
   * @throws ServiceInterruption
   */
  protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
    long currentTime = System.currentTimeMillis();
    throw new ServiceInterruption("Connection error: " + e.getMessage(), e, currentTime + 300000L,
      currentTime + timeToFail * 3 * 60 * 60000L, -1, false);
  }
{code}
* Calling this method to handle the ManifoldCFException thrown while executing the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread
** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
    } catch (InterruptedException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e, ManifoldCFException.INTERRUPTED);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
    return new Long(seedTime).toString();
{code}
** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
    try {
      versions = versioningThread.finishUp();
    } catch (IOException ex) {
      handleIOException((IOException)ex);
+   } catch (ManifoldCFException ex) {
+     handleManifoldCFException(ex);
    } catch (InterruptedException ex) {
      throw new ManifoldCFException(ex.getMessage(), ex, ManifoldCFException.INTERRUPTED);
    }
{code}
** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
    } catch (InterruptedIOException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e, ManifoldCFException.INTERRUPTED);
    } catch (IOException e) {
      handleIOException(e);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
{code}


> Suggestion for adding handling process for ManifoldCFException in Generic Repository Connector
> ----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1731
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1731
>             Project: ManifoldCF
>          Issue Type: Improvement
>            Reporter: Nguyen Huu Nhat
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: patch.txt, patch_update.txt
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)