[ 
https://issues.apache.org/jira/browse/CONNECTORS-1731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nguyen Huu Nhat updated CONNECTORS-1731:
----------------------------------------
    Description: 
Hi there,

I would like to suggest the following retry-related improvements:
h3. +*1. Connector name*+

Generic Repository Connector
h3. +*2. Reasons for improvement*+

In ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread, the 
call to HttpClient.execute can fail when the connection is interrupted, 
resulting in a connection error (HTTP status code <> 200).
When the HTTP status code is <> 200, a ManifoldCFException is thrown. However, 
nothing handles this exception, so the job is aborted.

Because connection-related errors (HTTP status code <> 200) can resolve on 
their own, a retry should be added for these cases.
h3. +*3. Improvements*+

The improvement consists of the following:
 * Add a method that handles retry for ManifoldCFException
 * Call that method when a ManifoldCFException is thrown in the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread

h3. +*4. Suggested source code (based on release 2.22.1)*+
 * Add a method that handles retry for ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
  /**
   * Handles a ManifoldCFException caused by a connection error.
   * Throws a ServiceInterruption so that the operation is retried.
   *
   * @param e the ManifoldCFException to convert
   * @throws ServiceInterruption always, to trigger the retry
   */
  protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
    long currentTime = System.currentTimeMillis();
    // First retry 5 minutes from now; stop retrying 3 hours from now.
    throw new ServiceInterruption("Connection error: " + e.getMessage(), e,
      currentTime + 300000L,
      currentTime + 3 * 60 * 60000L, -1, false);
  }
{code}
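To make the retry timing easier to check, here is a minimal, self-contained sketch of the arithmetic used in the method above. It assumes the two time arguments passed to ServiceInterruption are the next retry time and the time after which retrying stops. The class name RetryWindowSketch and its helper methods are hypothetical and are not part of ManifoldCF.

```java
public class RetryWindowSketch {
  // Values copied from the suggested handler (assumed meanings):
  // wait 5 minutes before the first retry, give up 3 hours after the error.
  static final long RETRY_DELAY_MS = 300000L;
  static final long FAIL_WINDOW_MS = 3 * 60 * 60000L;

  // Absolute time (epoch millis) at which the next attempt may run.
  static long retryTime(long now) {
    return now + RETRY_DELAY_MS;
  }

  // Absolute time (epoch millis) after which no further retry is attempted.
  static long failTime(long now) {
    return now + FAIL_WINDOW_MS;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    System.out.println("retry delay (minutes): " + (retryTime(now) - now) / 60000L);
    System.out.println("fail window (hours): " + (failTime(now) - now) / 3600000L);
  }
}
```

With these constants the delay comes out as 5 minutes and the window as 3 hours; both are candidates for tuning per repository.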

 * Call that method when a ManifoldCFException is thrown in the threads ExecuteSeedingThread, DocumentVersionThread and ExecuteProcessThread
 ** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
    } catch (InterruptedException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e,
        ManifoldCFException.INTERRUPTED);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
    return new Long(seedTime).toString();
{code}

 ** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
      try {
        versions = versioningThread.finishUp();
      } catch (IOException ex) {
        handleIOException((IOException)ex);
+     } catch (ManifoldCFException ex) {
+       handleManifoldCFException(ex);
      } catch (InterruptedException ex) {
        throw new ManifoldCFException(ex.getMessage(), ex,
          ManifoldCFException.INTERRUPTED);
      }
{code}

 ** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
            } catch (InterruptedIOException e) {
              t.interrupt();
              throw new ManifoldCFException("Interrupted: " + e.getMessage(),
                e, ManifoldCFException.INTERRUPTED);
            } catch (IOException e) {
              handleIOException(e);
+           } catch (ManifoldCFException e) {
+             handleManifoldCFException(e);
            }
{code}
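The catch-and-convert pattern used in the three threads can be tried in isolation with the sketch below. It is only an illustration: FatalCrawlException and RetryLater are hypothetical stand-ins for ManifoldCFException and ServiceInterruption, and fetch() simulates a response whose status code is not 200 instead of calling HttpClient.

```java
public class RetryConversionSketch {
  /** Hypothetical stand-in for ManifoldCFException. */
  static class FatalCrawlException extends Exception {
    FatalCrawlException(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for ServiceInterruption. */
  static class RetryLater extends Exception {
    final long retryTime;
    final long failTime;

    RetryLater(String message, Throwable cause, long retryTime, long failTime) {
      super(message, cause);
      this.retryTime = retryTime;
      this.failTime = failTime;
    }
  }

  // Simulates the HTTP call: any status other than 200 is reported as an error.
  static void fetch(int status) throws FatalCrawlException {
    if (status != 200) {
      throw new FatalCrawlException("Server error: " + status);
    }
  }

  // Mirrors the suggested handler: always converts the error into a retry signal.
  static void handle(FatalCrawlException e, long now) throws RetryLater {
    throw new RetryLater("Connection error: " + e.getMessage(), e,
        now + 300000L, now + 3 * 60 * 60000L);
  }

  // Mirrors the added catch clause: the error no longer escapes as a fatal one.
  static String process(int status, long now) throws RetryLater {
    try {
      fetch(status);
    } catch (FatalCrawlException e) {
      handle(e, now);
    }
    return "ok";
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    try {
      System.out.println(process(200, now));
      process(503, now);
    } catch (RetryLater e) {
      System.out.println(e.getMessage() + ", retry in "
          + (e.retryTime - now) / 60000L + " minutes");
    }
  }
}
```

Note that this sketch converts every error into a retry. Whether some ManifoldCFException cases should still abort the job immediately is left open here.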

  was:
Hi there,

I would like to suggest the following retry-related improvements:

h3. +*1. Connector name*+

Generic Repository Connector

h3. +*2. Reasons for improvement*+

In the process of ExecuteSeedingThread, DocumentVersionThread and 
ExecuteProcessThread, when HttpClient.execute is executed, there are chances 
that connection gets interrupted and connection error occurs (HTTP status code 
<> 200).
When HTTP status code <> 200, exception of type ManifoldCFException will be 
thrown. However, there is no process to handle this, which leads to job being 
aborted.

As errors relating to connection (HTTP status code <> 200) can be resolved 
automatically, retry should be added for cases like this.

h3. +*3. Improvements*+

Improvement includes the followings:
 * Adding method to handle retry for exception of type ManifoldCFException
 * Calling method to handle  ManifoldCFException exception generated when 
executing in threads: ExecuteSeedingThread, DocumentVersionThread, 
ExecuteProcessThread

h3. +*4. Suggested source code (based on release 2.22.1)*+

 * Adding method to handle retry for exception of type ManifoldCFException
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L1026]
{code:java}
  /**
   * Function for handling ManifoldCFException exception caused by connection 
error.
   * In case of connection error, ServiceInterruption exception is thrown to 
perform retry.
   * 
   * @param e ManifoldCFException
   * @throws ServiceInterruption
   */
  protected static void handleManifoldCFException(ManifoldCFException e)
    throws ServiceInterruption {
    long currentTime = System.currentTimeMillis();
    throw new ServiceInterruption("Connection error: " + e.getMessage(), e, 
currentTime + 300000L,
      currentTime + timeToFail * 3 * 60 * 60000L, -1, false);
  }
{code}

 * Calling method to handle  ManifoldCFException exception generated when 
executing in threads: ExecuteSeedingThread, DocumentVersionThread, 
ExecuteProcessThread
 ** ExecuteSeedingThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L256]
{code:java}
    } catch (InterruptedException e) {
      t.interrupt();
      throw new ManifoldCFException("Interrupted: " + e.getMessage(), e,
        ManifoldCFException.INTERRUPTED);
+   } catch (ManifoldCFException e) {
+     handleManifoldCFException(e);
    }
    return new Long(seedTime).toString();
{code}
 ** DocumentVersionThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L304]
{code:java}
      try {
        versions = versioningThread.finishUp();
      } catch (IOException ex) {
        handleIOException((IOException)ex);
+     } catch (ManifoldCFException ex) {
+       handleManifoldCFException(ex);
      } catch (InterruptedException ex) {
        throw new ManifoldCFException(ex.getMessage(), ex, 
ManifoldCFException.INTERRUPTED);
      }
{code}
 ** ExecuteProcessThread
[https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/generic/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/generic/GenericConnector.java#L445]
{code:java}
            } catch (InterruptedIOException e) {
              t.interrupt();
              throw new ManifoldCFException("Interrupted: " + e.getMessage(), 
e, ManifoldCFException.INTERRUPTED);
            } catch (IOException e) {
              handleIOException(e);
+           } catch (ManifoldCFException e) {
+             handleManifoldCFException(e);
            }
{code}


> Suggestion for adding handling process for ManifoldCFException in Generic 
> Repository Connector
> ----------------------------------------------------------------------------------------------
>
>                 Key: CONNECTORS-1731
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1731
>             Project: ManifoldCF
>          Issue Type: Improvement
>            Reporter: Nguyen Huu Nhat
>            Assignee: Karl Wright
>            Priority: Major
>         Attachments: patch.txt, patch_update.txt
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
