Nguyen Huu Nhat created CONNECTORS-1730: -------------------------------------------
Summary: Improvement suggestion for retry function in SharedDriveConnector Key: CONNECTORS-1730 URL: https://issues.apache.org/jira/browse/CONNECTORS-1730 Project: ManifoldCF Issue Type: Improvement Reporter: Nguyen Huu Nhat Hi there, I would like to suggest the following retry-related improvements: h3. +*1. Connector name*+ SharedDriveConnector h3. +*2. Preface*+ * When connection to SMB can't be executed, JCIFS connector will fail to connect and exception will occur. JCIFS will attempt to retry, and abort after a certain number of time. * The number of retry is currently controlled by the folloing parameter: ** *retriesRemaining* (hardcode:3): The number of occurence of the same exception for a file or method. If a different exception occurs, this value is reset to 3. ** *totalTries* (hardcode:5): The total number of occurrence of an exception for a file or method. For the two variables above, if *retriesRemaining* becomes 0 or *totalTries* becomes 5 then the job will be aborted. h3. +*3. Reasons for improvement*+ Currently the maximum number of retry is being hardcoded at 3 and 5, respectively. In case connection to file server is unstable, to avoid aborting, I would like to suggest making these values customizable. For implementation, I would like to suggest the following methods: * 1/ Setting retry values in *properties.xml* * 2/ Setting retry values on WebUI of repository connection Between the two methods above, I suggest the first method because of following reasons: * The first method is easier to implement * Although the second method is more user-friendly, there are several issues: ** The config data from the screen will have to be stored in the database (PostgreSQL), resulting in an increased number of fields. ** Consequently, there might be a need to perform DB Migration in case further changes to setting field are needed. ※ According to the above reasons, I will proceed with the first method 'Setting retry values in properties.xml' for the next part of this suggestion. h3. +*4. Improvement*+ Changing source code to read maximum number of retries from *properties.xml* Declare two variables in *properties.xml* to set the maximum number of retry: * `org.apache.manifoldcf.crawler.connectors.sharedrive.consecutivesmbexceptionretrycount` ⇒ Set to `consecutiveSMBExceptionRetryCount` * `org.apache.manifoldcf.crawler.connectors.sharedrive.totalsmbretrycount` ⇒ Set to `totalSMBRetryCount` E.g: {code:xml} <property name="org.apache.manifoldcf.crawler.connectors.sharedrive.consecutivesmbexceptionretrycount" value="3"/> <property name="org.apache.manifoldcf.crawler.connectors.sharedrive.totalsmbretrycount" value="5"/> {code} SharedDriveConnector will load these values from the file and set to two variables within the source code. ※In case these values can't be found from the file or set to an invalid value, default values will be used instead. h3. +*5. Suggested source code (based on release 2.22.1)*+ Target class: org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector [https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/jcifs/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/sharedrive/SharedDriveConnector.java#L103] * Declare two class variables to store the configured values as follows: {code:java} private final static int consecutiveSMBExceptionRetryCount; private final static int totalSMBRetryCount; {code} * Initialize the two variables above with following steps: ** Set the values configured in 'properties.xml' to the two variables above ** If these values weren't configured or invalid, set them to default values of 3 and 5, respectively. [https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/jcifs/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/sharedrive/SharedDriveConnector.java#L106] {code:java} // Static initialization of various system properties. This hopefully takes place // before jcifs is loaded. static { ... int tempConsecutiveSMBExceptionRetryCount = 3; int tempTotalSMBRetryCount = 5; try { tempConsecutiveSMBExceptionRetryCount = ManifoldCF.getIntProperty("org.apache.manifoldcf.crawler.connectors.sharedrive.consecutivesmbexceptionretrycount", tempConsecutiveSMBExceptionRetryCount); } catch (ManifoldCFException e) { Logging.connectors.warn("Invalid property value for " + "org.apache.manifoldcf.crawler.connectors.sharedrive.consecutivesmbexceptionretrycount, must be integer. Setting to default: " + Integer.toString(tempConsecutiveSMBExceptionRetryCount)); } consecutiveSMBExceptionRetryCount = tempConsecutiveSMBExceptionRetryCount; try { tempTotalSMBRetryCount = ManifoldCF.getIntProperty("org.apache.manifoldcf.crawler.connectors.sharedrive.totalsmbretrycount", tempTotalSMBRetryCount); } catch (ManifoldCFException e) { Logging.connectors.warn("Invalid property value for " + "org.apache.manifoldcf.crawler.connectors.sharedrive.totalsmbretrycount, must be integer. Setting to default: " + Integer.toString(tempTotalSMBRetryCount)); } totalSMBRetryCount = tempTotalSMBRetryCount; } {code} * In the following methods, implementation to use [consecutiveSMBExceptionRetryCount], [totalSMBRetryCount] initialized above instead of hardcoded values is needed: ** fileExists(SmbFile file) ** fileIsDirectory(SmbFile file) ** fileLastModified(SmbFile file) ** fileLength(SmbFile file) ** fileListFiles(SmbFile file, SmbFileFilter filter) ** getFileInputStream(SmbFile file) ** getFileSecurity(SmbFile file, boolean useSIDs) ** getFileShareSecurity(SmbFile file, boolean useSIDs) ** getFileType(SmbFile file) * Change to *fileExists()* example [https://github.com/apache/manifoldcf/blob/release-2.22.1/connectors/jcifs/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/sharedrive/SharedDriveConnector.java#L2176] {code:java} protected static boolean fileExists(SmbFile file) throws SmbException { int totalTries = 0; - int retriesRemaining = 3; + int retriesRemaining = consecutiveSMBExceptionRetryCount; SmbException currentException = null; - while (retriesRemaining > 0 && totalTries < 5) + while (retriesRemaining > 0 && totalTries < totalSMBRetryCount) { retriesRemaining--; totalTries++; {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)