[ https://issues.apache.org/jira/browse/CONNECTORS-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17349815#comment-17349815 ]
Karl Wright commented on CONNECTORS-1668:
-----------------------------------------

The logic for path rules is as follows:

{code}
if (sn.getType().equals("pathrule")) {
  // New-style rule.
  // Here's the trick: We do what the first matching rule tells us to do.
  String pathMatch = sn.getAttributeValue("match");
  String action = sn.getAttributeValue("action");
  String ruleType = sn.getAttributeValue("type");
  // First, find out if we match EXACTLY.
  if (checkMatch(libraryPath,0,pathMatch)) {
    // If this is true, the type also has to match if the rule is to apply.
    if (ruleType.equals("library")) {
      if (Logging.connectors.isDebugEnabled())
        Logging.connectors.debug("SharePoint: Library '"+libraryPath+"' exactly matched rule path '"+pathMatch+"'");
      if (action.equals("include")) {
        // For include rules, partial match is good enough to proceed.
        if (Logging.connectors.isDebugEnabled())
          Logging.connectors.debug("SharePoint: Including library '"+libraryPath+"'");
        return true;
      }
      if (Logging.connectors.isDebugEnabled())
        Logging.connectors.debug("SharePoint: Excluding library '"+libraryPath+"'");
      return false;
    }
  } else if (ruleType.equals("file") && checkPartialPathMatch(libraryPath,0,pathMatch,1) && action.equals("include")) {
    if (Logging.connectors.isDebugEnabled())
      Logging.connectors.debug("SharePoint: Library '"+libraryPath+"' partially matched file rule path '"+pathMatch+"' - including");
    return true;
  } else if (ruleType.equals("folder") && checkPartialPathMatch(libraryPath,0,pathMatch,1) && action.equals("include")) {
    if (Logging.connectors.isDebugEnabled())
      Logging.connectors.debug("SharePoint: Library '"+libraryPath+"' partially matched folder rule path '"+pathMatch+"' - including");
    return true;
  }
}
}
{code}

I need to see the rule type; as you can see, to include a library, you need a library rule, and to include a site, you need a site rule.

The checkMatch() method does this:

{code}
/** Recursive worker method for checkMatch.  Returns 'true' if there is a path that consumes both
* strings in their entirety in a matched way.
*@param caseSensitive is true if file names are case sensitive.
*@param sourceMatch is the source string (w/o wildcards)
*@param match is the match string (w/wildcards)
*@return true if there is a match.
*/
protected static boolean checkMatch(boolean caseSensitive, String sourceMatch, String match)
{code}

The partial path match method looks like this:

{code}
protected static boolean checkPartialPathMatch( String sourceMatch, int sourceIndex, String match, int requiredExtraPathSections ) {
  // The partial match must be of a complete path, with at least a specified number of trailing path components possible in what remains.
  // Path components can include everything but the "/" character itself.
  //
  // The match string is the one containing the wildcards.  Both the "*" wildcard and the "?" wildcard will match a "/", which is intended but is why this
  // matcher is a little tricky to write.
  //
  // Note also that it is OK to return "true" more than strictly necessary, but it is never OK to return "false" incorrectly.
  // This is a partial path match.  That means that we don't have to completely use up the match string, but what's left on the match string after the source
  // string is used up MUST either be capable of being null, or be capable of starting with a "/"integral path sections, and MUST include at least n of these sections.
  //
{code}

If you look at the code, you will note there's quite a bit of debug logging around path matching. The basic point though is that the entire match string must be consumed for the full match, meaning that anything that is not a wildcard MUST match, and for a partial match there must be at least N sections left over after the match is entirely consumed.

To summarize:
(1) You need a Site rule to include a site.
(2) You need a Library rule to include a library.

> Use of Wild Characters in SharePoint Connector.
> -----------------------------------------------
>
> Key: CONNECTORS-1668
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1668
> Project: ManifoldCF
> Issue Type: Bug
> Components: SharePoint connector
> Affects Versions: ManifoldCF 2.16
> Reporter: Shashank Dwivedi
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.16
>
> Attachments: image-2021-05-23-00-36-45-378.png
>
> Original Estimate: 48h
> Remaining Estimate: 48h
>
> Hi,
> My SharePoint site is of the following *Format* :
> -*Projects(root)*
>   -*Project 1*
>     -Project Library
>       -Folder 1
>       -Folder 2 ... Folder N
>   -*Project 2 ... Project N*
>     -Project Library
>       -Folder 1 .. Folder N
> We have the *Projects(root site)* in this fashion from Project 1 to *Project
> N(20000)*, where N is a *large number.* I wish to process all files present
> inside the *Project Library folder* of all the projects.
> So, as a Path rule I am currently supplying "*Projects/**/*Project Library/*
> *". There is no space between / and * in the last.
> However, this is *not working out*. It is also pulling documents inside
> *Folder 1, Folder2,..Folder N.* I want it to Process files only inside
> Project Library.
> Please suggest me the right way to accomplish this Task.
> I could not identify any suggestion regarding the same in the End user
> Documentation.
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
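
For illustration only, here is a minimal sketch of the full-match behaviour described in the comment above: the whole match string must be consumed, and both "*" and "?" are allowed to match a "/". This is not the ManifoldCF implementation. The class name WildcardSketch, the method fullMatch, and the sample paths and rule are all hypothetical and chosen just for this example.

```java
// Hypothetical sketch; NOT the ManifoldCF code. Names and paths are illustrative assumptions.
final class WildcardSketch {
    private WildcardSketch() {}

    /** Returns true if the entire source string is consumed by the entire pattern. */
    static boolean fullMatch(String source, String pattern) {
        return fullMatch(source, 0, pattern, 0);
    }

    private static boolean fullMatch(String source, int si, String pattern, int pi) {
        if (pi == pattern.length()) {
            // Pattern used up: this is a match only if the source is used up too.
            return si == source.length();
        }
        char p = pattern.charAt(pi);
        if (p == '*') {
            // '*' may consume zero or more characters, including '/'.
            for (int next = si; next <= source.length(); next++) {
                if (fullMatch(source, next, pattern, pi + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (si == source.length()) {
            // Source used up but a non-'*' pattern character remains.
            return false;
        }
        // '?' consumes exactly one character, including '/'; anything else must match literally.
        if (p == '?' || p == source.charAt(si)) {
            return fullMatch(source, si + 1, pattern, pi + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        String rule = "/Projects/*/Project Library/*"; // hypothetical rule
        // File directly in the library.
        System.out.println(fullMatch("/Projects/Project 1/Project Library/a.docx", rule));
        // File in a subfolder: the trailing '*' also consumes "Folder 1/".
        System.out.println(fullMatch("/Projects/Project 1/Project Library/Folder 1/a.docx", rule));
        // Literal part of the rule does not match.
        System.out.println(fullMatch("/Projects/Project 1/Other Library/a.docx", rule));
    }
}
```

Under these assumptions, the first two calls in main return true and the third returns false. Because "*" can cross "/" boundaries, a trailing "*" in such a rule matches files in nested folders as well as files directly in the library.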