[jira] [Commented] (CONNECTORS-1549) Include and exclude rules order lost
[ https://issues.apache.org/jira/browse/CONNECTORS-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16656073#comment-16656073 ]

Karl Wright commented on CONNECTORS-1549:
-----------------------------------------

I found the issue and have attached a patch. Thanks!

> Include and exclude rules order lost
> ------------------------------------
>
>                 Key: CONNECTORS-1549
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1549
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: API, JCIFS connector
>   Affects Versions: ManifoldCF 2.11
>            Reporter: Julien Massiera
>            Assignee: Karl Wright
>            Priority: Critical
>         Attachments: image-2018-10-18-18-28-14-547.png,
> image-2018-10-18-18-33-01-577.png, image-2018-10-18-18-34-01-542.png
>
>
> The include and exclude rules that can be defined in the job configuration
> for the JCIFS connector can be combined, and the order in which they are
> defined matters.
> The problem is that when one retrieves the job configuration as a JSON object
> through the API, the include and exclude rules are split into two different
> arrays (one for each type of rule) instead of a single one. So, the order is
> completely lost when one tries to recreate the job through the API from that
> JSON object.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (CONNECTORS-1549) Include and exclude rules order lost
[ https://issues.apache.org/jira/browse/CONNECTORS-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655986#comment-16655986 ]

Karl Wright commented on CONNECTORS-1549:
-----------------------------------------

Hi [~julienFL]

Sorry for the delay.

First, note that you can always use the order-preserving form when you import, even if MCF outputs the JSON in the other "sugary" form. So this should unblock you.

Second, I'm looking at the code that generates the output in Configuration.java:

{code}
// The new JSON parser uses hash order for object keys. So it isn't good enough to just detect that there's an
// intermingling. Instead we need to detect the existence of more than one key; that implies that we need to do
// order preservation.
String lastChildType = null;
boolean needAlternate = false;
int i = 0;
while (i < getChildCount()) {
  ConfigurationNode child = findChild(i++);
  String key = child.getType();
  List list = childMap.get(key);
  if (list == null) {
    // We found no existing list, so create one
    list = new ArrayList();
    childMap.put(key,list);
    childList.add(key);
  }
  // Key order comes into play when we have elements of different types within the same child.
  if (lastChildType != null && !lastChildType.equals(key)) {
    needAlternate = true;
    break;
  }
  list.add(child);
  lastChildType = key;
}

if (needAlternate) {
  // Can't use the array representation. We'll need to use a _children_ object, and enumerate
  // each child. So, the JSON will look like:
  // :{_attribute_:xxx,_children_:[{_type_:, ...},{_type_:, ...}, ...]}
...
{code}

The (needAlternate) clause is the one that writes the specification in the verbose form. The logic looks as though it should set "needAlternate" whenever a child with a different key follows another child at the same level. I'll stare at it some more, but right now I'm having trouble seeing how this fails.
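A minimal standalone sketch of the check, as it reads in the excerpt above, may make the expected behaviour concrete. The class and method names (NeedAlternateSketch, needsOrderPreservingForm) are hypothetical and are not part of ManifoldCF. The sketch only mirrors the comparison of consecutive child types; it does not reproduce the actual defect or the attached patch.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not ManifoldCF code.
final class NeedAlternateSketch {

    // Returns true as soon as two consecutive children have different types,
    // which is the condition under which the quoted loop sets needAlternate.
    static boolean needsOrderPreservingForm(List<String> childTypes) {
        String lastChildType = null;
        for (String type : childTypes) {
            if (lastChildType != null && !lastChildType.equals(type)) {
                return true;
            }
            lastChildType = type;
        }
        return false;
    }

    public static void main(String[] args) {
        // Two exclude rules followed by an include rule: mixed types.
        System.out.println(needsOrderPreservingForm(
            Arrays.asList("exclude", "exclude", "include"))); // prints "true"
        // Only one type of child: the array form loses nothing.
        System.out.println(needsOrderPreservingForm(
            Arrays.asList("include", "include")));            // prints "false"
    }
}
```

Read this way, a rule sequence such as exclude, exclude, include would be expected to select the verbose form, which is why the output reported in this ticket is surprising.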
[jira] [Commented] (CONNECTORS-1549) Include and exclude rules order lost
[ https://issues.apache.org/jira/browse/CONNECTORS-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655562#comment-16655562 ]

Julien Massiera commented on CONNECTORS-1549:
---------------------------------------------

After some tests, I noticed that the problem only happens when all the exclude rules are defined BEFORE the include rules.

For example, here is my original job configuration:

!image-2018-10-18-18-34-01-542.png!

and here is the extract of the JSON generated by the API for this job:

{code:java}
"startpoint": {
  "include": {
    "_attribute_filespec": "*",
    "_value_": "",
    "_attribute_type": "file"
  },
  "_attribute_path": "ocr",
  "_value_": "",
  "exclude": [
    {
      "_attribute_filespec": "*.pst",
      "_value_": "",
      "_attribute_type": "file"
    },
    {
      "_attribute_filespec": "*",
      "_value_": "",
      "_attribute_type": "directory"
    }
  ]
}
{code}

When I re-create the job from the same JSON, here is the new job configuration:

!image-2018-10-18-18-33-01-577.png!

When I run my original job, the pst files are correctly filtered out, but when I run the job created from the generated JSON, they are not excluded.

However, if I create a job in which include and exclude rules are interleaved, the JSON generated by the API uses a different format for the filters:

{code:java}
"startpoint": {
  "_children_": [
    {
      "_type_": "include",
      "_attribute_filespec": "/subfolder1/",
      "_value_": "",
      "_attribute_type": "directory"
    },
    {
      "_type_": "exclude",
      "_attribute_filespec": "*",
      "_value_": "",
      "_attribute_type": "directory"
    },
    {
      "_type_": "include",
      "_attribute_filespec": "*",
      "_value_": "",
      "_attribute_type": "file"
    }
  ],
  "_attribute_path": "ocr",
  "_value_": ""
}
{code}

In that case, the order is respected and the job behaves as I expect.
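As a workaround sketch, the order-preserving "_children_" form shown above can be generated client-side from an ordered rule list before the job is submitted. The class, the Rule type, and the method names below are hypothetical. The field names ("_type_", "_attribute_filespec", "_attribute_type", "_attribute_path", "_value_") are taken from the JSON extracts in this comment. The string building and escaping are deliberately minimal and assume simple filespecs; a real client would more likely use a JSON library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the ManifoldCF API.
final class OrderedRulesSketch {

    static final class Rule {
        final String ruleType;   // "include" or "exclude"
        final String filespec;   // e.g. "*.pst"
        final String targetType; // "file" or "directory"

        Rule(String ruleType, String filespec, String targetType) {
            this.ruleType = ruleType;
            this.filespec = filespec;
            this.targetType = targetType;
        }
    }

    // Minimal escaping (backslash and double quote only); not a full JSON escaper.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the value of the "startpoint" key, keeping the rules in list order.
    static String toChildrenJson(String path, List<Rule> rules) {
        StringBuilder sb = new StringBuilder("{\"_children_\":[");
        for (int i = 0; i < rules.size(); i++) {
            Rule r = rules.get(i);
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"_type_\":\"").append(escape(r.ruleType)).append("\",")
              .append("\"_attribute_filespec\":\"").append(escape(r.filespec)).append("\",")
              .append("\"_value_\":\"\",")
              .append("\"_attribute_type\":\"").append(escape(r.targetType)).append("\"}");
        }
        sb.append("],\"_attribute_path\":\"").append(escape(path)).append("\",\"_value_\":\"\"}");
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Rule> rules = new ArrayList<>();
        rules.add(new Rule("exclude", "*.pst", "file"));
        rules.add(new Rule("exclude", "*", "directory"));
        rules.add(new Rule("include", "*", "file"));
        System.out.println(toChildrenJson("ocr", rules));
    }
}
```

Because every rule becomes one element of a single array, the exclude rules stay ahead of the include rule regardless of how object keys are ordered by the parser.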
[jira] [Commented] (CONNECTORS-1549) Include and exclude rules order lost
[ https://issues.apache.org/jira/browse/CONNECTORS-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16655223#comment-16655223 ]

Karl Wright commented on CONNECTORS-1549:
-----------------------------------------

Hi [~julienFL], there was a similar ticket a while back for the file system connector. Let me explain what the solution was and see if you still think there is a problem.

(1) The actual internal representation of a Document Specification is XML.
(2) For the API, we convert the XML to JSON and back.
(3) Because a complete and unambiguous conversion between these formats is quite ugly, we have multiple ways of doing the conversion, so that we allow "syntactic sugar" in the JSON for specific cases where the conversion can be done simply.
(4) A while back, there was a bug in the code that determined whether it was possible to use the specific kind of syntactic sugar that leads to two independent lists for the File System Connector's document specification. So for a while what was *output* when you exported the Job was incorrect, and order would be lost if you re-imported it.

The solution was to (a) fix the bug, and (b) get the person using the API to use the correct, unambiguous JSON format instead of the "sugary" format. This preserves order.

The way to see if this is what you are up against is to create a JCIFS job with a complex rule set that has both inclusions and exclusions, and export it. If the exported JSON looks different from what you are expecting, then try replicating that format when you import via the API.