[ https://issues.apache.org/jira/browse/CONNECTORS-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Karl Wright reassigned CONNECTORS-1009:
---------------------------------------

    Assignee: Karl Wright

> Cmis Repository Connector does not handle Document updating properly
> --------------------------------------------------------------------
>
>                 Key: CONNECTORS-1009
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1009
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: CMIS connector
>    Affects Versions: ManifoldCF 1.7
>            Reporter: Prasad Perera
>            Assignee: Karl Wright
>            Priority: Minor
>             Fix For: ManifoldCF 1.7
>
>         Attachments: std_logs.txt, std_prints.diff
>
>
> As part of the fix for CONNECTORS-1004, it seems that CmisRepositoryConnector
> does not handle document updates properly.
> Case scenario:
> * Create a continuous crawling job using CmisRepositoryConnector.
> * Update a document on the repository side.
> * The document keeps being submitted to the OutputConnector at each crawling
> interval, even though it has not been updated again.
> One possible fix is in CmisRepositoryConnector:processDocument:
> activities.ingestDocumentWithException(nodeId, version, documentURI, rd);
> The documentURI should point to the old document URI. (It now points to the
> latest documentURI discovered, which may confuse document references?)
> Also, in ECM systems such as Alfresco, the document IDs include the version
> number as well.
> Ex: workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.0 -->
> version 1.0
> workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.1 --> version
> 1.1
> When we set up a query to crawl a repository folder, we discover content by
> following the child nodes. Because of that, the connector now seems to queue
> all the document versions and submit them to the OutputConnector, producing
> duplicate documents on the output (search) side.
> Is there a way to avoid this problem? It would be great if the connector could
> take only the latest document version and submit it as an update.
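
To illustrate the "take only the latest version" idea in the description above, here is a minimal sketch. It is not the connector's code. It assumes that IDs follow the Alfresco-style pattern quoted in the issue, where a ";" separates a stable base ID from a dotted numeric version label. VersionedNodeIds and all of its methods are hypothetical names made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of ManifoldCF or the CMIS connector.
final class VersionedNodeIds {
  private VersionedNodeIds() {}

  // Part of the ID before the last ';', or the whole ID when there is no suffix.
  static String baseId(String nodeId) {
    int sep = nodeId.lastIndexOf(';');
    return sep < 0 ? nodeId : nodeId.substring(0, sep);
  }

  // Version label after the last ';', or "" when there is no suffix.
  static String versionLabel(String nodeId) {
    int sep = nodeId.lastIndexOf(';');
    return sep < 0 ? "" : nodeId.substring(sep + 1);
  }

  // Compares dotted numeric labels component by component, so "1.10" > "1.9".
  // Assumption: every component is a non-negative integer.
  static int compareLabels(String a, String b) {
    String[] x = a.isEmpty() ? new String[0] : a.split("\\.");
    String[] y = b.isEmpty() ? new String[0] : b.split("\\.");
    for (int i = 0; i < Math.max(x.length, y.length); i++) {
      int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
      int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
      if (xi != yi) {
        return Integer.compare(xi, yi);
      }
    }
    return 0;
  }

  // Maps each base ID to the full ID of its highest version, in first-seen order.
  static Map<String, String> latestPerDocument(List<String> nodeIds) {
    Map<String, String> latest = new LinkedHashMap<>();
    for (String id : nodeIds) {
      String base = baseId(id);
      String current = latest.get(base);
      if (current == null
          || compareLabels(versionLabel(id), versionLabel(current)) > 0) {
        latest.put(base, id);
      }
    }
    return latest;
  }
}
```

Passing the two example IDs from the issue to latestPerDocument would leave a single entry keyed by the base ID, pointing at the ";1.1" node. Whether the base ID is the right document identifier to hand to the output connector is a design question for the connector itself; the sketch only shows the grouping step.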
--
This message was sent by Atlassian JIRA
(v6.2#6252)