[ 
https://issues.apache.org/jira/browse/CONNECTORS-1009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097785#comment-14097785
 ] 

Karl Wright commented on CONNECTORS-1009:
-----------------------------------------

If it appears that the Alfresco CMIS implementation returns null for the 
version label, then really there's nothing I can do to the CMIS connector to 
change its behavior, unless we can identify another field that we can use as a 
surrogate for the version stamp in all cases where the version label doesn't 
exist.  If the implementation returns "" (empty string), then perhaps we can 
treat that case differently from null.
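
A minimal sketch of that idea, in plain Java with no CMIS client dependency. 
The class name VersionStamp, the prefixes, and the choice of a last-modified 
timestamp (for example from cmis:lastModificationDate) as the surrogate are 
all assumptions for illustration. This is not what the connector currently 
does.

```java
// Hypothetical helper: builds a version string from a possibly missing label.
final class VersionStamp {
    private VersionStamp() {}

    /**
     * @param versionLabel       value of the version label; may be null or ""
     * @param lastModifiedMillis assumed surrogate field; may be null
     * @return a version string, or "" when nothing usable is available
     */
    static String of(String versionLabel, Long lastModifiedMillis) {
        // Normal case: the repository supplied a real version label.
        if (versionLabel != null && !versionLabel.isEmpty()) {
            return "label:" + versionLabel;
        }
        // No label and no surrogate: the caller has to decide what to do.
        if (lastModifiedMillis == null) {
            return "";
        }
        // Keep null and "" distinguishable so they can be handled differently.
        String kind = (versionLabel == null) ? "null" : "empty";
        return kind + "+modified:" + lastModifiedMillis;
    }
}
```

With this, a document whose label is missing still gets a stamp that changes 
only when the surrogate field changes. What to do with the "" result is left 
open here.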

It does not appear to me, though, that the CMIS connector used with Alfresco 
will be able to do what you request for document identifiers vs. version 
identifiers.  Another connector that operates similarly is the FileNet 
connector, FWIW.  So unless we change the behavior of getDocumentVersions(), 
this ticket will just be closed.

We *really* need a re-implemented Alfresco connector, but that project has been 
problematic for a long time due to omissions and instability in Alfresco's 
native REST API.  There's an open ticket that we hoped to have done for 1.7, 
but clearly that's not possible now.
 

> Cmis Repository Connector does not handle Document updating properly
> --------------------------------------------------------------------
>
>                 Key: CONNECTORS-1009
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1009
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: CMIS connector
>            Reporter: Prasad Perera
>            Priority: Minor
>
> As part of the fix for CONNECTORS-1004, it seems CmisRepositoryConnector 
> does not handle document updates properly.
> Case scenario:
> * Create a continuous crawling job using CmisRepositoryConnector.
> * Update a document on the repository side.
> * The document keeps being submitted to the OutputConnector at each crawling 
> interval, even though it was not updated again afterwards.
> One possible fix is in CmisRepositoryConnector:processDocument:
>  activities.ingestDocumentWithException(nodeId, version, documentURI, rd);
> The documentURI should point to the old document URI. (It now points to the 
> latest documentURI discovered, which may confuse document references?)
> Also, in ECM systems such as Alfresco, the document IDs include the version 
> number as well.
> Ex: workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.0 --> version 1.0
> workspace://SpacesStore/8e12a887-3fa8-48d6-8516-5bcfad358ba2;1.1 --> version 1.1
> When we set up a query to crawl a repository folder, we discover content by 
> traversing the child nodes. Because of that, it now seems to queue all the 
> document versions and submit them to the OutputConnector, thus producing 
> duplicate documents on the output (search) side.
> Is there a way to avoid this problem? It would be great if the repository 
> could just take the latest document version and submit it as an update.
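
The "take only the latest version" request above can be sketched as follows, 
assuming object IDs of the form shown in the examples (a base node ID, then 
';', then a dotted numeric version). The class AlfrescoNodeIds and its methods 
are hypothetical names for illustration, not part of the CMIS connector, and 
non-numeric version suffixes are not handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for IDs such as "workspace://SpacesStore/<uuid>;1.1".
final class AlfrescoNodeIds {
    private AlfrescoNodeIds() {}

    /** Returns the ID without the ";version" suffix (unchanged if none). */
    static String baseId(String objectId) {
        int semi = objectId.lastIndexOf(';');
        return semi < 0 ? objectId : objectId.substring(0, semi);
    }

    /** Returns the version suffix, or "" if the ID carries none. */
    static String versionOf(String objectId) {
        int semi = objectId.lastIndexOf(';');
        return semi < 0 ? "" : objectId.substring(semi + 1);
    }

    /** Compares dotted numeric versions component by component. */
    static int compareVersions(String a, String b) {
        String[] pa = a.isEmpty() ? new String[0] : a.split("\\.");
        String[] pb = b.isEmpty() ? new String[0] : b.split("\\.");
        for (int i = 0; i < Math.max(pa.length, pb.length); i++) {
            // Missing components count as 0, so "1" equals "1.0".
            int x = i < pa.length ? Integer.parseInt(pa[i]) : 0;
            int y = i < pb.length ? Integer.parseInt(pb[i]) : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    /** Maps each base ID to the discovered object ID with the highest version. */
    static Map<String, String> latestPerDocument(List<String> objectIds) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String id : objectIds) {
            String base = baseId(id);
            String current = latest.get(base);
            if (current == null
                    || compareVersions(versionOf(id), versionOf(current)) > 0) {
                latest.put(base, id);
            }
        }
        return latest;
    }
}
```

Using the base ID as the document identifier and the suffix as the version 
would mean that each document is queued once, however many versions are 
discovered among the child nodes.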



--
This message was sent by Atlassian JIRA
(v6.2#6252)
