Hello.

I have found what the real problem is.

If you look at the implementation, when the CMIS Output Connector tries to 
create the document, it reads from the contentStream at 
https://github.com/douglascrp/manifoldcf/blob/release-2.10-changed/connectors/cmis/connector/src/main/java/org/apache/manifoldcf/agents/output/cmisoutput/CmisOutputConnector.java#L964

When the document already exists, the CMIS library throws 
CmisContentAlreadyExistsException, and then, in the catch block, the code tries 
to reuse the same contentStream to create the new version, as you can see 
at 
https://github.com/douglascrp/manifoldcf/blob/release-2.10-changed/connectors/cmis/connector/src/main/java/org/apache/manifoldcf/agents/output/cmisoutput/CmisOutputConnector.java#L982
This is why the new version ends up as a 0-byte file: the input stream has 
already been consumed at this point.

As XThreadInputStream does not support mark and reset, I could not find a way 
to "reset" the stream in the catch block.
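
To illustrate the behaviour described above, here is a minimal sketch in plain 
Java. It does not use ManifoldCF or the CMIS library: OneShotInputStream is a 
hypothetical stand-in for a stream without mark/reset support, and readAll is a 
helper written only for this example. The first read plays the role of the 
failed create, the second read plays the role of the new version written in the 
catch block.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamReuseDemo {

    /** Hypothetical stand-in for a stream that does not support mark/reset. */
    static class OneShotInputStream extends FilterInputStream {
        OneShotInputStream(InputStream in) {
            super(in);
        }

        @Override
        public boolean markSupported() {
            return false;
        }

        @Override
        public void mark(int readlimit) {
            // Intentionally a no-op: marking is not supported.
        }

        @Override
        public void reset() throws IOException {
            throw new IOException("mark/reset not supported");
        }
    }

    /** Reads the stream until end-of-stream and returns everything that was read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] content = "updated document body".getBytes(StandardCharsets.UTF_8);
        InputStream stream = new OneShotInputStream(new ByteArrayInputStream(content));

        // First attempt: stands in for the create call that fails after reading the content.
        byte[] firstAttempt = readAll(stream);
        // Second attempt: stands in for the new version created from the same stream.
        byte[] secondAttempt = readAll(stream);

        System.out.println("first attempt bytes:  " + firstAttempt.length);
        System.out.println("second attempt bytes: " + secondAttempt.length); // 0
        System.out.println("markSupported:        " + stream.markSupported()); // false
    }
}
```

The second read returns no bytes because the stream is already at end-of-stream, 
and since markSupported() is false there is no way to rewind it.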

The solution I found was to check whether the file already exists at the 
destination before trying to create it, but this is not good for performance: 
most of the time the document is a new one, and checking for it first makes 
the process much slower.
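
The check-first approach can be sketched as follows. This is not the code from 
the commit and it does not call the CMIS API: FakeRepository, exists, create, 
addVersion and the lookups counter are all hypothetical names, used only to 
show that the stream is read exactly once and that every document, new or 
not, pays for one extra lookup.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckBeforeCreateDemo {

    /** Hypothetical in-memory repository; it only mimics the shape of the problem. */
    static class FakeRepository {
        private final Map<String, List<byte[]>> versions = new HashMap<>();
        int lookups = 0; // counts existence checks, i.e. the extra round trips

        boolean exists(String name) {
            lookups++;
            return versions.containsKey(name);
        }

        void create(String name, InputStream content) throws IOException {
            List<byte[]> list = new ArrayList<>();
            list.add(readAll(content));
            versions.put(name, list);
        }

        void addVersion(String name, InputStream content) throws IOException {
            versions.get(name).add(readAll(content));
        }

        int versionCount(String name) {
            return versions.get(name).size();
        }

        int latestSize(String name) {
            List<byte[]> list = versions.get(name);
            return list.get(list.size() - 1).length;
        }
    }

    /** Reads the stream until end-of-stream and returns everything that was read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Checks for the document first, so the content stream is consumed only once. */
    static void store(FakeRepository repo, String name, InputStream content) throws IOException {
        if (repo.exists(name)) {
            repo.addVersion(name, content);
        } else {
            repo.create(name, content);
        }
    }

    public static void main(String[] args) throws IOException {
        FakeRepository repo = new FakeRepository();
        store(repo, "report.txt", new ByteArrayInputStream(new byte[] {1, 2}));
        store(repo, "report.txt", new ByteArrayInputStream(new byte[] {1, 2, 3}));

        System.out.println("versions:    " + repo.versionCount("report.txt")); // 2
        System.out.println("latest size: " + repo.latestSize("report.txt"));   // 3
        System.out.println("lookups:     " + repo.lookups);                    // 2
    }
}
```

In this sketch the new version keeps its content, but the lookup counter grows 
by one for every stored document, which is the performance cost mentioned above.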

The fix is available here 
https://github.com/douglascrp/manifoldcf/commit/03684f97688f21963b7a06e3c8dd71c120d50c91

I am not merging it yet because I want to wait for your opinion on this, as 
maybe there could be a better way to deal with this input stream issue.

Please let me know if you have any ideas about this, as the original approach 
was much faster, and I would prefer to avoid my fix for that reason.

Thank you in advance.

On 2018/10/04 17:09:00, "Douglas C. R. Paes (JIRA)" <[email protected]> wrote: 
> Douglas C. R. Paes created CONNECTORS-1541:
> ----------------------------------------------
> 
>              Summary: Documents updated in Google Drive are send with 0 byte 
> to CMIS Output Connector
>                  Key: CONNECTORS-1541
>                  URL: https://issues.apache.org/jira/browse/CONNECTORS-1541
>              Project: ManifoldCF
>           Issue Type: Bug
>           Components: Framework core
>     Affects Versions: ManifoldCF 2.10
>             Reporter: Douglas C. R. Paes
> 
> 
> When dealing with migration process, like when using the CMIS Output 
> Connector to ingest content into an ECM (Alfresco in my case), I noticed that 
> when a document is updated inside Google Drive, the engine is able to detect 
> the change and put it into the queue to be updated into the output.
> 
> By using the CMIS Output Connector, the document is versioned into Alfresco, 
> but this new version is always created as a 0 byte file.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
> 
