[ https://issues.apache.org/jira/browse/CONNECTORS-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16633333#comment-16633333 ]
James Thomas commented on CONNECTORS-1532:
------------------------------------------

{quote}I have attached a patch for you to try. Please let me know if it addresses the folder issue
{quote}
[~kwri...@metacarta.com] I don't see any significant change in behaviour using the same repro steps as comment-16623888. The file is still shown on the file system at full size after the second run of the job. Here are the log and file system details of this run:
{code:java}
INFO 2018-09-30T09:34:38.276Z (Startup thread) - Preparing non-continuous non-partial, either MODEL_ALL or fromBeginningOfTime, 1538299958323 for run: prepareFullScan
INFO 2018-09-30T09:35:58.444Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan

-rw-r--r--. 1 root root 27 Sep 30 11:34 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b{code}
I went on and repeated the repro in the same state just to see what might happen (essentially the same), then I reset seeding on the job and ran it again. Here are the file system and logs for that:
{code:java}
INFO 2018-09-30T09:48:51.303Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan
INFO 2018-09-30T09:50:06.483Z (Startup thread) - Preparing incremental scan for 1538299958323: prepareIncrementalScan
INFO 2018-09-30T09:51:06.800Z (Startup thread) - Preparing non-continuous non-partial, either MODEL_ALL or fromBeginningOfTime, 1538299958323 for run: prepareFullScan

$ ### after adding another file
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:49 drl?versionLabel=CURRENT&objectId=090000018000fc6c
$ ### after running the job again
-rw-r--r--. 1 root root 27 Sep 30 11:36 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 27 Sep 30 11:50 drl?versionLabel=CURRENT&objectId=090000018000fc6c
$ ## after resetting seeding and running the job
-rw-r--r--. 1 root root 0 Sep 30 11:51 drl?versionLabel=CURRENT&objectId=090000018000fc6b
-rw-r--r--. 1 root root 0 Sep 30 11:51 drl?versionLabel=CURRENT&objectId=090000018000fc6c
{code}
So it appears that reseeding can give the desired outcome.

FYI, I applied this patch on top of the logging patch from this ticket, which is itself on top of 2.10 patched for #1512, #1517:
{code:java}
$ wget https://issues.apache.org/jira/secure/attachment/12940883/CONNECTORS-1532.patch
$ dos2unix CONNECTORS-1532.patch
$ dos2unix connectors/documentum/connector/src/main/java/org/apache/manifoldcf/crawler/connectors/DCTM/DCTM.java
$ patch -p0 -i CONNECTORS-1532.patch
$ ant build
$ find dist -type f -exec ls -l {} \; > /tmp/diff{code}
Here's the set of files that changed after I applied your patch:
{code:java}
$ grep Sep\ 30 /tmp/diff | grep jar
-rw-rw-r-- 1 james staff 12544 Sep 30 10:01 dist/connector-lib/mcf-documentum-connector-rmistub.jar
-rw-rw-r-- 1 james staff 100864 Sep 30 10:01 dist/connector-lib/mcf-documentum-connector.jar
-rw-rw-r-- 1 james staff 6292 Sep 30 10:01 dist/connector-lib/mcf-filenet-connector-rmistub.jar
-rw-rw-r-- 1 james staff 3916082 Sep 30 10:02 dist/connector-lib/mcf-meridio-connector.jar
-rw-rw-r-- 1 james staff 838582 Sep 30 10:02 dist/connector-lib/mcf-sharepoint-connector.jar
-rw-rw-r-- 1 james staff 8567 Sep 30 10:01 dist/processes/documentum-server/lib/mcf-documentum-connector-rmiskel.jar
-rw-rw-r-- 1 james staff 4494 Sep 30 10:01 dist/processes/filenet-server/lib/mcf-filenet-connector-rmiskel.jar
{code}
I stopped my MCF instance and the DM processes, applied the changed files, and restarted the DM processes and the MCF server. Then I attempted the repro I described above.

It's possible that I haven't applied the patch correctly. Is there something I can do to check?

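The file-system check used in this repro (is each output file still at full size, or has it been zeroed?) can be scripted so that runs are easier to compare. The following is only a minimal sketch, not part of ManifoldCF: the function name `summarise_output` is hypothetical, and it assumes it is pointed at the leaf directory where the File System output writes the `drl?...` files.

```shell
#!/bin/sh
# Hypothetical helper, not part of ManifoldCF.
# Prints one line per regular file in DIR:
#   ZEROED  <name>  if the file is empty (what a Documentum delete
#                   produced on the file system in this report)
#   PRESENT <name>  if the file still has content
# Usage: summarise_output DIR
summarise_output() {
    dir=$1
    for f in "$dir"/*; do
        # Skip subdirectories and the unexpanded glob of an empty directory.
        [ -f "$f" ] || continue
        if [ -s "$f" ]; then
            state=PRESENT
        else
            state=ZEROED
        fi
        # ${f##*/} strips the directory part, leaving the file name.
        printf '%s %s\n' "$state" "${f##*/}"
    done
}
```

Saving the output after each job run (for example to one file per run) and comparing the results with `diff` would show which documents changed state between runs.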
> Moving a file outside of the job's Paths is not the same as deleting it
> -----------------------------------------------------------------------
>
> Key: CONNECTORS-1532
> URL: https://issues.apache.org/jira/browse/CONNECTORS-1532
> Project: ManifoldCF
> Issue Type: Bug
> Components: Documentum connector
> Affects Versions: ManifoldCF 2.10
> Environment: Manifold 2.10 patched for #1512, #1517
> Reporter: James Thomas
> Assignee: Karl Wright
> Priority: Major
> Fix For: ManifoldCF 2.12
>
> Attachments: 2018-09-19_1758.png, CONNECTORS-1532.patch, logging_patch.diff
>
>
> If I have an MF job which is connecting a specific folder, F, in Documentum to a File System output then:
> 1. deleting files in Documentum shows them as zero size in the file system
> 2. moving files out of F does not remove them or zero them in the file system
>
> Note that moving a file from another folder (which the job is not looking at) to F has the same effect as adding it to F by e.g. importing it in DM or POSTing it to DM via the REST interface.
>
> Intuitively, I expect that moving a file out of the "view" of the Documentum connector would have the same effect on the File System as deleting it. (My model here is of MF synchronising content between the Paths (DM) and the Output Path (File System) that I have specified in the job.)
>
> Starting point, I have run the MF job to fetch a bunch of files from a folder - call it F - in DM (i.e. I have configured Paths in the job to be F). This is what 'ls -l' on the file system looks like:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 85772 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf{code}
> In DM, I delete one of the files in F and it shows as zero size, and the modification date has changed:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7{code}
> In DM, I move a file from F to another folder. (Right click, add to clipboard, go to new folder, Edit > Move here).
> The file shows as modified (07:25), but is still apparently in F (i.e. in the Path my MF job is looking at):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4{code}
> In DM, I move a file from another folder to F and it shows up with the timestamp of the move (07:28):
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:28 drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> But if I immediately move it out in DM then, again, the timestamp (07:30) alters but the file apparently remains:
> {code:java}
> -rw-r--r--. 1 root i2e 12541 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 26 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 8790 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 101888 Sep 19 07:21 drl?versionLabel=CURRENT&objectId=090000018000f7c3
> -rw-r--r--. 1 root i2e 23040 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 26112 Sep 19 07:22 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30 drl?versionLabel=CURRENT&objectId=09000001800045b9{code}
> In DM, I now delete all visible content in F. The files that were moved out of F, and are not visible in F in DM, remain on the file system:
> {code:java}
> -rw-r--r--. 1 root i2e 0 Sep 19 07:23 drl?versionLabel=CURRENT&objectId=090000018000f7c7
> -rw-r--r--. 1 root i2e 32783 Sep 19 07:25 drl?versionLabel=CURRENT&objectId=090000018000f7c4
> -rw-r--r--. 1 root i2e 191513 Sep 19 07:30 drl?versionLabel=CURRENT&objectId=09000001800045b9
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c2
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7be
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c0
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c1
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7bf
> -rw-r--r--. 1 root i2e 0 Sep 19 07:31 drl?versionLabel=CURRENT&objectId=090000018000f7c3{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)