Hi folks,

We have been working on an “unofficial” Alfresco connector that is currently 
more or less working with Manifold 1.7. You can check the code here: 
https://github.com/rafaharo/alfresco-webscript-manifold-connector. The 
README.md file is out of date, so please ignore it. Basically, this connector 
uses a client that consumes a set of Alfresco webscripts to crawl content and 
metadata. Document seeding is based on Alfresco transactions: the connector 
keeps asking Alfresco for a fixed number of transactions at a time until no 
new transactions are found. The transaction info indicates, among other 
things, whether a document has been deleted, so that later, while the 
documents are processed, those documents are marked for deletion.
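
To make the seeding loop concrete, here is a minimal Java sketch of the idea 
(not the actual connector code). All names in it (TransactionClient, 
fetchChanges, Batch, NodeChange, SeedResult) are made up for illustration and 
do not come from the connector or from the Manifold API:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TransactionSeedingSketch {

    /** Hypothetical: one node touched by a transaction, with its deleted flag. */
    public static final class NodeChange {
        public final String uuid;
        public final boolean deleted;
        public NodeChange(String uuid, boolean deleted) {
            this.uuid = uuid;
            this.deleted = deleted;
        }
    }

    /** Hypothetical: the changes of one batch and the last transaction id it covers. */
    public static final class Batch {
        public final List<NodeChange> changes;
        public final long lastTxnId;
        public Batch(List<NodeChange> changes, long lastTxnId) {
            this.changes = changes;
            this.lastTxnId = lastTxnId;
        }
    }

    /** Hypothetical client; the real webscript client may look quite different. */
    public interface TransactionClient {
        Batch fetchChanges(long afterTxnId, int maxTxns);
    }

    /** Seeded identifiers (uuid -> deleted?) plus the cursor to store for the next run. */
    public static final class SeedResult {
        public final Map<String, Boolean> deletedByUuid;
        public final long lastTxnId;
        public SeedResult(Map<String, Boolean> deletedByUuid, long lastTxnId) {
            this.deletedByUuid = deletedByUuid;
            this.lastTxnId = lastTxnId;
        }
    }

    /** Keeps requesting batches until a batch brings no transaction newer than the cursor. */
    public static SeedResult seed(TransactionClient client, long lastSeenTxnId, int batchSize) {
        Map<String, Boolean> deletedByUuid = new LinkedHashMap<>();
        long cursor = lastSeenTxnId;
        while (true) {
            Batch batch = client.fetchChanges(cursor, batchSize);
            if (batch.lastTxnId <= cursor) {
                break; // no new transactions: stop seeding
            }
            for (NodeChange change : batch.changes) {
                // A later transaction overrides an earlier one for the same node.
                deletedByUuid.put(change.uuid, change.deleted);
            }
            cursor = batch.lastTxnId;
        }
        return new SeedResult(deletedByUuid, cursor);
    }
}
```

In this sketch, calling seed again with the lastTxnId returned by the previous 
call yields an empty map when nothing has changed in Alfresco, which 
corresponds to a run in which no identifiers are seeded at all.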

In the first run, all available document identifiers are seeded. In 
subsequent runs, we intended to seed only those documents affected by new 
transactions (new documents, changes at any level, or deletions). And this is 
what is happening right now: for example, if there are no new transactions, 
no documents are seeded and the whole index is purged (all previously indexed 
documents are deleted).

My question is: is this normal behavior? How can we avoid it? Is there a 
configuration option for the jobs? We have read about minimal and complete 
runs, but the difference is still not clear to us.

Thanks a lot!
Cheers,
Rafa

