Maciej Lizewski created CONNECTORS-685:
------------------------------------------

             Summary: we need possibility to mark all documents in job to 
reingestion
                 Key: CONNECTORS-685
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-685
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework core
            Reporter: Maciej Lizewski


Consider this case: you have a connector with model MODEL_ADD_CHANGE_DELETE (we 
get a notification about every action on ingested documents). A job is 
configured to fetch objects from two categories (category A and category B). 
While it runs, documents from both categories are ingested.

Now the administrator changes the configuration of this job, removing 
'category B', which means only documents from category A should be ingested. 
But since the model is MODEL_ADD_CHANGE_DELETE, existing documents from 
category B will never be reingested to check whether they are still valid. 
Seeding will not return documents from the removed category, because there is 
no longer a configuration entry for that category, and there is no way to 
check what the configuration was BEFORE the change...

Possible solutions:
1. For connectors with MODEL_ADD_CHANGE_DELETE, after every configuration 
change all existing documents should be marked for reingestion (it is then up 
to the getDocumentVersions function to check whether each document should 
still be ingested and to mark those that should not be with a NULL version).
2. ISeedingActivity should be extended with a 'reingestAllIngestedDocuments' 
operation; the seeding function can then detect a configuration change 
(startTime will be 0) and decide whether to reingest all documents or not.
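
To make the scenario concrete, here is a minimal, self-contained sketch of 
the problem and of solution 1. It does not use the ManifoldCF API: the classes 
Doc and Job, the pendingRecheck set, and the markAllForReingest flag are all 
hypothetical names invented for this illustration, and the in-memory "crawl" 
is only a rough stand-in for seeding plus version checking.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class ReingestSketch {
    /** Hypothetical stand-in for a repository document; not a ManifoldCF type. */
    static final class Doc {
        final String id;
        final String category;
        Doc(String id, String category) { this.id = id; this.category = category; }
    }

    /** Hypothetical in-memory job that remembers which documents it has ingested. */
    static final class Job {
        private Set<String> categories;
        private final Map<String, Doc> ingested = new TreeMap<>();
        private final Set<String> pendingRecheck = new TreeSet<>();

        Job(Set<String> categories) { this.categories = new HashSet<>(categories); }

        /** Changes the configuration; optionally queues every ingested document for a recheck. */
        void setCategories(Set<String> newCategories, boolean markAllForReingest) {
            categories = new HashSet<>(newCategories);
            if (markAllForReingest) {
                pendingRecheck.addAll(ingested.keySet());
            }
        }

        /** One crawl: seed from the configured categories, then recheck queued documents. */
        void crawl(List<Doc> repository) {
            // Seeding only ever sees documents in the categories configured right now.
            for (Doc d : repository) {
                if (categories.contains(d.category)) {
                    ingested.put(d.id, d);
                }
            }
            // Queued documents are re-evaluated; those outside the configuration are dropped.
            for (String id : pendingRecheck) {
                Doc d = ingested.get(id);
                if (d != null && !categories.contains(d.category)) {
                    ingested.remove(id);
                }
            }
            pendingRecheck.clear();
        }

        Set<String> ingestedIds() { return new TreeSet<>(ingested.keySet()); }
    }

    /** Runs the category A/B scenario and returns the ids still ingested at the end. */
    public static Set<String> run(boolean markAllForReingest) {
        List<Doc> repository = List.of(new Doc("a1", "A"), new Doc("b1", "B"));
        Job job = new Job(Set.of("A", "B"));
        job.crawl(repository);                               // both documents ingested
        job.setCategories(Set.of("A"), markAllForReingest);  // administrator removes category B
        job.crawl(repository);
        return job.ingestedIds();
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // [a1, b1]  (stale category B document remains)
        System.out.println(run(true));  // [a1]      (recheck removes it)
    }
}
```

Without the recheck, nothing ever visits b1 again after the configuration 
change, which is the behaviour reported here. Solution 2 would move the same 
decision into the seeding step instead of the configuration-change step.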

