Hi Julien,

First, ALL models rely on individual existence checks for documents.  That
is, when your connector fetches a deleted document, the framework has to be
told that the document is gone, or it will not be removed.  There is no
"discovery" process for deleted documents other than seeding (and only when
the model includes _DELETE).

The upshot of this is that IF your seeding method does not return all
documents that have been removed THEN it cannot be a _DELETE model.
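
To make that constraint concrete for a hierarchical source, here is a
minimal sketch, not a ManifoldCF API. It assumes the connector keeps its
own record of which children belong to which parent; the class
DeletionExpander and the childrenByParent map are hypothetical names used
only for illustration. Given the ids the remote API reports as deleted, it
computes the full set of ids that would have to be reported during seeding:

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of ManifoldCF.
final class DeletionExpander {
  private DeletionExpander() {}

  /**
   * Returns the ids reported as deleted plus all of their known descendants.
   * childrenByParent is an assumed, connector-maintained map from a parent
   * id to the ids of its direct children.
   */
  static Set<String> expandDeletions(Map<String, List<String>> childrenByParent,
                                     Collection<String> deletedIds) {
    Set<String> result = new LinkedHashSet<>();
    Deque<String> pending = new ArrayDeque<>(deletedIds);
    while (!pending.isEmpty()) {
      String id = pending.removeFirst();
      if (!result.add(id)) {
        continue; // already visited; also guards against cycles in the map
      }
      List<String> children =
          childrenByParent.getOrDefault(id, Collections.<String>emptyList());
      for (String child : children) {
        pending.addLast(child);
      }
    }
    return result;
  }
}
```

Every id in the returned set would then need to be handed to the framework
during seeding, so that each one gets its own existence check. Whether
keeping such a map is practical depends on the source API.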

I hope this helps.

Karl


On Sat, Feb 29, 2020 at 8:10 AM <julien.massi...@francelabs.com> wrote:

> Hi dev community,
>
>
>
> I am trying to develop a connector for an API that exposes a hierarchical
> tree of documents: each document can have child documents.
>
> During the initial crawl, the child documents are referenced in the MCF
> connector through the method
> activities.addDocumentReference(childDocumentIdentifier,
> parentDocumentIdentifier, parentDataNames, parentDataValues)
>
> The API can provide delta modifications/deletions since a given date, but
> when a document that has children is deleted, the API only returns the id
> of that document, not the ids of its children. On the MCF connector side,
> I thought that, since I had referenced the children, deleting the parent
> document would delete all its children with it, but it appears that this
> is not the case.
>
> So my question is: did I miss something? Is there another way to perform
> delta deletions? Unfortunately, if I don't find a way to solve this issue,
> I will not be able to take advantage of the delta feature, and I will have
> to use the "add_modify" connector type and test every id on a delta crawl
> to figure out which ids are missing. This would be a huge loss of
> performance.
>
>
>
> Regards,
>
> Julien Massiera
>
>
