Re: Technical question on repo connector dev
Yes, that is what I suggest.

Karl

On Sat, Oct 5, 2019 at 8:42 AM wrote:
> Hi Karl,
>
> Thanks for the answer.
>
> Is your suggestion something like:
>
> processDocuments(...) {
>
>   if(documentIdentifier.isURI) {
>     jsonDocs = getJsonDocsFromURI(documentIdentifier)
>     jsonDocs.foreach(jsonDoc -> {
>       String jsonDocID = "jsonDoc+" + jsonDoc.toJsonString();
>       activities.addDocumentReference(jsonDocID);
>     })
>   } else if(documentIdentifier.isJsonDoc) {
>     jsonDoc = getJsonDoc(documentIdentifier)
>     jsonDocVersion = jsonDoc.getVersion()
>     jsonDocUri = jsonDoc.getUri();
>
>     if(activities.checkDocumentNeedsReindexing(documentIdentifier, jsonDocVersion)) {
>       activities.ingestDocumentWithException(documentIdentifier, jsonDoc, jsonDocUri)
>     }
>   }
> }
>
> ?
>
> Julien
RE: Technical question on repo connector dev
Hi Karl,

Thanks for the answer.

Is your suggestion something like:

processDocuments(...) {

  if(documentIdentifier.isURI) {
    jsonDocs = getJsonDocsFromURI(documentIdentifier)
    jsonDocs.foreach(jsonDoc -> {
      String jsonDocID = "jsonDoc+" + jsonDoc.toJsonString();
      activities.addDocumentReference(jsonDocID);
    })
  } else if(documentIdentifier.isJsonDoc) {
    jsonDoc = getJsonDoc(documentIdentifier)
    jsonDocVersion = jsonDoc.getVersion()
    jsonDocUri = jsonDoc.getUri();

    if(activities.checkDocumentNeedsReindexing(documentIdentifier, jsonDocVersion)) {
      activities.ingestDocumentWithException(documentIdentifier, jsonDoc, jsonDocUri)
    }
  }
}

?

Julien

-----Original Message-----
From: Karl Wright
Sent: Friday, October 4, 2019 21:07
To: dev
Subject: Re: Technical question on repo connector dev

Hi Julien,

The checkDocumentNeedsReindexing() method is meant to be used inside processDocuments() for the specific document you are checking. So you can convert your URI to a set of JSON documents, if the document identifier is a URI, but you will probably want to put the actual data for the document in carrydown information. You will also need to create some kind of non-URI document ID.

Karl
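The two-branch flow in the pseudocode above, combined with the carrydown idea from Karl's reply, can be tried out without a crawler. The following is only a minimal sketch under stated assumptions: `Activities`, `FakeActivities`, `fetchJsonObjects`, and `crawl` are hypothetical stand-ins written for this illustration, with simplified signatures, and are not the real ManifoldCF interfaces. Only the method names `addDocumentReference` and `checkDocumentNeedsReindexing` are taken from the thread; their actual parameters should be checked against the ManifoldCF Javadoc.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoPhaseSketch {
    static final String CHILD_PREFIX = "jsonDoc+";

    /** Hypothetical stand-in for the crawler's activities object (simplified signatures). */
    interface Activities {
        void addDocumentReference(String childId, Map<String, String> carrydown);
        Map<String, String> retrieveParentData(String childId);
        boolean checkDocumentNeedsReindexing(String documentId, String version);
        void ingest(String documentId, String version, String uri, String body);
    }

    /** In-memory fake so the flow can run without a crawler. */
    static final class FakeActivities implements Activities {
        final Deque<String> queue = new ArrayDeque<>();
        final Map<String, Map<String, String>> carrydownByChild = new HashMap<>();
        final Map<String, String> indexedVersions = new HashMap<>();
        final List<String> ingested = new ArrayList<>();

        @Override public void addDocumentReference(String childId, Map<String, String> carrydown) {
            carrydownByChild.put(childId, carrydown);
            queue.add(childId); // the child is handed to processDocument later
        }
        @Override public Map<String, String> retrieveParentData(String childId) {
            return carrydownByChild.get(childId);
        }
        @Override public boolean checkDocumentNeedsReindexing(String documentId, String version) {
            return !version.equals(indexedVersions.get(documentId));
        }
        @Override public void ingest(String documentId, String version, String uri, String body) {
            indexedVersions.put(documentId, version);
            ingested.add(documentId);
        }
    }

    /** Hypothetical API call: each row is {id, version, body} for one JSON object. */
    static List<String[]> fetchJsonObjects(String seedUri) {
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"a1", "v1", "first body"});
        rows.add(new String[] {"a2", "v1", "second body"});
        return rows;
    }

    static void processDocument(String documentId, Activities activities) {
        if (!documentId.startsWith(CHILD_PREFIX)) {
            // Seed URI: only register the children and pass their data down.
            for (String[] row : fetchJsonObjects(documentId)) {
                Map<String, String> carrydown = new HashMap<>();
                carrydown.put("version", row[1]);
                carrydown.put("uri", documentId + "#" + row[0]);
                carrydown.put("body", row[2]);
                activities.addDocumentReference(CHILD_PREFIX + row[0], carrydown);
            }
        } else {
            // Child document: no fetch needed, everything comes from carrydown data.
            Map<String, String> data = activities.retrieveParentData(documentId);
            String version = data.get("version");
            if (activities.checkDocumentNeedsReindexing(documentId, version)) {
                activities.ingest(documentId, version, data.get("uri"), data.get("body"));
            }
        }
    }

    /** Drains the fake queue, mimicking repeated processDocuments calls. */
    static void crawl(String seedUri, FakeActivities activities) {
        activities.queue.add(seedUri);
        while (!activities.queue.isEmpty()) {
            processDocument(activities.queue.poll(), activities);
        }
    }

    public static void main(String[] args) {
        FakeActivities activities = new FakeActivities();
        crawl("https://example.invalid/feed", activities);
        crawl("https://example.invalid/feed", activities); // same versions: nothing re-ingested
        System.out.println(activities.ingested);
    }
}
```

In this sketch the second call for each child is cheap: it reads the carried-down data, runs the version check, and ingests only when the version changed. The identifier is built from the JSON object's id field rather than from the whole JSON string, which is a design choice of the sketch and not something stated in the thread.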
Re: Technical question on repo connector dev
Hi Julien,

The checkDocumentNeedsReindexing() method is meant to be used inside processDocuments() for the specific document you are checking. So you can convert your URI to a set of JSON documents, if the document identifier is a URI, but you will probably want to put the actual data for the document in carrydown information. You will also need to create some kind of non-URI document ID.

Karl

On Fri, Oct 4, 2019 at 1:36 PM wrote:
> Hi,
>
> I am facing a simple technical case that I am not sure how to deal with,
> concerning the development of a repository connector.
>
> I want to develop a repo connector using the ADD_CHANGE_DELETE model that
> will normally add seed documents, and each seed document will produce
> several documents.
> The problem is that each produced document from a seed doc is instantly
> ingest-able and does not need to be processed.
>
> The use case here is that the addSeedDocuments method will call an API that
> will provide several URIs (seeds).
>
> In the processDocuments method, each URI provides a JSON array containing
> JSON objects and those JSON objects are meant to become repository
> documents and ingested.
> So the logic would be to use the activities.addDocumentReference for each
> JSON object before I can use the activities.checkDocumentNeedsReindexing
> (each JSON object has an id and a version field) and then ingest the
> document. But by doing this, I am afraid that the processDocuments method
> will be called with those newly referenced docs while they do not need to
> be processed.
>
> Any suggestion about how to deal with this use case is welcome.
>
> Thanks,
> Julien
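One way to read the "non-URI document ID" advice is a small helper that builds and recognizes identifiers for the JSON objects, so that processDocuments() can tell them apart from seed URIs. This is a sketch only: the class `JsonDocIds`, the `jsonDoc+` prefix, and the choice to derive the identifier from the object's id field are assumptions made for illustration and are not part of ManifoldCF.

```java
final class JsonDocIds {
    // Hypothetical prefix; any marker that cannot start a seed URI would do.
    static final String PREFIX = "jsonDoc+";

    /** Builds a document identifier from the JSON object's id field. */
    static String toDocumentId(String jsonObjectId) {
        return PREFIX + jsonObjectId;
    }

    /** True when the identifier names a JSON object rather than a seed URI. */
    static boolean isJsonDoc(String documentIdentifier) {
        return documentIdentifier.startsWith(PREFIX);
    }

    /** Recovers the JSON object's id from a document identifier. */
    static String toJsonObjectId(String documentIdentifier) {
        if (!isJsonDoc(documentIdentifier)) {
            throw new IllegalArgumentException("not a JSON document id: " + documentIdentifier);
        }
        return documentIdentifier.substring(PREFIX.length());
    }

    public static void main(String[] args) {
        String id = toDocumentId("42");
        System.out.println(id + " -> " + toJsonObjectId(id));
        System.out.println(isJsonDoc("https://example.invalid/feed"));
    }
}
```

Keeping the identifier short and stable leaves the version field free to change between crawls, while the document data itself travels separately as carrydown information.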
Technical question on repo connector dev
Hi,

I am facing a simple technical case that I am not sure how to deal with, concerning the development of a repository connector.

I want to develop a repo connector using the ADD_CHANGE_DELETE model that will normally add seed documents, and each seed document will produce several documents.
The problem is that each produced document from a seed doc is instantly ingest-able and does not need to be processed.

The use case here is that the addSeedDocuments method will call an API that will provide several URIs (seeds).

In the processDocuments method, each URI provides a JSON array containing JSON objects and those JSON objects are meant to become repository documents and ingested.
So the logic would be to use the activities.addDocumentReference for each JSON object before I can use the activities.checkDocumentNeedsReindexing (each JSON object has an id and a version field) and then ingest the document. But by doing this, I am afraid that the processDocuments method will be called with those newly referenced docs while they do not need to be processed.

Any suggestion about how to deal with this use case is welcome.

Thanks,
Julien