Re: Job Multiple Outputs

2019-09-10 Thread julien.massiera
Thanks for your answer, Karl. I was unsure about that with regard to the output connections, but it is still the same pipeline after all. Original message From: Karl Wright Date: 10/09/2019 20:08 (GMT+01:00) To: user@manifoldcf.apache.org Subject: Re: Job Multiple Outputs Hi

Re: Job Multiple Outputs

2019-09-10 Thread Karl Wright
Hi Julien, You must understand that a job with a complex pipeline is really not running N independent jobs; it's running ONE job. Every document is processed through the pipeline only once. The pipeline may have faster components and slower components; doesn't matter; the document takes the sum

Re: Job Multiple Outputs

2019-09-10 Thread Julien Massiera
Ok, so to be sure I understood what you are saying: suppose a job has two output connections, and one of the outputs indexes documents twice as fast as the other. At a given time t, both outputs will have indexed the same number of documents, no matter if one output is

Re: Job Multiple Outputs

2019-09-10 Thread Karl Wright
The output connection contract is that a request to index is made to the connector, and the connector returns when it is done. When there are multiple output connections, these are each handed a copy of the document, one after the other, and told to index it. This is all done by one worker
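
To make the contract described above concrete, here is a minimal sketch in Python. It is not ManifoldCF code: `FakeOutput`, `run_worker`, and the integer "cost" units are hypothetical names invented for illustration. It only models the behavior stated in this thread: each output is handed the document in turn, and each `index` call returns only when that output is done.

```python
class FakeOutput:
    """Hypothetical stand-in for an output connector (not a ManifoldCF class)."""

    def __init__(self, name, cost_per_doc):
        self.name = name
        self.cost_per_doc = cost_per_doc  # simulated time units per document
        self.indexed = []

    def index(self, doc):
        # Returns only once the document is "indexed", mirroring the contract.
        self.indexed.append(doc)
        return self.cost_per_doc


def run_worker(docs, outputs):
    """Hand each document to every output, one after the other."""
    elapsed = 0
    for doc in docs:
        for output in outputs:
            # The next output is not called until this one has returned.
            elapsed += output.index(doc)
    return elapsed


fast = FakeOutput("fast", 1)
slow = FakeOutput("slow", 2)
elapsed = run_worker(["doc1", "doc2", "doc3"], [fast, slow])
print(len(fast.indexed), len(slow.indexed), elapsed)  # prints: 3 3 9
```

Under these assumptions, the faster output never gets ahead of the slower one: both have indexed the same documents after each step, and the simulated time per document is the sum of the two outputs' costs.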

Job Multiple Outputs

2019-09-10 Thread Julien Massiera
Hi, I would like an explanation of how a job behaves when several outputs are configured. My main question is: for each output, how is document ingestion managed? More precisely, are the ingestion processes synchronized or not? (In other words, is the ingestion of the next