Re: Job Multiple Outputs

Julien Massiera Tue, 10 Sep 2019 09:48:23 -0700

Ok, so to be sure I understood what you are saying:

Suppose a job has two output connections, and one of the outputs indexes documents twice as fast as the other. At a given time t, both outputs will have indexed the same number of documents, no matter whether one output is faster than the other.
In other words: the fastest output will not have finished indexing all the crawled documents while the second one still has half of them left to index.

Am I wrong?
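
To make the scenario concrete, here is a toy calculation. It is only a sketch under stated assumptions: a single worker thread hands each document first to the fast output and then to the slow one, and each output has a fixed per-document cost. The function name `indexed_counts_at` and the cost values are hypothetical and are not part of ManifoldCF.

```python
def indexed_counts_at(t, cost_fast, cost_slow, total_docs):
    """Return (fast_count, slow_count) of documents fully indexed by time t.

    Assumption: one worker thread processes documents one at a time and,
    for each document, calls the fast output and then the slow output.
    """
    clock = 0
    fast_done = 0
    slow_done = 0
    for _ in range(total_docs):
        clock += cost_fast          # fast output indexes the document
        if clock > t:
            break
        fast_done += 1
        clock += cost_slow          # slow output indexes the same document
        if clock > t:
            break
        slow_done += 1
    return fast_done, slow_done


# Fast output costs 1 time unit per document, slow output costs 2.
print(indexed_counts_at(30, 1, 2, 100))
print(indexed_counts_at(31, 1, 2, 100))
```

Under these assumptions the two counts never differ by more than one document, because the fast output cannot start on the next document until the slow one has returned.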

On 10/09/2019 18:09, Karl Wright wrote:

The output connection contract is that a request to index is made to the connector, and the connector returns when it is done.

When there are multiple output connections, these are each handed a copy of the document, one after the other, and told to index it. This is all done by one worker thread. Multiple worker threads are not used for multiple outputs of the same document.

The framework is smart enough to not hand a document to a connector if it hasn't changed (according to how the connector computes the connector-specific output version string).

Karl
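
The behavior described above can be sketched roughly as follows. This is an illustration only, not ManifoldCF code: `SketchOutput`, `process_document`, and the SHA-1 based version string are hypothetical stand-ins, and the real framework computes and stores version strings in its own way.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SketchOutput:
    """Hypothetical stand-in for an output connection."""
    name: str
    # doc id -> version string recorded at the last successful index call
    indexed_versions: dict = field(default_factory=dict)

    def version_string(self, content: str) -> str:
        # Assumption: the output version depends only on the content.
        return hashlib.sha1(content.encode("utf-8")).hexdigest()

    def index(self, doc_id: str, content: str) -> None:
        # The call returns only once indexing is "done".
        self.indexed_versions[doc_id] = self.version_string(content)


def process_document(doc_id, content, outputs, log):
    """What one worker thread does for one document: each output in turn."""
    for out in outputs:
        if out.indexed_versions.get(doc_id) == out.version_string(content):
            log.append((doc_id, out.name, "skipped"))   # unchanged for this output
            continue
        out.index(doc_id, content)                      # blocks until finished
        log.append((doc_id, out.name, "indexed"))


fast = SketchOutput("fast")
slow = SketchOutput("slow")
log = []
process_document("doc1", "hello", [fast, slow], log)         # first crawl
process_document("doc1", "hello", [fast, slow], log)         # unchanged
process_document("doc1", "hello, world", [fast, slow], log)  # changed
for entry in log:
    print(entry)
```

Because the loop over outputs runs inside a single call, the second output is not asked to index a document until the first has returned.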
On Tue, Sep 10, 2019 at 11:00 AM Julien Massiera <julien.massi...@francelabs.com> wrote:
    Hi,

    I would like an explanation of how a job behaves when several
    outputs are configured. My main question is: for each output, how is
    document ingestion managed? More precisely, are the ingestion
    processes synchronized or not? (In other words, does the ingestion of
    the next document wait until the current ingestion has completed for
    both outputs?) Also, if one output is configured to send a commit at
    the end of the job, is this commit held until the last ingestion has
    occurred in the other output?

    Thanks for your help,
    Julien

--
Julien MASSIERA
Director of Product Development
France Labs – The Search Experts
Datafari – Winner of the Big Data 2018 trophy at the Digital Innovation Makers Summit
www.francelabs.com
