Yes, no problem at all. I meant that the DoFn is "attached" to a pipeline.

Regards
JB

On 10/16/2017 08:25 AM, Derek Hao Hu wrote:
I believe a worker can execute multiple instances (i.e. threads) of a DoFn.
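
To make the "multiple instances" point concrete, here is a minimal plain-Java sketch. It does not use the Beam API: `InstancesSketch`, `CountingFn` and `runWorker` are hypothetical names, and the thread pool only stands in for whatever a real runner does. The assumption it illustrates is that each instance is used by one thread at a time, so per-instance fields need no synchronization, while several instances may run in parallel on the same worker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java model, NOT the Beam API: all names here are hypothetical.
class InstancesSketch {

    /** Stand-in for a DoFn whose per-instance state is touched by one thread only. */
    static class CountingFn {
        private int processed = 0; // unsynchronized: one thread per instance

        void processElement(String element) { processed++; }
        int processed() { return processed; }
    }

    /** One "worker" runs several instances, each inside its own task. */
    static List<Integer> runWorker(List<List<String>> inputsPerInstance) {
        int threads = Math.max(1, inputsPerInstance.size());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (List<String> input : inputsPerInstance) {
                futures.add(pool.submit(() -> {
                    CountingFn fn = new CountingFn(); // a separate instance per task
                    for (String element : input) {
                        fn.processElement(element);
                    }
                    return fn.processed();
                }));
            }
            // Collect the per-instance counts in submission order.
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> future : futures) {
                counts.add(future.get());
            }
            return counts;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker task failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWorker(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e", "f"))));
    }
}
```

Anything shared between instances (static fields, external clients) would still need its own thread safety; only the per-instance counter is safe here without locking.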

Derek

On Sun, Oct 15, 2017 at 10:46 PM, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:

    Hi,

    Correct. @Setup is called when bootstrapping the DoFn, @StartBundle is called
    at the start of each set of data (bundle), @ProcessElement is called for each
    element in the bundle, @FinishBundle is called at the end of the bundle, and
    @Teardown is called when the DoFn is "removed".
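
The call order described above can be sketched in plain Java. This is a model of the sequence only, not the Beam API: `LifecycleSketch`, `RecordingFn` and `runInstance` are hypothetical names, and the methods merely mirror the annotation names from this thread. It assumes one instance handled by one caller, with zero or more bundles between setup and teardown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the call order, NOT the Beam API: all names are hypothetical.
class LifecycleSketch {

    /** Stand-in for a DoFn: each method records that it was called. */
    static class RecordingFn {
        final List<String> events = new ArrayList<>();

        RecordingFn() { events.add("constructor"); }
        void setup() { events.add("setup"); }
        void startBundle() { events.add("startBundle"); }
        void processElement(String element) { events.add("processElement:" + element); }
        void finishBundle() { events.add("finishBundle"); }
        void teardown() { events.add("teardown"); }
    }

    /** Stand-in for a runner driving one instance through zero or more bundles. */
    static List<String> runInstance(List<List<String>> bundles) {
        RecordingFn fn = new RecordingFn();
        fn.setup();                          // once per instance
        for (List<String> bundle : bundles) {
            fn.startBundle();                // once per bundle
            for (String element : bundle) {
                fn.processElement(element);  // once per element
            }
            fn.finishBundle();               // once per bundle
        }
        fn.teardown();                       // once per instance
        return fn.events;
    }

    public static void main(String[] args) {
        List<List<String>> bundles = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"));
        for (String event : runInstance(bundles)) {
            System.out.println(event);
        }
    }
}
```

With the two bundles in `main`, the recorded sequence is constructor, setup, then a startBundle / processElement / finishBundle group per bundle, then teardown. An empty bundle list yields only constructor, setup and teardown.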

    A DoFn is per pipeline.

    Regards
    JB


    On 10/16/2017 07:31 AM, Jacob Marble wrote:

        (there might be documentation on this that I didn't find; if so a link
        is sufficient)

        Good evening, this is just a check on my understanding. It looks like an
        instance of a given DoFn goes through this lifecycle. Am I correct?

        - constructor
        - @Setup (once)
            - @StartBundle (zero to many times)
              - @ProcessElement (zero to many times)
            - @FinishBundle
        - @Teardown (once)

        Can any of these steps be called concurrently? (I believe no)
        Can one worker execute multiple instances of a DoFn? (I believe yes)

        Thank you,

        Jacob


-- Jean-Baptiste Onofré
    jbono...@apache.org
    http://blog.nanthrax.net
    Talend - http://www.talend.com




--
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.

