Re: DoFn setup/teardown sequence

2017-10-16 Thread Jean-Baptiste Onofré

Yes, no problem at all. I meant that the DoFn is "attached" to a pipeline.

Regards
JB

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: DoFn setup/teardown sequence

2017-10-16 Thread Derek Hao Hu
I believe a worker can execute multiple instances (i.e. threads) of a DoFn.
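
As a rough illustration of "multiple instances on one worker", here is a minimal plain-Java sketch. It does not use the Beam SDK: MultiInstanceSketch, CountingFn and runWorker are hypothetical names, and the thread pool only stands in for whatever a real runner does. It assumes each thread gets its own instance, so per-instance state is touched by a single thread and needs no synchronization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class MultiInstanceSketch {
    /** Hypothetical stand-in for a DoFn with unsynchronized per-instance state. */
    static class CountingFn {
        private int processed = 0; // only ever touched by the thread that owns this instance

        void processElement(String element) {
            processed++;
        }

        int processed() {
            return processed;
        }
    }

    /** One "worker": each bundle is handled on its own thread by its own instance. */
    static List<Integer> runWorker(List<List<String>> bundles) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, bundles.size()));
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> bundle : bundles) {
                pending.add(pool.submit(() -> {
                    CountingFn fn = new CountingFn(); // a separate instance per thread
                    for (String element : bundle) {
                        fn.processElement(element);
                    }
                    return fn.processed();
                }));
            }
            // Collect the per-instance counts in submission order.
            List<Integer> counts = new ArrayList<>();
            for (Future<Integer> f : pending) {
                try {
                    counts.add(f.get());
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
            return counts;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWorker(Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d"))));
    }
}
```

Each count comes from a different instance, so instances share nothing unless the code makes it static or otherwise global.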

Derek

-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.


Re: DoFn setup/teardown sequence

2017-10-15 Thread Jean-Baptiste Onofré

Hi,

Correct. @Setup is called when bootstrapping the DoFn, @StartBundle is called
at the start of each set of data (bundle), @ProcessElement is called for each
element in the bundle, @FinishBundle is called at the end of the bundle, and
@Teardown is called when the DoFn is "removed".
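
The call order described above can be sketched in plain Java. This is only an illustration and does not use the Beam SDK: LifecycleSketch, RecordingFn and run are hypothetical names, and the methods stand in for the ones a real DoFn would mark with @Setup, @StartBundle, @ProcessElement, @FinishBundle and @Teardown. The driver plays the role of a runner for a single instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LifecycleSketch {
    /** Hypothetical stand-in for a DoFn; records the order in which it is called. */
    static class RecordingFn {
        final List<String> calls = new ArrayList<>();

        void setup() { calls.add("setup"); }                       // role of @Setup
        void startBundle() { calls.add("startBundle"); }           // role of @StartBundle
        void processElement(String element) {                      // role of @ProcessElement
            calls.add("processElement:" + element);
        }
        void finishBundle() { calls.add("finishBundle"); }         // role of @FinishBundle
        void teardown() { calls.add("teardown"); }                 // role of @Teardown
    }

    /** Hypothetical driver: one instance, set up once, fed zero or more bundles, torn down once. */
    static List<String> run(List<List<String>> bundles) {
        RecordingFn fn = new RecordingFn();
        fn.setup();
        for (List<String> bundle : bundles) {
            fn.startBundle();
            for (String element : bundle) {
                fn.processElement(element);
            }
            fn.finishBundle();
        }
        fn.teardown();
        return fn.calls;
    }

    public static void main(String[] args) {
        // Two bundles: ["a", "b"] and ["c"].
        System.out.println(run(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"))));
    }
}
```

With no bundles at all, the recorded sequence is just setup followed by teardown, which matches the "zero to many times" wording in the question.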


A DoFn is per pipeline, in the sense that it is "attached" to a pipeline.

Regards
JB

On 10/16/2017 07:31 AM, Jacob Marble wrote:

(there might be documentation on this that I didn't find; if so a link is
sufficient)


Good evening, this is just a check on my understanding. It looks like an 
instance of a given DoFn goes through this lifecycle. Am I correct?


- constructor
- @Setup (once)
   - @StartBundle (zero to many times)
     - @ProcessElement (zero to many times)
   - @FinishBundle
- @Teardown (once)

Can any of these steps be called concurrently? (I believe no)
Can one worker execute multiple instances of a DoFn? (I believe yes)

Thank you,

Jacob


--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com