I would go with creating JIRAs and PRs directly since this doesn't seem to be contentious since you have received feedback from a few folks and they are all suggesting the same thing.
On Sun, Oct 27, 2019 at 9:27 PM jincheng sun <[email protected]> wrote: > Hi all, > > Thanks a lot for your feedback. It seems that we have reached consensus > that both "Approach 2" and "Approach 3" are needed. "Approach 3" is a good > supplement for "Approach 2" and I also prefer "Approach 2" and "Approach 3" > for now. > > Do we need to vote on this discussion or I can create JIRAs and submit the > PRs directly? > > Best, > Jincheng > > Luke Cwik <[email protected]> 于2019年10月26日周六 上午4:01写道: > >> Approach 3 is about caching the bundle descriptor forever but tearing >> down a "live" instance of the DoFns at some SDK chosen arbitrary point in >> time. This way if a future ProcessBundleRequest comes in, a new "live" >> instance can be constructed. >> Approach 2 is still needed so that when the workers are being >> shutdown all the "live" instances are torn down. >> >> On Fri, Oct 25, 2019 at 12:56 PM Robert Burke <[email protected]> wrote: >> >>> Approach 2 isn't incompatible with approach 3. 3 simple sets down >>> convention/configuration for the conditions when the SDK will do this after >>> process bundle has completed. >>> >>> >>> >>> On Fri, Oct 25, 2019, 12:34 PM Robert Bradshaw <[email protected]> >>> wrote: >>> >>>> I think we'll still need approach (2) for when the pipeline finishes >>>> and a runner is tearing down workers. >>>> >>>> On Fri, Oct 25, 2019 at 10:36 AM Maximilian Michels <[email protected]> >>>> wrote: >>>> > >>>> > Hi Jincheng, >>>> > >>>> > Thanks for bringing this up and capturing the ideas in the doc. >>>> > >>>> > Intuitively, I would have also considered adding a new Proto message >>>> for >>>> > the teardown, but I think the idea to trigger this logic when the SDK >>>> > Harness evicts process bundle descriptors is more elegant. >>>> > >>>> > Thanks, >>>> > Max >>>> > >>>> > On 25.10.19 17:23, Luke Cwik wrote: >>>> > > I like approach 3 since it doesn't add additional complexity to the >>>> API >>>> > > and individual SDKs can choose to implement any clean-up strategy >>>> they >>>> > > want or none at all which is the simplest. >>>> > > >>>> > > On Thu, Oct 24, 2019 at 8:46 PM jincheng sun < >>>> [email protected] >>>> > > <mailto:[email protected]>> wrote: >>>> > > >>>> > > Hi, >>>> > > >>>> > > Thanks for your comments in doc, I have add Approach 3 which you >>>> > > mentioned! @Luke >>>> > > >>>> > > For now, we should do a decision for Approach 3 and Approach 1. >>>> > > Detail can be found in doc [1] >>>> > > >>>> > > Welcome anyone's feedback :) >>>> > > >>>> > > Regards, >>>> > > Jincheng >>>> > > >>>> > > [1] >>>> > > >>>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing >>>> > > >>>> > > jincheng sun <[email protected] >>>> > > <mailto:[email protected]>> 于2019年10月25日周五 上午10:40写道: >>>> > > >>>> > > Hi, >>>> > > >>>> > > Functionally capable of `abort`, but it will be called at >>>> the >>>> > > end of operator. So, I prefer `dispose` semantics. i.e., all >>>> > > normal logic has been executed. >>>> > > >>>> > > Best, >>>> > > Jincheng >>>> > > >>>> > > Harsh Vardhan <[email protected] <mailto: >>>> [email protected]>> >>>> > > 于2019年10月23日周三 上午12:14写道: >>>> > > >>>> > > Would approach 1 be akin to abort semantics? >>>> > > >>>> > > On Mon, Oct 21, 2019 at 8:01 PM jincheng sun >>>> > > <[email protected] <mailto: >>>> [email protected]>> >>>> > > wrote: >>>> > > >>>> > > Hi Luke, >>>> > > >>>> > > Thanks a lot for your reply. Since it allows to >>>> share >>>> > > one SDK harness between multiple executable stages, >>>> the >>>> > > control service termination may occur much later >>>> than >>>> > > the completion of an executable stage. This is the >>>> main >>>> > > reason I prefer runners to control the teardown of >>>> DoFns. >>>> > > >>>> > > Regarding to "SDK harnesses can terminate instances >>>> any >>>> > > time they want and start new instances anytime as >>>> > > well.", personally I think it's not conflict with >>>> the >>>> > > proposed Approach 1 as the SDK harness could decide >>>> what >>>> > > to do when receiving the teardown request. It could >>>> do >>>> > > nothing if the DoFns has already been teared down >>>> and >>>> > > could also tear down the DoFns if needed. >>>> > > >>>> > > What do you think? >>>> > > >>>> > > Best, >>>> > > Jincheng >>>> > > >>>> > > Luke Cwik <[email protected] <mailto: >>>> [email protected]>> >>>> > > 于2019年10月22日周二 上午2:05写道: >>>> > > >>>> > > Approach 2 is currently the suggested >>>> approach[1] >>>> > > for DoFn's to shutdown. >>>> > > Note that SDK harnesses can terminate instances >>>> any >>>> > > time they want and start new instances anytime >>>> as well. >>>> > > >>>> > > Why do you want to expose this logic so that >>>> Runners >>>> > > could control it? >>>> > > >>>> > > 1: >>>> > > >>>> https://docs.google.com/document/d/1n6s3BOxOPct3uF4UgbbI9O9rpdiKWFH9R6mtVmR7xp0/edit# >>>> > > >>>> > > On Mon, Oct 21, 2019 at 4:27 AM jincheng sun >>>> > > <[email protected] >>>> > > <mailto:[email protected]>> wrote: >>>> > > >>>> > > Hi, >>>> > > I found that in `SdkHarness` do not stop >>>> the >>>> > > `SdkWorker` when finish. We should add the >>>> > > logic for stop the `SdkWorker` in >>>> `SdkHarness`. >>>> > > More detail can be found [1]. >>>> > > >>>> > > There are two approaches to solve this >>>> issue: >>>> > > >>>> > > Approach 1: We can add a Fn API for >>>> teardown >>>> > > purpose and the runner will teardown a >>>> specific >>>> > > bundle descriptor via this teardown Fn API >>>> > > during disposing. >>>> > > Approach 2: The control service termination >>>> > > could be seen as a signal and once SDK >>>> harness >>>> > > receives this signal, the teardown of the >>>> bundle >>>> > > descriptor will be performed. >>>> > > >>>> > > More detail can be found in [2]. >>>> > > >>>> > > As the Approach 2, SDK harness could be >>>> shared >>>> > > between multiple executable stages. The >>>> control >>>> > > service termination only occurs when all the >>>> > > executable stages sharing the same SDK >>>> harness >>>> > > finished. This means that the teardown of >>>> DoFns >>>> > > may not be executed immediately after an >>>> > > executable stage is finished. >>>> > > >>>> > > So, I prefer Approach 1. Welcome any >>>> feedback :) >>>> > > >>>> > > Best, >>>> > > Jincheng >>>> > > >>>> > > [1] >>>> > > >>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/worker/sdk_worker.py >>>> > > [2] >>>> > > >>>> https://docs.google.com/document/d/1sCgy9VQPf9zVXKRquK8P6N4x7aB62GEO8ozkujRSHZg/edit?usp=sharing >>>> > > >>>> > > -- >>>> > > >>>> > > Got feedback? go/harsh-feedback >>>> <https://goto.google.com/harsh-feedback> >>>> > > <https://goto.google.com/harsh-feedback> >>>> > > >>>> >>>
