I agree Reuven. But leaking in a source doesnt give any guarantee regarding
the execution since it will depends the runner and current API will not
provide you that feature. Using a reference counting state can work better
but would require a sdf migration (and will hit runner support issues :().


Le jeu. 2 août 2018 05:39, Reuven Lax <[email protected]> a écrit :

> Hi Romain,
>
> Andrew's example actually wouldn't work for that. With Google Cloud
> Pub/Sub (the example source he referenced), if there is no subscription to
> a topic, all publishes to that topic are dropped on the floor; if you don't
> want to lose data, your are expected to keep the subscription around
> continuously. In this example, leaking a subscription is probably
> preferable to losing date (especially since Pub/Sub itself garbage collects
> subscriptions that have been inactive for a long time).
>
> The answer might be that Beam does not have a good lifecycle story here,
> and something needs to be built.
>
> Reuven
>
> On Tue, Jul 31, 2018 at 10:04 PM Romain Manni-Bucau <[email protected]>
> wrote:
>
>> Hi Andrew,
>>
>> IIRC sources should clean up their resources per method since they dont
>> have a better lifecycle. Readers can create anything longer and release it
>> at close time.
>>
>>
>> Le mer. 1 août 2018 00:31, Andrew Pilloud <[email protected]> a écrit :
>>
>>> Some of our IOs create external resources that need to be cleaned up
>>> when a pipeline is terminated. It looks like the
>>> org.apache.beam.sdk.io.UnboundedSource interface is called on creation, but
>>> there is no call for cleanup. For example, PubsubIO creates a Pubsub
>>> subcription in createReader()/split() and it should be deleted at shutdown.
>>> Does anyone have ideas on how I might make this happen?
>>>
>>> (I filed https://issues.apache.org/jira/browse/BEAM-5051 tracking the
>>> PubSub specific issue.)
>>>
>>> Andrew
>>>
>>

Reply via email to