We should be using PTransform Labels (aka Names), not ids, for naming
state. This is why the names must line up when doing, for example, a
Dataflow update operation with Stateful DoFns.

(And, yes, if the user does not specify the transform name, and it is
autogenerated differently, this will be an error. This is why we throw
exceptions in the SDK if a name is re-used rather than just appending
a counter or similar.)
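
To make the scoping concrete, here is a minimal sketch in plain Python (no Beam imports) of how a runner might key state by (transform name, state id). StateRegistry, register_transform and cell are hypothetical names made up for this illustration, not Beam or runner APIs.

```python
class StateRegistry:
    """Toy runner-side store keyed by (transform name, state id). Hypothetical."""

    def __init__(self):
        self._names = set()
        self._cells = {}

    def register_transform(self, name):
        # Reject a re-used name instead of appending a counter, so that the
        # name -> state mapping stays stable when the pipeline is updated.
        if name in self._names:
            raise ValueError("transform name already used: %r" % name)
        self._names.add(name)

    def cell(self, name, state_id):
        # State is scoped to the transform: the same state id under two
        # different transform names yields two independent cells.
        if name not in self._names:
            raise KeyError(name)
        return self._cells.setdefault((name, state_id), [])


registry = StateRegistry()
registry.register_transform("CountPerKey")
registry.register_transform("DedupPerKey")

# Both DoFns declare a state id "buffer", but the cells do not collide.
registry.cell("CountPerKey", "buffer").append(1)
count_buffer = registry.cell("CountPerKey", "buffer")
dedup_buffer = registry.cell("DedupPerKey", "buffer")
```

With this keying, an updated pipeline finds its old state only if the transform names are unchanged, which is why an autogenerated name that shifts between versions is a problem.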


On Fri, Aug 21, 2020 at 4:12 PM Ke Wu <[email protected]> wrote:
>
> If the user does not explicitly specify a transform name, in which case an 
> autogenerated name will be used when generating the unique id, does that mean 
> the id could change when the pipeline changes, such as when adding extra 
> transforms?
>
> On Aug 21, 2020, at 11:43 AM, Luke Cwik <[email protected]> wrote:
>
> The DoFn is associated with a PTransform and in the pipeline proto there is a 
> unique id associated with each PTransform. You can use that to generate a 
> composite key (ptransformid, stateid) which will be unique within the 
> pipeline.
>
> On Fri, Aug 21, 2020 at 11:26 AM Ke Wu <[email protected]> wrote:
>>
>> Thank you, Reuven, for the confirmation. Do you know what the recommended 
>> way is for underlying runners to distinguish the same state id in different 
>> DoFn(s)?
>>
>> On Aug 21, 2020, at 10:27 AM, Reuven Lax <[email protected]> wrote:
>>
>> StateId is scoped to the DoFn. You can use the same string in different 
>> DoFns for completely different states.
>>
>> On Fri, Aug 21, 2020 at 10:21 AM Ke Wu <[email protected]> wrote:
>>>
>>> Hello everyone,
>>>
>>> After reading through Stateful processing with Apache Beam and 
>>> DoFn.StateId, I understand that each state id must be unique and must have 
>>> the same type, at least within the same DoFn. However, it is not explicitly 
>>> stated whether declaring the same state id in different DoFn(s) is expected 
>>> and supported. If yes, is the state supposed to be shared, or is it supposed 
>>> to be completely separate, so that it could even have different types? If 
>>> no, it seems that the validation in the Beam SDK only validates uniqueness 
>>> within the same DoFn.
>>>
>>> Thanks,
>>> Ke
>>
>>
>
