I know about TestStream and I am already using it. What I want to test, for
example, is that the timer callback is called once the watermark passes the
timestamp set on the timer. In this test [1], for instance, I would like to
be able to write something like assert bag_state == None at the end of the
test. Is this possible? Most of the tests in that module return specific
values from timer callbacks and then assert that those values were emitted,
but in a real use case you don't necessarily return values from timer
callbacks.

Another use case is when the timer is set only in specific scenarios: how
can I test what the timer value is?

I hope what I am describing makes sense.

[1]
https://github.com/apache/beam/blob/8e217ea0d1f383ef5033ef507b14d01edf9c67e6/sdks/python/apache_beam/transforms/userstate_test.py#L487

On Wed, Dec 1, 2021 at 7:21 PM Luke Cwik <lc...@google.com> wrote:

> That should have been "TestStream [2, 3, 4]"
>
> On Wed, Dec 1, 2021 at 9:20 AM Luke Cwik <lc...@google.com> wrote:
>
>> There is some good information about testing in the Apache Beam
>> documentation[1] about how you want to test the transforms/pipeline instead
>> of the DoFn.
>>
>> For your use case, TestStream [1, 2, 3] is your best bet combined with
>> the above advice about transform/pipeline level testing. TestStream is used
>> to simulate ingestion of data and allows control of watermark and
>> processing time advancement.
>>
>> 1: https://beam.apache.org/documentation/pipelines/test-your-pipeline/
>> 2: https://beam.apache.org/blog/test-stream/
>> 3:
>> https://medium.com/@asitkovets/testing-in-apache-beam-part-2-stream-2a9950ba2bc7
>> 4:
>> https://github.com/apache/beam/blob/8e217ea0d1f383ef5033ef507b14d01edf9c67e6/sdks/python/apache_beam/transforms/deduplicate_test.py#L109
>>
>>
>> On Wed, Dec 1, 2021 at 1:07 AM Tudor Plugaru <tu...@gorgias.com> wrote:
>>
>>> Hi,
>>> What is the best approach to unit testing a stateful DoFn? I've looked
>>> over userstate_test.py in the Beam repo, but those examples do not really
>>> apply to our case. In those tests, the DoFns used for testing return
>>> values from timer callbacks, which does not really happen in practice.
>>> I am more interested in testing whether a timer was triggered after the
>>> watermark advanced, or what the state bag contains at a specific time.
>>>
>>> Actually it would really be nice to have some kind of documentation
>>> regarding testing and best practices in writing unit/integration tests for
>>> Beam pipelines.
>>>
>>> Thanks,
>>> Tudor
>>>
>>
