WatermarkReporterParam is about reporting the watermark. The main usecase
is for SplittableDoFns to be able to report the data watermark.

The watermark is per input and output of a DoFn. Also each bundle being
processed has its local watermarks while the runner computes the global
watermark. The runners watermark could be per key, or key range or global
across all keys.

There is no runner agnostic way to read the watermark today. Is there a
usecase you are targeting that would help from having access to the
watermark (also, which watermark?)?


On Tue, Apr 9, 2019 at 1:28 PM Pablo Estrada <[email protected]> wrote:

> I am experimenting with state / timers in Python. As I look at the
> DoFnProcessParams[1], I see that it's possible for a DoFn to receive
> several arguments (e.g. Timers, Side Inputs, etc). Also the Watermark via
> WatermarkReporterParam.
>
> I see that this parameter is not handled by runners when filling up the
> arguments for a DoFn[2][3]. So, as far as I can tell, DoFns are not
> currently able to get the watermark.
>
> Is this a bug, or is it intentional? Perhaps there's another way to find
> out the watermark for a DoFn?
>
> Best
> -P.
>
> [1]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L381-L390
>
> [2]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L477-L488
> [3]
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L605-L620
>

Reply via email to