Yes, those are the ones I was referring to.

On Tue, Apr 9, 2019 at 4:08 PM Pablo Estrada <[email protected]> wrote:

> Yup! : ) - I think
>
> On Tue, Apr 9, 2019 at 3:52 PM Brian Hulette <[email protected]> wrote:
>
>> Are these the blog posts?
>>
>> https://beam.apache.org/blog/2017/02/13/stateful-processing.html
>> https://beam.apache.org/blog/2017/08/28/timely-processing.html
>>
>> On Tue, Apr 9, 2019 at 3:41 PM Pablo Estrada <[email protected]> wrote:
>>
>>> soooounds good. Thanks guys <3
>>>
>>> On Tue, Apr 9, 2019 at 3:19 PM Lukasz Cwik <[email protected]> wrote:
>>>
>>>> UnboundedSources and SplittableDoFns report watermarks which the runner
>>>> uses to compute how much the watermark could advance if it processed some
>>>> outstanding work. But it is always upto the runner to choose when the
>>>> watermark advances. The runner could process each work item in watermark
>>>> priority order and advance the watermark in small increments or could
>>>> process many work items and then advance the watermark a lot. (Note that
>>>> the BoundedSources API doesn't allow for reporting the watermark and it
>>>> starts at Beam's concept of START OF TIME and advances in one step to
>>>> Beam's concept of END OF TIME).
>>>>
>>>> You might be able to write what you want with an event based timer.
>>>> Kenn wrote (2?) blog posts on state and timers that have some pretty good
>>>> explanations and examples.
>>>>
>>>> On Tue, Apr 9, 2019 at 2:27 PM Pablo Estrada <[email protected]>
>>>> wrote:
>>>>
>>>>> hi Luke,
>>>>> thanks for the prompt reply: )
>>>>>
>>>>> That makes sense. I think I'll go back to my cave to read a bunch
>>>>> about streaming. : )
>>>>>
>>>>> I was looking for this to try to write a sequence generator for Python
>>>>> in streaming, and I was trying to debug what was going on. I was trying to
>>>>> allow the DoFn to receive a watermark reported by the upstream source. 
>>>>> (...
>>>>> does that answer "which watermark?"... I am not sure that it does... but
>>>>> maybe..).
>>>>>
>>>>> Do you think it's a reasonable use case for DoFns to know what the
>>>>> upstream watermark is?
>>>>> I hope that makes at least a some sense... : )
>>>>>
>>>>> If it doesn't make sense, feel free to ignore, and I'll go do my
>>>>> readings.
>>>>> Thanks!
>>>>> -P.
>>>>>
>>>>> On Tue, Apr 9, 2019 at 1:44 PM Lukasz Cwik <[email protected]> wrote:
>>>>>
>>>>>> WatermarkReporterParam is about reporting the watermark. The main
>>>>>> usecase is for SplittableDoFns to be able to report the data watermark.
>>>>>>
>>>>>> The watermark is per input and output of a DoFn. Also each bundle
>>>>>> being processed has its local watermarks while the runner computes the
>>>>>> global watermark. The runners watermark could be per key, or key range or
>>>>>> global across all keys.
>>>>>>
>>>>>> There is no runner agnostic way to read the watermark today. Is there
>>>>>> a usecase you are targeting that would help from having access to the
>>>>>> watermark (also, which watermark?)?
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 9, 2019 at 1:28 PM Pablo Estrada <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> I am experimenting with state / timers in Python. As I look at the
>>>>>>> DoFnProcessParams[1], I see that it's possible for a DoFn to receive
>>>>>>> several arguments (e.g. Timers, Side Inputs, etc). Also the Watermark 
>>>>>>> via
>>>>>>> WatermarkReporterParam.
>>>>>>>
>>>>>>> I see that this parameter is not handled by runners when filling up
>>>>>>> the arguments for a DoFn[2][3]. So, as far as I can tell, DoFns are not
>>>>>>> currently able to get the watermark.
>>>>>>>
>>>>>>> Is this a bug, or is it intentional? Perhaps there's another way to
>>>>>>> find out the watermark for a DoFn?
>>>>>>>
>>>>>>> Best
>>>>>>> -P.
>>>>>>>
>>>>>>> [1]
>>>>>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/core.py#L381-L390
>>>>>>>
>>>>>>> [2]
>>>>>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L477-L488
>>>>>>> [3]
>>>>>>> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L605-L620
>>>>>>>
>>>>>>

Reply via email to