ea of the new DoFn is that it is easy to do the
>>>>> construction-time analysis and reject the invalid pipeline. It is actually
>>>>> runn
>>>> >> >>>
>>>> >>>>>>>> >> >>> I am in the camp where we should only support merging
>>>> state (either naturally via things like bags or via combiners). I believe
>>>> that having the wrapper that Brian sugges
occur. I also like treating everything like a combiner because it
>>> will give us a lot of reuse of combiner implementations across all the places
>>> they could be used and will be especially useful when w
just assumed
>> there wasn't anything to be concerned about. So it was shocking to learn
>> that there is this dangerous edge-case.
>> >>>
>>> >> >>>>
> >>>>>>>> >> >>>> On Fri, Apr 26, 2019 at 2:29 AM Robert Bradshaw <
> rober...@google.com> wrote:
> >>>>>>>> >> >>>>>
> >>>>>>>>
bout
>>>>>>>> >> >>>> and I just assumed there wasn't anything to be concerned
>>>>>>>> >> >>>> about. So it was shocking to learn that there is this
>>>>>>>> >> >>>>
> us a lot of reuse of combiner implementations across all the places they
>>>>>>>> could
>>>>>>>> be used and will be especially useful when we start exposin
>>>>>> towards ValueState since it was easy to think about and I just assumed
>>>>>>> there wasn't anything to be concerned about. So it was shocking to learn
>>>>>>> that there is this dangerous edge-case.
>>>>>>> >> >>>>
>>>>>>> more than the merging as well. As a new Beam user, I immediately
>>>>>>> gravitated
>>>>>>> towards ValueState since it was easy to think about and I just assumed
>>>>>
> >> >>>> Brian
>>>>>> >> >>>>
>>>>>> >> >>>>
>>>>>> >> >>>>
>>>>>> >> >>>> On Fri, Apr 26, 2019 at 2:29 AM Robert Bradshaw <
>>>>>> rober...@goo
>>> that
>>>>> >> >>>>> would be returned when the timer fires. (It's in the FnAPI,
>>>>> but not
>>>>> >> >>>>> the SDKs yet.)
>>>>> >> >>>>>
>>>>> >> >>
>> >> >>>>> > But I've come to feel there is a mismatch. On the one hand,
>>>> ParDo() is a way to drop to a lower level and write logic
>>>> that does not fit a more general computational pattern, re
more
>>> direct control over how state from windows gets merged. And of course we
>>> don't even have a design for timers - you would need some kind of timestamp
>>> CombineFn but I think setting/unsetting timers manually makes more sense.
>>> Especially considering the trickiness around merging windows in t
>> continuum (the indexing example falling towards the high end).
>> >> >>>>>
>> >> >>>>> Actually, the merging questions bother me less than how easy it
>> is to
>> >> >>>>> accidentally clobber previous values.
> On Thu, Apr 25, 2019 at 5:49 PM Reza Rokni
> wrote:
> >> >>>>> >>
> >> >>>>> >> +1 on the metadata use case.
> >> >>>>> >>
> >> >>>>> >> For performance reasons the Timer API
>> - Robert
>> >>>>>
>> >>>>> > On Thu, Apr 25, 2019 at 5:49 PM Reza Rokni wrote:
>> >>>>> >>
>> >>>>> >> +1 on the metadata use case.
>> >>>>
> >>
> >>>>> >> On Fri, 26 Apr 2019 at 00:38, Reuven Lax
> wrote:
> >>>>> >>>
> >>>>> >>> I see examples of people using ValueState that I think are not
> captured by CombiningState. For example, one common one is users who set
> On Fri, 26 Apr 2019 at 00:38, Reuven Lax wrote:
>>>>> >>>
>>>>> >>> I see examples of people using ValueState that I think are not
>>>>> >>> captured by CombiningState. For example, one common one is users who set
>>>>> >>> a
people using ValueState that I think are not
>>>>> captured by CombiningState. For example, one common one is users who set a
>>>>> timer and then record the timestamp of that timer in a ValueState. In
>>>>> general when you store state that is metada
re state that is metadata about other state you store,
>>>> then ValueState will usually make more sense than CombiningState.
>>>> >>>
>>>> >>> On Thu, Apr 25, 2019 at 9:32 AM Brian Hulette
>>>> wrote:
>>>> >>>>
>>>
:
>>> >>>>
>>> >>>> Currently the Python SDK does not make ValueState available to
>>> users. My initial inclination was to go ahead and implement it there to be
>>> consistent with Java, but Robert bring
>> ValueState has an inherent race condition for out-of-order data, and a lot
>> of its use cases can actually be implemented with a CombiningState instead.
>> >>>>
>> >>>> It seems to me that at the very least we should discourage the use
>>
> ValueState by noting the danger in the documentation and preferring
> CombiningState in examples, and perhaps we should go further and deprecate
> it in Java and not implement it in Python. Either way I think we should be
> consistent between Java and Python.
> >>>>
>
>>>> CombiningState in examples, and perhaps we should go further and deprecate
>>>> it in Java and not implement it in Python. Either way I think we should be
>>>> consistent between Java and Python.
>>>>
>>>> I'm curious what people think ab
m curious what people think about this, are there use cases that we
>>> really need to keep ValueState around for?
>>>
>>> Brian
>>>
>>> -- Forwarded message -
>>> From: Robert Bradshaw
>>> Date: Thu, Apr 25, 2019,
need to keep ValueState around for?
>>
>> Brian
>>
>> -- Forwarded message -
>> From: Robert Bradshaw
>> Date: Thu, Apr 25, 2019, 08:31
>> Subject: Re: [docs] Python State & Timers
>> To: dev
>>
>>
>>
>>
his, are there use cases that we
> really need to keep ValueState around for?
>
> Brian
>
> -- Forwarded message -
> From: Robert Bradshaw
> Date: Thu, Apr 25, 2019, 08:31
> Subject: Re: [docs] Python State & Timers
> To: dev
>
>
>
>
> On
019, 08:31
Subject: Re: [docs] Python State & Timers
To: dev
On Thu, Apr 25, 2019, 5:26 PM Maximilian Michels wrote:
> Completely agree that CombiningState is nicer in this example. Users may
> still want to use ValueState when there is nothing to combine.
I've always had t
Thanks for the great discussion on this! I hadn't even thought about the
potential race condition for ValueState. It does seem like we should be
encouraging people to use a CombiningState if at all possible.
It sounds like this thread is resolved (we know how to add python examples
to the stateful
On Thu, Apr 25, 2019, 5:26 PM Maximilian Michels wrote:
> Completely agree that CombiningState is nicer in this example. Users may
> still want to use ValueState when there is nothing to combine.
I've always had trouble coming up with any good examples of this.
Also,
> users already know Value
Completely agree that CombiningState is nicer in this example. Users may
still want to use ValueState when there is nothing to combine. Also,
users already know ValueState from the Java SDK.
On 25.04.19 17:12, Robert Bradshaw wrote:
On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels wrote:
On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels wrote:
>
> I forgot to give an example, just to clarify for others:
>
> > What was the specific example that was less natural?
>
> Basically every time we use ListState to express ValueState, e.g.
>
>next_index, = list(state.read()) or [0]
>
>
I forgot to give an example, just to clarify for others:
What was the specific example that was less natural?
Basically every time we use ListState to express ValueState, e.g.
next_index, = list(state.read()) or [0]
Taken from:
https://github.com/apache/beam/pull/8363/files#diff-ba1a2aed9
https://github.com/apache/beam/pull/8402
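For readers unfamiliar with the idiom quoted above, here is a minimal sketch of what the one-liner does. It is plain Python and does not use the Beam API: FakeBagState and next_index_and_advance are hypothetical names invented for this illustration, and the clear-then-add write path is an assumption about how a single value would be kept in a list-like state, not code taken from the linked PRs.

```python
class FakeBagState:
    """Hypothetical in-memory stand-in for a bag/list state cell (not a Beam class)."""

    def __init__(self):
        self._items = []

    def read(self):
        return iter(self._items)

    def add(self, item):
        self._items.append(item)

    def clear(self):
        self._items = []


def next_index_and_advance(state):
    """Read the single stored index (default 0), then store index + 1."""
    # The trailing comma unpacks exactly one element and raises ValueError
    # otherwise; `or [0]` supplies the default while the state is empty.
    next_index, = list(state.read()) or [0]
    # Emulate overwrite semantics: drop the old element before adding the new one.
    state.clear()
    state.add(next_index + 1)
    return next_index


state = FakeBagState()
print([next_index_and_advance(state) for _ in range(3)])
```

The read side needs both the default and the single-element unpacking, which is the part described as less natural than a dedicated value read.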
On Thu, Apr 25, 2019 at 4:26 PM Robert Bradshaw wrote:
>
> Oh, this is for the indexing example.
>
> I actually think using CombiningState is cleaner than ValueState.
>
> https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runne
Oh, this is for the indexing example.
I actually think using CombiningState is cleaner than ValueState.
https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py#L262
(The fact that one must specify the accumulator coder is, however
The desire was to avoid the implicit disallowed combination wart in
Python (until we could make sense of it), and also ValueState could be
surprising with respect to older values overwriting newer ones. What
was the specific example that was less natural?
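The concern about older values overwriting newer ones can be sketched roughly as follows. This is a plain-Python simulation, not Beam code: the (timestamp, value) event tuples and both function names are assumptions made for illustration. The first function models overwrite-on-write semantics, the second models a combiner that keeps the value with the largest timestamp.

```python
def overwrite(events):
    """Overwrite semantics: each write replaces whatever was stored."""
    stored = None
    for _timestamp, value in events:
        stored = value
    return stored


def latest_by_timestamp(events):
    """Combiner-like semantics: keep the pair with the largest timestamp."""
    accumulator = None
    for event in events:
        if accumulator is None or event[0] > accumulator[0]:
            accumulator = event
    return None if accumulator is None else accumulator[1]


in_order = [(1, "a"), (2, "b"), (3, "c")]
late_arrival = [(1, "a"), (3, "c"), (2, "b")]  # timestamp 2 arrives last

for name, events in [("in order", in_order), ("late arrival", late_arrival)]:
    print(name, overwrite(events), latest_by_timestamp(events))
```

With the late arrival, the overwrite version ends up holding the value from timestamp 2, while the timestamp-aware combiner gives the same answer for both arrival orders.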
On Thu, Apr 25, 2019 at 3:01 PM Maximilian
@Pablo: Thanks for following up with the PR! :)
@Brian: I was wondering about this as well. It makes the Python state
code a bit unnatural. I'd suggest adding a ValueState wrapper around
ListState/CombiningState.
@Robert: Like Reuven pointed out, we can disallow ValueState for merging
window
Pablo, Kenneth and I have a new blog post ready for publication which covers
how to create a "looping timer". It allows default values to be created in a
window when no incoming elements exist. We just need to clear a few bits
before publication, but would be great to have that also include a python
Well state is still not implemented for merging windows even for Java
(though I believe the idea was to disallow ValueState there).
On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw wrote:
> It was unclear what the semantics were for ValueState for merging
> windows. (It's also a bit weird as it's
It was unclear what the semantics were for ValueState for merging
windows. (It's also a bit weird as it's inherently a race condition
wrt element ordering, unlike Bag and CombineState, though you can
always implement it as a CombineState that always returns the latest
value which is a bit more expl
That's a great idea! I thought about this too after those posts came up on
the list recently. I started to look into it, but I noticed that there's
actually no implementation of ValueState in userstate. Is there a reason
for that? I started to work on a patch to add it but I was just curious if
the
It would probably be pretty easy to add the corresponding code snippets to the
docs as well.
It's probably a bit more work because there is no section dedicated to
state/timer yet in the documentation. Tracked here:
https://jira.apache.org/jira/browse/BEAM-2472
I've been going over this to
I've been going over this topic a bit. I'll add the snippets next week, if
that's fine by y'all.
Best
-P.
On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw wrote:
> That's a great idea! It would probably be pretty easy to add the
> corresponding code snippets to the docs as well.
>
> On Thu, Apr 1
That's a great idea! It would probably be pretty easy to add the
corresponding code snippets to the docs as well.
On Thu, Apr 11, 2019 at 2:00 PM Maximilian Michels wrote:
>
> Hi everyone,
>
> The Python SDK still lacks documentation on state and timers.
>
> As a first step, what do you think abo
Hi everyone,
The Python SDK still lacks documentation on state and timers.
As a first step, what do you think about updating these two blog posts
with the corresponding Python code?
https://beam.apache.org/blog/2017/02/13/stateful-processing.html
https://beam.apache.org/blog/2017/08/28/timely