I see uses of ValueState that I don't think are captured by
CombiningState. A common one is a user who sets a timer and then
records that timer's timestamp in a ValueState. In general, when the
state you store is metadata about other state you store, ValueState
will usually make more sense than CombiningState.
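
To make the trade-off concrete, here is a minimal sketch in plain Python.
It is not Beam code: ValueCell, LatestByTimestamp, and set_timer are
hypothetical stand-ins I made up for illustration. It shows the
out-of-order overwrite that a last-write-wins value allows, a combiner
that keeps the latest value explicitly, and the timer-metadata case where
a plain value is the natural fit.

```python
# Illustrative stand-ins only; these classes are NOT the Beam API.


class ValueCell:
    """Last-write-wins cell, mimicking ValueState-style read/write."""

    def __init__(self):
        self._value = None

    def read(self):
        return self._value

    def write(self, value):
        self._value = value


class LatestByTimestamp:
    """Combiner-style cell that keeps the value with the greatest timestamp."""

    def __init__(self):
        self._acc = None  # (timestamp, value), or None when empty

    def add(self, timestamp, value):
        if self._acc is None or timestamp > self._acc[0]:
            self._acc = (timestamp, value)

    def read(self):
        return None if self._acc is None else self._acc[1]


# Out-of-order arrival: the element stamped 20 shows up before the one stamped 10.
events = [(20, "new"), (10, "old")]

cell = ValueCell()
latest = LatestByTimestamp()
for ts, value in events:
    cell.write(value)      # the older element overwrites the newer one
    latest.add(ts, value)  # the older element is ignored

# Metadata case: remember when the (hypothetical) timer is set to fire.
# There is nothing to combine here; the most recent write is the answer.
timer_ts = ValueCell()


def set_timer(fire_at):
    # A real DoFn would also set the timer itself; only the bookkeeping is shown.
    timer_ts.write(fire_at)


set_timer(100)
set_timer(160)  # resetting the timer simply replaces the recorded timestamp
```

In the first case the plain cell ends up holding the older element's
value while the combiner keeps the newer one. In the second case
overwriting is exactly the intent, which is why a plain value reads more
naturally there.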

On Thu, Apr 25, 2019 at 9:32 AM Brian Hulette <bhule...@google.com> wrote:

> Currently the Python SDK does not make ValueState available to users. My
> initial inclination was to go ahead and implement it there to be consistent
> with Java, but Robert brings up a great point here that ValueState has an
> inherent race condition for out-of-order data, and a lot of its use cases
> can actually be implemented with a CombiningState instead.
>
> It seems to me that at the very least we should discourage the use of
> ValueState by noting the danger in the documentation and preferring
> CombiningState in examples, and perhaps we should go further and deprecate
> it in Java and not implement it in Python. Either way I think we should be
> consistent between Java and Python.
>
> I'm curious what people think about this: are there use cases that we
> really need to keep ValueState around for?
>
> Brian
>
> ---------- Forwarded message ---------
> From: Robert Bradshaw <rober...@google.com>
> Date: Thu, Apr 25, 2019, 08:31
> Subject: Re: [docs] Python State & Timers
> To: dev <dev@beam.apache.org>
>
>
>
>
> On Thu, Apr 25, 2019, 5:26 PM Maximilian Michels <m...@apache.org> wrote:
>
>> Completely agree that CombiningState is nicer in this example. Users may
>> still want to use ValueState when there is nothing to combine.
>
>
> I've always had trouble coming up with any good examples of this.
>
> Also,
>> users already know ValueState from the Java SDK.
>>
>
> Maybe we should deprecate that :)
>
>
> On 25.04.19 17:12, Robert Bradshaw wrote:
>> > On Thu, Apr 25, 2019 at 4:58 PM Maximilian Michels <m...@apache.org>
>> wrote:
>> >>
>> >> I forgot to give an example, just to clarify for others:
>> >>
>> >>> What was the specific example that was less natural?
>> >>
>> >> Basically every time we use ListState to express ValueState, e.g.
>> >>
>> >>     next_index, = list(state.read()) or [0]
>> >>
>> >> Taken from:
>> >>
>> https://github.com/apache/beam/pull/8363/files#diff-ba1a2aed98079ccce869cd660ca9d97dR301
>> >
>> > Yes, ListState is much less natural here. I think generally
>> > CombiningValue is often a better replacement. E.g. the Java example
>> > reads
>> >
>> >
>> > public void processElement(
>> >        ProcessContext context, @StateId("index") ValueState<Integer>
>> index) {
>> >      int current = firstNonNull(index.read(), 0);
>> >      context.output(KV.of(current, context.element()));
>> >      index.write(current+1);
>> > }
>> >
>> >
>> > which is replaced with bag state
>> >
>> >
>> > def process(self, element, state=DoFn.StateParam(INDEX_STATE)):
>> >      next_index, = list(state.read()) or [0]
>> >      yield (element, next_index)
>> >      state.clear()
>> >      state.add(next_index + 1)
>> >
>> >
>> > whereas CombiningState would be more natural (than ListState, and
>> > arguably than even ValueState), giving
>> >
>> >
>> > def process(self, element, index=DoFn.StateParam(INDEX_STATE)):
>> >      yield element, index.read()
>> >      index.add(1)
>> >
>> >
>> >
>> >
>> >>
>> >> -Max
>> >>
>> >> On 25.04.19 16:40, Robert Bradshaw wrote:
>> >>> https://github.com/apache/beam/pull/8402
>> >>>
>> >>> On Thu, Apr 25, 2019 at 4:26 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>> >>>>
>> >>>> Oh, this is for the indexing example.
>> >>>>
>> >>>> I actually think using CombiningState is cleaner than
>> ValueState.
>> >>>>
>> >>>>
>> https://github.com/apache/beam/blob/release-2.12.0/sdks/python/apache_beam/runners/portability/fn_api_runner_test.py#L262
>> >>>>
>> >>>> (The fact that one must specify the accumulator coder is, however,
>> >>>> unfortunate. We should probably infer that if we can.)
>> >>>>
>> >>>> On Thu, Apr 25, 2019 at 4:19 PM Robert Bradshaw <rober...@google.com>
>> wrote:
>> >>>>>
>> >>>>> The desire was to avoid the implicit disallowed combination wart in
>> >>>>> Python (until we could make sense of it), and also ValueState could
>> be
>> >>>>> surprising with respect to older values overwriting newer ones. What
>> >>>>> was the specific example that was less natural?
>> >>>>>
>> >>>>> On Thu, Apr 25, 2019 at 3:01 PM Maximilian Michels <m...@apache.org>
>> wrote:
>> >>>>>>
>> >>>>>> @Pablo: Thanks for following up with the PR! :)
>> >>>>>>
>> >>>>>> @Brian: I was wondering about this as well. It makes the Python
>> state
>> >>>>>> code a bit unnatural. I'd suggest adding a ValueState wrapper
>> around
>> >>>>>> ListState/CombiningState.
>> >>>>>>
>> >>>>>> @Robert: Like Reuven pointed out, we can disallow ValueState for
>> merging
>> >>>>>> windows with state.
>> >>>>>>
>> >>>>>> @Reza: Great. Let's make sure it has Python examples out of the
>> box.
>> >>>>>> Either Pablo or I could help there.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Max
>> >>>>>>
>> >>>>>> On 25.04.19 04:14, Reza Ardeshir Rokni wrote:
>> >>>>>>> Pablo, Kenneth and I have a new blog ready for publication which
>> covers
>> >>>>>>> how to create a "looping timer"; it allows for default values to be
>> >>>>>>> created in a window when no incoming elements exist. We just
>> need to
>> >>>>>>> clear a few bits before publication, but would be great to have
>> that
>> >>>>>>> also include a Python example; I wrote it in Java...
>> >>>>>>>
>> >>>>>>> Cheers
>> >>>>>>>
>> >>>>>>> Reza
>> >>>>>>>
>> >>>>>>> On Thu, 25 Apr 2019 at 04:34, Reuven Lax <re...@google.com
>> >>>>>>> <mailto:re...@google.com>> wrote:
>> >>>>>>>
>> >>>>>>>       Well, state is still not implemented for merging windows
>> even for
>> >>>>>>>       Java (though I believe the idea was to disallow ValueState
>> there).
>> >>>>>>>
>> >>>>>>>       On Wed, Apr 24, 2019 at 1:11 PM Robert Bradshaw <
>> rober...@google.com
>> >>>>>>>       <mailto:rober...@google.com>> wrote:
>> >>>>>>>
>> >>>>>>>           It was unclear what the semantics were for ValueState
>> for merging
>> >>>>>>>           windows. (It's also a bit weird as it's inherently a
>> race condition
>> >>>>>>>           wrt element ordering, unlike Bag and CombineState,
>> though you can
>> >>>>>>>           always implement it as a CombineState that always
>> returns the latest
>> >>>>>>>           value which is a bit more explicit about the dangers
>> here.)
>> >>>>>>>
>> >>>>>>>           On Wed, Apr 24, 2019 at 10:08 PM Brian Hulette
>> >>>>>>>           <bhule...@google.com <mailto:bhule...@google.com>>
>> wrote:
>> >>>>>>>            >
>> >>>>>>>            > That's a great idea! I thought about this too after
>> those
>> >>>>>>>           posts came up on the list recently. I started to look
>> into it,
>> >>>>>>>           but I noticed that there's actually no implementation of
>> >>>>>>>           ValueState in userstate. Is there a reason for that? I
>> started
>> >>>>>>>           to work on a patch to add it but I was just curious if
>> there was
>> >>>>>>>           some reason it was omitted that I should be aware of.
>> >>>>>>>            >
>> >>>>>>>            > We could certainly replicate the example without
>> ValueState
>> >>>>>>>           by using BagState and clearing it before each write,
>> but it
>> >>>>>>>           would be nice if we could draw a direct parallel.
>> >>>>>>>            >
>> >>>>>>>            > Brian
>> >>>>>>>            >
>> >>>>>>>            > On Fri, Apr 12, 2019 at 7:05 AM Maximilian Michels
>> >>>>>>>           <m...@apache.org <mailto:m...@apache.org>> wrote:
>> >>>>>>>            >>
>> >>>>>>>            >> > It would probably be pretty easy to add the
>> corresponding
>> >>>>>>>           code snippets to the docs as well.
>> >>>>>>>            >>
>> >>>>>>>            >> It's probably a bit more work because there is no
>> section
>> >>>>>>>           dedicated to
>> >>>>>>>            >> state/timer yet in the documentation. Tracked here:
>> >>>>>>>            >> https://jira.apache.org/jira/browse/BEAM-2472
>> >>>>>>>            >>
>> >>>>>>>            >> > I've been going over this topic a bit. I'll add
>> the
>> >>>>>>>           snippets next week, if that's fine by y'all.
>> >>>>>>>            >>
>> >>>>>>>            >> That would be great. The blog posts are a great way
>> to get
>> >>>>>>>           started with
>> >>>>>>>            >> state/timers.
>> >>>>>>>            >>
>> >>>>>>>            >> Thanks,
>> >>>>>>>            >> Max
>> >>>>>>>            >>
>> >>>>>>>            >> On 11.04.19 20:21, Pablo Estrada wrote:
>> >>>>>>>            >> > I've been going over this topic a bit. I'll add
>> the
>> >>>>>>>           snippets next week,
>> >>>>>>>            >> > if that's fine by y'all.
>> >>>>>>>            >> > Best
>> >>>>>>>            >> > -P.
>> >>>>>>>            >> >
>> >>>>>>>            >> > On Thu, Apr 11, 2019 at 5:27 AM Robert Bradshaw
>> >>>>>>>           <rober...@google.com <mailto:rober...@google.com>
>> >>>>>>>            >> > <mailto:rober...@google.com <mailto:
>> rober...@google.com>>>
>> >>>>>>>           wrote:
>> >>>>>>>            >> >
>> >>>>>>>            >> >     That's a great idea! It would probably be
>> pretty easy
>> >>>>>>>           to add the
>> >>>>>>>            >> >     corresponding code snippets to the docs as
>> well.
>> >>>>>>>            >> >
>> >>>>>>>            >> >     On Thu, Apr 11, 2019 at 2:00 PM Maximilian
>> Michels
>> >>>>>>>           <m...@apache.org <mailto:m...@apache.org>
>> >>>>>>>            >> >     <mailto:m...@apache.org 
>> >>>>>>> <mailto:m...@apache.org>>>
>> wrote:
>> >>>>>>>            >> >      >
>> >>>>>>>            >> >      > Hi everyone,
>> >>>>>>>            >> >      >
>> >>>>>>>            >> >      > The Python SDK still lacks documentation
>> on state
>> >>>>>>>           and timers.
>> >>>>>>>            >> >      >
>> >>>>>>>            >> >      > As a first step, what do you think about
>> updating
>> >>>>>>>           these two blog
>> >>>>>>>            >> >     posts
>> >>>>>>>            >> >      > with the corresponding Python code?
>> >>>>>>>            >> >      >
>> >>>>>>>            >> >      >
>> >>>>>>>
>> https://beam.apache.org/blog/2017/02/13/stateful-processing.html
>> >>>>>>>            >> >      >
>> >>>>>>>
>> https://beam.apache.org/blog/2017/08/28/timely-processing.html
>> >>>>>>>            >> >      >
>> >>>>>>>            >> >      > Thanks,
>> >>>>>>>            >> >      > Max
>> >>>>>>>            >> >
>> >>>>>>>
>>
>
