Thanks John.

Agree. I like the first approach as well, with StreamsConfig flag passing
by via ProcessorContext.

Another positive effect with "reverse parameters" is that in the case of
`fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide _which_ pair to
flip, whether with `ReadDirection` enum it apply to both.

The only issue I've found while reviewing the KIP is that `all()` won't fit
within this approach.

We could remove it from the KIP and argue that for WindowStore,
`fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse order, and
for KeyValueStore, no ordering guarantees are provided.

If there is consensus with this changes, I will go and update the KIP.

On Thu, May 21, 2020 at 3:33 PM John Roesler <vvcep...@apache.org> wrote:

> Hi Jorge,
>
> Thanks for that idea. I agree, a feature flag would protect anyone
> who may be depending on the current behavior.
>
> It seems better to locate the feature flag in the initialization logic of
> the store, rather than have a method on the "live" store that changes
> its behavior on the fly.
>
> It seems like there are two options here, one is to add a new config:
>
> StreamsConfig.ENABLE_BACKWARDS_ITERATION =
>   "enable.backwards.iteration
>
> Or we can add a feature flag in Materialized, like
>
> Materialized.enableBackwardsIteration()
>
> I think I'd personally lean toward the config, for the following reason.
> The concern that Sophie raised is that someone's program may depend
> on the existing contract of getting an empty iterator. We don't want to
> switch behavior when they aren't expecting it, so we provide them a
> config to assert that they _are_ expecting the new behavior, which
> means they take responsibility for updating their code to expect the new
> behavior.
>
> There doesn't seem to be a reason to offer a choice of behaviors on a
> per-query, or per-store basis. We just want people to be not surprised
> by this change in general.
>
> What do you think?
> Thanks,
> -John
>
> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
> > Thank you both for the great feedback.
> >
> > I like the "fancy" proposal :), and how it removes the need for
> > additional API methods. And with a feature flag on `StateStore`,
> > disabled by default, should no break current users.
> >
> > The only side-effect I can think of is that: by moving the flag upwards,
> > all later operations become affected; which might be ok for most (all?)
> > cases. I can't think of an scenario where this would be an issue, just
> > want to point this out.
> >
> > If moving to this approach, I'd like to check if I got this right before
> > updating the KIP:
> >
> > - only `StateStore` will change by having a new method:
> > `backwardIteration()`, `false` by default to keep things compatible.
> > - then all `*Stores` will have to update their implementation based on
> > this flag.
> >
> >
> > On 20/05/2020 21:02, Sophie Blee-Goldman wrote:
> > >> There's no possibility that someone could be relying
> > >> on iterating over that range in increasing order, because that's not
> what
> > >> happens. However, they could indeed be relying on getting an empty
> > > iterator
> > >
> > > I just meant that they might be relying on the assumption that the
> range
> > > query
> > > will never return results with decreasing keys. The empty iterator
> wouldn't
> > > break that contract, but of course a surprise reverse iterator would.
> > >
> > > FWIW I actually am in favor of automatically converting to a reverse
> > > iterator,
> > > I just thought we should consider whether this should be off by
> default or
> > > even possible to disable at all.
> > >
> > > On Tue, May 19, 2020 at 7:42 PM John Roesler <vvcep...@apache.org>
> wrote:
> > >
> > >> Thanks for the response, Sophie,
> > >>
> > >> I wholeheartedly agree we should take as much into account as possible
> > >> up front, rather than regretting our decisions later. I actually do
> share
> > >> your vague sense of worry, which was what led me to say initially
> that I
> > >> thought my counterproposal might be "too fancy". Sometimes, it's
> better
> > >> to be explicit instead of "elegant", if we think more people will be
> > >> confused
> > >> than not.
> > >>
> > >> I really don't think that there's any danger of "relying on a bug"
> here,
> > >> although
> > >> people certainly could be relying on current behavior. One thing to be
> > >> clear
> > >> about (which I just left a more detailed comment in KAFKA-8159 about)
> is
> > >> that
> > >> when we say something like key1 > key2, this ordering is defined by
> the
> > >> serde's output and nothing else.
> > >>
> > >> Currently, thanks to your fix in
> https://github.com/apache/kafka/pull/6521
> > >> ,
> > >> the store contract is that for range scans, if from > to, then the
> store
> > >> must
> > >> return an empty iterator. There's no possibility that someone could be
> > >> relying
> > >> on iterating over that range in increasing order, because that's not
> what
> > >> happens. However, they could indeed be relying on getting an empty
> > >> iterator.
> > >>
> > >> My counterproposal was to actually change this contract to say that
> the
> > >> store
> > >> must return an iterator over the keys in that range, but in the
> reverse
> > >> order.
> > >> So, in addition to considering whether this idea is "too fancy" (aka
> > >> confusing),
> > >> we should also consider the likelihood of breaking an existing
> program with
> > >> this behavior/contract change.
> > >>
> > >> To echo your clarification, I'm also not advocating strongly in favor
> of my
> > >> proposal. I just wanted to present it for consideration alongside
> Jorge's
> > >> original one.
> > >>
> > >> Thanks for raising these very good points,
> > >> -John
> > >>
> > >> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman wrote:
> > >>>> Rather than working around it, I think we should just fix it
> > >>> Now *that's* a "fancy" idea :P
> > >>>
> > >>> That was my primary concern, although I do have a vague sense of
> worry
> > >>> that we might be allowing users to get into trouble without
> realizing it.
> > >>> For example if their custom serdes suffer a similar bug as the above,
> > >>> and/or
> > >>> they rely on getting results in increasing order (of the keys) even
> when
> > >>> to < from. Maybe they're relying on the fact that the range query
> returns
> > >>> nothing in that case.
> > >>>
> > >>> Not sure if that qualifies as relying on a bug or not, but in that
> past
> > >>> we've
> > >>> taken the stance that we should not break compatibility even if the
> user
> > >>> was relying on bugs or unintentional behavior.
> > >>>
> > >>> Just to clarify I'm not advocating strongly against this proposal,
> just
> > >>> laying
> > >>> out some considerations we should take into account. At the end of
> the
> > >> day
> > >>> we should do what's right rather than maintain compatibility with
> > >> existing
> > >>> bugs, but sometimes there's a reasonable middle ground.
> > >>>
> > >>> On Tue, May 19, 2020 at 6:15 PM John Roesler <vvcep...@apache.org>
> > >> wrote:
> > >>>> Thanks Sophie,
> > >>>>
> > >>>> Woah, that’s a nasty bug. Rather than working around it, I think we
> > >> should
> > >>>> just fix it. I’ll leave some comments on the Jira.
> > >>>>
> > >>>> It doesn’t seem like it should be this KIP’s concern that some
> serdes
> > >>>> might be incorrectly written.
> > >>>>
> > >>>> Were there other practical concerns that you had in mind?
> > >>>>
> > >>>> Thanks,
> > >>>> John
> > >>>>
> > >>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman wrote:
> > >>>>> I like this "fancy idea" to just flip the to/from bytes but I think
> > >> there
> > >>>>> are some practical limitations to implementing this. In particular
> > >>>>> I'm thinking about this issue
> > >>>>> <https://issues.apache.org/jira/browse/KAFKA-8159> with the
> built-in
> > >>>> signed
> > >>>>> number serdes.
> > >>>>>
> > >>>>> This trick would actually fix the problem for negative-negative
> > >> queries
> > >>>>> (ie where to & from are negative) but would cause undetectable
> > >>>>> incorrect results for negative-positive queries. For example, say
> you
> > >>>>> call #range with from = -1 and to = 1, using the Short serdes. The
> > >>>>> serialized bytes for that are
> > >>>>>
> > >>>>> from = 1111111111111111
> > >>>>> to = 0000000000000001
> > >>>>>
> > >>>>> so we would end up flipping those and iterating over all keys from
> > >>>>> 0000000000000001 to 1111111111111111. Iterating in lexicographical
> > >>>>> order means we would iterate over every key in the space *except*
> for
> > >>>>> 0, but 0 is actually the *only* other key we meant to be included
> in
> > >> the
> > >>>>> range query.
> > >>>>>
> > >>>>> Currently we just log a warning and return an empty iterator when
> > >>>>> to < from, which is obviously also incorrect but feels slightly
> more
> > >>>>> palatable. If we start automatically converting to reverse queries
> we
> > >>>>> can't even log a warning in this case unless we wanted to log a
> > >> warning
> > >>>>> every time, which would be weird to do for a valid usage of a new
> > >>>>> feature.
> > >>>>>
> > >>>>> All that said, I still like the idea overall. Off the top of my
> head
> > >> I
> > >>>> guess
> > >>>>> we could add a store config to enable/disable automatic reverse
> > >>>> iteration,
> > >>>>> which is off by default?
> > >>>>>
> > >>>>> Thanks for the KIP! This will be a nice addition
> > >>>>>
> > >>>>> Sophie
> > >>>>>
> > >>>>>
> > >>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler <vvcep...@apache.org>
> > >>>> wrote:
> > >>>>>> Hi there Jorge,
> > >>>>>>
> > >>>>>> Thanks for the KIP!
> > >>>>>>
> > >>>>>> I think this feature sounds very reasonable.
> > >>>>>>
> > >>>>>> I'm not 100% sure if this is "too fancy", but what do you think
> > >>>>>> about avoiding the enum by instead allowing people to flip
> > >>>>>> the "from" and "to" endpoints? I.e., reading from "A" to "Z" would
> > >>>>>> be a forward scan, and from "Z" to "A" would be a backward one?
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> -John
> > >>>>>>
> > >>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote:
> > >>>>>>> Hi everyone,
> > >>>>>>>
> > >>>>>>> I would like to start the discussion for KIP-617:
> > >>>>>>>
> > >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
> > >>>>>>> Looking forward to your feedback.
> > >>>>>>>
> > >>>>>>> Thanks!
> > >>>>>>> Jorge.
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Attachments:
> > >>>>>>> * 0x5F2C6E22064982DF.asc
> >
> >
> > Attachments:
> > * 0x5F2C6E22064982DF.asc
>

Reply via email to