@John: I can see there are some deprecations in there as well. Will update
the KIP.

Thanks,
Jorge


On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya <
quilcate.jo...@gmail.com> wrote:

> Thanks John.
>
> > it looks like there’s a revision error in which two methods are proposed
> for SessionStore, but seem like they should be in ReadOnlySessionStore. Do
> I read that right?
>
> Yes, I've opted to keep the new methods alongside the existing ones. In
> the case of SessionStore, `findSessions` are in `SessionStore`, and `fetch`
> are in `ReadOnlySessionStore`. If it makes more sense, I can move all of
> them to ReadOnlySessionStore.
> Let me know what you think.
>
> Thanks,
> Jorge.
>
> On Thu, Jul 2, 2020 at 2:36 PM John Roesler <vvcep...@apache.org> wrote:
>
>> Hi Jorge,
>>
>> Thanks for the update. I think this is a good plan.
>>
>> I just took a look at the KIP again, and it looks like there’s a revision
>> error in which two methods are proposed for SessionStore, but seem like
>> they should be in ReadOnlySessionStore. Do I read that right?
>>
>> Otherwise, I’m happy with your proposal.
>>
>> Thanks,
>> John
>>
>> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote:
>> > Quick update: KIP is updated with latest changes now.
>> > Will leave it open this week while working on the PR.
>> >
>> > Hope to open a new vote thread over the next few days if no additional
>> > feedback is provided.
>> >
>> > Cheers,
>> > Jorge.
>> >
>> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya <
>> > quilcate.jo...@gmail.com> wrote:
>> >
>> > > Thanks, John!
>> > >
>> > > Make sense to reconsider the current approach. I was heading in a
>> similar
>> > > direction while drafting the implementation. Metered, Caching, and
>> other
>> > > layers will also have to get duplicated to build up new methods in
>> `Stores`
>> > > factory, and class casting issues would appear on stores created by
>> DSL.
>> > >
>> > > I will draft a proposal with new methods (move methods from proposed
>> > > interfaces to existing ones) with default implementation in a KIP
>> update
>> > > and wait for Matthias to chime in to validate this approach.
>> > >
>> > > Jorge.
>> > >
>> > >
>> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler <vvcep...@apache.org>
>> wrote:
>> > >
>> > >> Hi Jorge,
>> > >>
>> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1
>> releases.
>> > >>
>> > >> The idea to separate the new methods into "mixin" interfaces seems
>> > >> like a good one, but as we've discovered in KIP-614, it doesn't work
>> > >> out that way in practice. The problem is that the store
>> implementations
>> > >> are just the base layer that get composed with other layers in
>> Streams
>> > >> before they can be accessed in the DSL. This is extremely subtle, so
>> > >> I'm going to put everyone to sleep with a detailed explanation:
>> > >>
>> > >> For example, this is the mechanism by which all KeyValueStore
>> > >> implementations get added to Streams:
>> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build
>> > >> return new MeteredKeyValueStore<>(
>> > >>   maybeWrapCaching(maybeWrapLogging(storeSupplier.get())),
>> > >>   storeSupplier.metricsScope(),
>> > >>   time,
>> > >>   keySerde,
>> > >>   valueSerde
>> > >> );
>> > >>
>> > >> In the DSL, the store that a processor gets from the context would be
>> > >> the result of this composition. So even if the storeSupplier.get()
>> returns
>> > >> a store that implements the "reverse" interface, when you try to use
>> it
>> > >> from a processor like:
>> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init
>> > >> ReadOnlyBackwardWindowStore<K, V> store =
>> > >>   (ReadOnlyBackwardWindowStore<K, V>) context.getStateStore(..)
>> > >>
>> > >> You'd just get a ClassCastException because it's actually a
>> > >> MeteredKeyValueStore, which doesn't implement
>> > >> ReadOnlyBackwardWindowStore.
>> > >>
>> > >> The only way to make this work would be to make the Metered,
>> > >> Caching, and Logging layers also implement the new interfaces,
>> > >> but this effectively forces implementations to also implement
>> > >> the interface. Otherwise, the intermediate layers would have to
>> > >> cast the store in each method, like this:
>> > >> MeteredWindowStore#backwardFetch {
>> > >>   ((ReadOnlyBackwardWindowStore<K, V>) innerStore).backwardFetch(..)
>> > >> }
>> > >>
>> > >> And then if the implementation doesn't "opt in" by implementing
>> > >> the interface, you'd get a ClassCastException, not when you get the
>> > >> store, but when you try to use it.
>> > >>
>> > >> The fact that we get ClassCastExceptions no matter which way we
>> > >> turn here indicates that we're really not getting any benefit from
>> the
>> > >> type system, which makes the extra interfaces seem not worth all the
>> > >> code involved.
>> > >>
>> > >> Where we landed in KIP-614 is that, unless we want to completely
>> > >> revamp the way that StateStores work in the DSL, you might as
>> > >> well just add the new methods to the existing interfaces. To prevent
>> > >> compilation errors, we can add default implementations that throw
>> > >> UnsupportedOperationException. If a store doesn't opt in by
>> > >> implementing the methods, you'd get an UnsupportedOperationException,
>> > >> which seems no worse, and maybe better, than the ClassCastException
>> > >> you'd get if we go with the "mixin interface" approach.
>> > >>
>> > >> A quick note: This entire discussion focuses on the DSL. If you're
>> just
>> > >> using the Processor API by directly adding the a custom store to the
>> > >> Topology:
>> > >> org.apache.kafka.streams.Topology#addStateStore
>> > >> and then retrieving it in the processor via:
>> > >> org.apache.kafka.streams.processor.ProcessorContext#getStateStore
>> > >> in
>> > >> org.apache.kafka.streams.processor.Processor#init
>> > >>
>> > >> Then, you can both register and retrieve _any_ StateStore
>> implementation.
>> > >> There's no need to use KeyValueStore or any other built-in interface.
>> > >> In other words, KeyValueStore and company are only part of the DSL,
>> > >> not the PAPI. So, discussions about the build-in store interfaces
>> are only
>> > >> really relevant in the context of the DSL, Transformers, and
>> Materialized.
>> > >>
>> > >> So, in conclusion, I'd really recommend just adding any new methods
>> to
>> > >> the existing store interfaces. We might be able to revamp the API in
>> the
>> > >> future to support mixins, but it's a much larger scope of work than
>> this
>> > >> KIP.
>> > >> A more minor comment is that we don't need to add Deprecated variants
>> > >> of new methods.
>> > >>
>> > >> Thanks again, and once again, I'm sorry I tuned out and didn't offer
>> this
>> > >> feedback before you revised the KIP.
>> > >> -John
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote:
>> > >> > Hi everyone,
>> > >> >
>> > >> > I've updated the KIP, applying Matthias' feedback regarding
>> interface
>> > >> > hierarchy.
>> > >> >
>> > >> > Also, following the last email, I think we can consider reverse
>> > >> operations
>> > >> > on KeyValue range as well, as implementation supports lexicographic
>> > >> order.
>> > >> >
>> > >> > I considered different naming between Key-based ranges and
>> Time-based
>> > >> > ranges, and mitigate confusion when fetching keys and time ranges
>> as
>> > >> > WindowStore does:
>> > >> >
>> > >> > Key-based ranges: reverseRange(), reverseAll()
>> > >> > Time-based ranges: backwardFetch()
>> > >> >
>> > >> > Then, key-based changes apply to KeyValueStore, and time-based
>> changes
>> > >> to
>> > >> > Window and Session stores.
>> > >> >
>> > >> > Let me know if you have any questions.
>> > >> >
>> > >> > Thanks,
>> > >> > Jorge.
>> > >> >
>> > >> >
>> > >> > On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya <
>> > >> > quilcate.jo...@gmail.com> wrote:
>> > >> >
>> > >> > > Hi everyone, sorry for the late reply.
>> > >> > >
>> > >> > > Thanks Matthias for your feedback. I think it makes sense to
>> > >> reconsider
>> > >> > > the current design based on your input.
>> > >> > >
>> > >> > > After digging deeper into the current implementation, I'd like to
>> > >> bring my
>> > >> > > current understanding to be double-checked as it might be
>> redefining
>> > >> the
>> > >> > > KIP's scope:
>> > >> > >
>> > >> > > 1. There are 2 ranges been exposed by different stores:
>> > >> > >
>> > >> > > a. Key Range
>> > >> > > b. Timestamp Range
>> > >> > >
>> > >> > > So far, we have discussed covering both.
>> > >> > >
>> > >> > > 2. Key Range functions do not provide ordering guarantees by
>> design:
>> > >> > >
>> > >> > > ```ReadOnlyKeyValueStore.java
>> > >> > >     /**
>> > >> > >      * Get an iterator over a given range of keys. This iterator
>> must
>> > >> be
>> > >> > > closed after use.
>> > >> > >      * The returned iterator must be safe from {@link
>> > >> > > java.util.ConcurrentModificationException}s
>> > >> > >      * and must not return null values. No ordering guarantees
>> are
>> > >> > > provided.
>> > >> > >      * ...
>> > >> > >      */
>> > >> > >      KeyValueIterator<K, V> range(K from, K to);
>> > >> > > ```
>> > >> > >
>> > >> > > Therefore, I'd propose removing Key range operations from the
>> scope.
>> > >> > >
>> > >> > > 3. Timestamp Range operations happen at the SegmentsStore level
>> > >> (internal)
>> > >> > > API
>> > >> > >
>> > >> > > AFAICT, Segments wrappers handle all Timestamp ranges queries.
>> > >> > >
>> > >> > > I'd propose extending `Segments#segments(timeFrom, timeTo,
>> backwards)`
>> > >> > > with a flag for backwards operations.
>> > >> > >
>> > >> > > As segments returned will be processed backwards, I'm not
>> extending
>> > >> > > KeyValueStores to query each segment backwards as previous point
>> 2.
>> > >> > >
>> > >> > > 4. Extend WindowStores implementations with a new
>> > >> > > WindowBackwardStore/ReadOnlyBackwardStore:
>> > >> > >
>> > >> > > ```java
>> > >> > > public interface ReadOnlyWindowBackwardStore<K, V> {
>> > >> > >     WindowStoreIterator<V> backwardFetch(K key, Instant from,
>> Instant
>> > >> to)
>> > >> > > throws IllegalArgumentException;
>> > >> > >
>> > >> > >     KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to,
>> > >> Instant
>> > >> > > fromTime, Instant toTime)
>> > >> > >         throws IllegalArgumentException;
>> > >> > >
>> > >> > >     KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant
>> from,
>> > >> > > Instant to) throws IllegalArgumentException;
>> > >> > > ```
>> > >> > >
>> > >> > > 5. SessionStore is a bit different as it has fetch/find sessions
>> > >> spread
>> > >> > > between SessionStore and ReadOnlySessionStore.
>> > >> > >
>> > >> > > I'd propose a new interface `SessionBackwardStore` to expose
>> backward
>> > >> find
>> > >> > > operations:
>> > >> > >
>> > >> > > ```java
>> > >> > > public interface SessionBackwardStore<K, AGG> {
>> > >> > >     KeyValueIterator<Windowed<K>, AGG>
>> backwardFindSessions(final K
>> > >> key,
>> > >> > > final long earliestSessionEndTime, final long
>> latestSessionStartTime);
>> > >> > >
>> > >> > >     KeyValueIterator<Windowed<K>, AGG>
>> backwardFindSessions(final K
>> > >> > > keyFrom, final K keyTo, final long earliestSessionEndTime, final
>> long
>> > >> > > latestSessionStartTime);
>> > >> > > }
>> > >> > > ```
>> > >> > >
>> > >> > > If this understanding is correct I'll proceed to update the KIP
>> based
>> > >> on
>> > >> > > this.
>> > >> > >
>> > >> > > Looking forward to your feedback.
>> > >> > >
>> > >> > > Thanks,
>> > >> > > Jorge.
>> > >> > >
>> > >> > > On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax <
>> mj...@apache.org>
>> > >> wrote:
>> > >> > >
>> > >> > >> Hey,
>> > >> > >>
>> > >> > >> Sorry that I am late to the game. I am not 100% convinced about
>> the
>> > >> > >> current proposal. Using a new config as feature flag seems to be
>> > >> rather
>> > >> > >> "nasty" to me, and flipping from/to is a little bit too fancy
>> for my
>> > >> > >> personal taste.
>> > >> > >>
>> > >> > >> I agree, that the original proposal using a "ReadDirection"
>> enum is
>> > >> not
>> > >> > >> ideal either.
>> > >> > >>
>> > >> > >> Thus, I would like to put out a new idea: We could add a new
>> > >> interface
>> > >> > >> that offers new methods that return revers iterators.
>> > >> > >>
>> > >> > >> The KIP already proposes to add `reverseAll()` and it seems
>> backward
>> > >> > >> incompatible to just add this method to `ReadOnlyKeyValueStore`
>> and
>> > >> > >> `ReadOnlyWindowStore`. I don't think we could provide a useful
>> > >> default
>> > >> > >> implementation for custom stores and thus either break
>> compatibility
>> > >> or
>> > >> > >> need add a default that just throws an exception. Neither seems
>> to
>> > >> be a
>> > >> > >> good option.
>> > >> > >>
>> > >> > >> Using a new interface avoid this issue and allows users
>> implementing
>> > >> > >> custom stores to opt-in by adding the interface to their stores.
>> > >> > >> Furthermore, we don't need any config. In the end, we
>> encapsulte the
>> > >> > >> change into the store, and our runtime is agnostic to it (as it
>> > >> should
>> > >> > >> be).
>> > >> > >>
>> > >> > >> The hierarchy becomes a little complex (but uses would not
>> really see
>> > >> > >> the complexity):
>> > >> > >>
>> > >> > >> // exsiting
>> > >> > >> ReadOnlyKeyValueStore
>> > >> > >> KeyValueStore extend StateStore, ReadOnlyKeyValueStore
>> > >> > >>
>> > >> > >>
>> > >> > >> // helper interface; users don't care
>> > >> > >> // need similar ones for other stores
>> > >> > >> ReverseReadOnlyKeyValueStore {
>> > >> > >>     KeyValueIterator<K, V> reverseRange(K from, K to);
>> > >> > >>     KeyValueIterator<K, V> reverseAll();
>> > >> > >> }
>> > >> > >>
>> > >> > >>
>> > >> > >> // two new user facing interfaces for kv-store
>> > >> > >> // need similar ones for other stores
>> > >> > >> ReadOnlyKeyValueStoreWithReverseIterators extends
>> > >> ReadOnlyKeyValueStore,
>> > >> > >> ReverseReadOnlyKeyValueStore
>> > >> > >>
>> > >> > >> KeyValueStoreWithReverseIterators extends KeyValueStore,
>> > >> > >> ReverseReadOnlyKeyValueStore
>> > >> > >>
>> > >> > >>
>> > >> > >> // updated (also internal)
>> > >> > >> // also need to update other built-in stores
>> > >> > >> RocksDB implements KeyValueStoreWithReverseIterators,
>> > >> BulkLoadingStore
>> > >> > >>
>> > >> > >>
>> > >> > >> In the end, user would only care about the two (for kv-case) new
>> > >> > >> interface that offer revers iterator (read only and regular)
>> and can
>> > >> > >> cast stores accordingly in their Processors/Transformers or via
>> IQ.
>> > >> > >>
>> > >> > >>
>> > >> > >> Btw: if we add revers iterator for KeyValue and Window store,
>> should
>> > >> we
>> > >> > >> do the same for Session store?
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >> This might be more code to write, but I believe it provides the
>> > >> better
>> > >> > >> user experience. Thoughts?
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >> -Matthias
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >>
>> > >> > >> On 5/26/20 8:47 PM, John Roesler wrote:
>> > >> > >> > Sorry for my silence, Jorge,
>> > >> > >> >
>> > >> > >> > I've just taken another look, and I'm personally happy with
>> the
>> > >> KIP.
>> > >> > >> >
>> > >> > >> > Thanks,
>> > >> > >> > -John
>> > >> > >> >
>> > >> > >> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya
>> wrote:
>> > >> > >> >> If no additional comments, I will proceed to start the a vote
>> > >> thread.
>> > >> > >> >>
>> > >> > >> >> Thanks a lot for your feedback!
>> > >> > >> >>
>> > >> > >> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya
>> <
>> > >> > >> >> quilcate.jo...@gmail.com> wrote:
>> > >> > >> >>
>> > >> > >> >>> Thanks Sophie. I like the `reverseAll()` idea.
>> > >> > >> >>>
>> > >> > >> >>> I updated the KIP with your feedback.
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>>
>> > >> > >> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman <
>> > >> > >> sop...@confluent.io>
>> > >> > >> >>> wrote:
>> > >> > >> >>>
>> > >> > >> >>>> Hm, the case of `all()` does seem to present a dilemma in
>> the
>> > >> case of
>> > >> > >> >>>> variable-length keys.
>> > >> > >> >>>>
>> > >> > >> >>>> In the case of fixed-length keys, you can just compute the
>> keys
>> > >> that
>> > >> > >> >>>> correspond
>> > >> > >> >>>> to the maximum and minimum serialized bytes, then perform a
>> > >> `range()`
>> > >> > >> >>>> query
>> > >> > >> >>>> instead of an `all()`. If your keys don't have a
>> well-defined
>> > >> > >> ordering
>> > >> > >> >>>> such
>> > >> > >> >>>> that
>> > >> > >> >>>> you can't determine the MAX_KEY, then you probably don't
>> care
>> > >> about
>> > >> > >> the
>> > >> > >> >>>> iterator order anyway.
>> > >> > >> >>>>
>> > >> > >> >>>>  But with variable-length keys, there is no MAX_KEY. If
>> all your
>> > >> > >> keys were
>> > >> > >> >>>> just
>> > >> > >> >>>> of the form 'a', 'aa', 'aaaaa', 'aaaaaaa' then in fact the
>> only
>> > >> way
>> > >> > >> to
>> > >> > >> >>>> figure out the
>> > >> > >> >>>> maximum key in the store is by using `all()` -- and
>> without a
>> > >> reverse
>> > >> > >> >>>> iterator, you're
>> > >> > >> >>>> doomed to iterate through every single key just to answer
>> that
>> > >> simple
>> > >> > >> >>>> question.
>> > >> > >> >>>>
>> > >> > >> >>>> That said, I still think determining the iterator order
>> based
>> > >> on the
>> > >> > >> >>>> to/from bytes
>> > >> > >> >>>> makes a lot of intuitive sense and gives the API a nice
>> > >> symmetry.
>> > >> > >> What if
>> > >> > >> >>>> we
>> > >> > >> >>>> solved the `all()` problem by just giving `all()` a reverse
>> > >> form to
>> > >> > >> >>>> complement it?
>> > >> > >> >>>> Ie we would have `all()` and `reverseAll()`, or something
>> to
>> > >> that
>> > >> > >> effect.
>> > >> > >> >>>>
>> > >> > >> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate
>> Otoya <
>> > >> > >> >>>> quilcate.jo...@gmail.com> wrote:
>> > >> > >> >>>>
>> > >> > >> >>>>> Thanks John.
>> > >> > >> >>>>>
>> > >> > >> >>>>> Agree. I like the first approach as well, with
>> StreamsConfig
>> > >> flag
>> > >> > >> >>>> passing
>> > >> > >> >>>>> by via ProcessorContext.
>> > >> > >> >>>>>
>> > >> > >> >>>>> Another positive effect with "reverse parameters" is that
>> in
>> > >> the
>> > >> > >> case of
>> > >> > >> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide
>> > >> _which_
>> > >> > >> pair
>> > >> > >> >>>> to
>> > >> > >> >>>>> flip, whether with `ReadDirection` enum it apply to both.
>> > >> > >> >>>>>
>> > >> > >> >>>>> The only issue I've found while reviewing the KIP is that
>> > >> `all()`
>> > >> > >> won't
>> > >> > >> >>>> fit
>> > >> > >> >>>>> within this approach.
>> > >> > >> >>>>>
>> > >> > >> >>>>> We could remove it from the KIP and argue that for
>> WindowStore,
>> > >> > >> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in
>> reverse
>> > >> > >> order,
>> > >> > >> >>>> and
>> > >> > >> >>>>> for KeyValueStore, no ordering guarantees are provided.
>> > >> > >> >>>>>
>> > >> > >> >>>>> If there is consensus with this changes, I will go and
>> update
>> > >> the
>> > >> > >> KIP.
>> > >> > >> >>>>>
>> > >> > >> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler <
>> > >> vvcep...@apache.org>
>> > >> > >> >>>> wrote:
>> > >> > >> >>>>>
>> > >> > >> >>>>>> Hi Jorge,
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> Thanks for that idea. I agree, a feature flag would
>> protect
>> > >> anyone
>> > >> > >> >>>>>> who may be depending on the current behavior.
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> It seems better to locate the feature flag in the
>> > >> initialization
>> > >> > >> >>>> logic of
>> > >> > >> >>>>>> the store, rather than have a method on the "live" store
>> that
>> > >> > >> changes
>> > >> > >> >>>>>> its behavior on the fly.
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> It seems like there are two options here, one is to add
>> a new
>> > >> > >> config:
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> StreamsConfig.ENABLE_BACKWARDS_ITERATION =
>> > >> > >> >>>>>>   "enable.backwards.iteration
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> Or we can add a feature flag in Materialized, like
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> Materialized.enableBackwardsIteration()
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> I think I'd personally lean toward the config, for the
>> > >> following
>> > >> > >> >>>> reason.
>> > >> > >> >>>>>> The concern that Sophie raised is that someone's program
>> may
>> > >> depend
>> > >> > >> >>>>>> on the existing contract of getting an empty iterator. We
>> > >> don't
>> > >> > >> want
>> > >> > >> >>>> to
>> > >> > >> >>>>>> switch behavior when they aren't expecting it, so we
>> provide
>> > >> them a
>> > >> > >> >>>>>> config to assert that they _are_ expecting the new
>> behavior,
>> > >> which
>> > >> > >> >>>>>> means they take responsibility for updating their code to
>> > >> expect
>> > >> > >> the
>> > >> > >> >>>> new
>> > >> > >> >>>>>> behavior.
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> There doesn't seem to be a reason to offer a choice of
>> > >> behaviors
>> > >> > >> on a
>> > >> > >> >>>>>> per-query, or per-store basis. We just want people to be
>> not
>> > >> > >> surprised
>> > >> > >> >>>>>> by this change in general.
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> What do you think?
>> > >> > >> >>>>>> Thanks,
>> > >> > >> >>>>>> -John
>> > >> > >> >>>>>>
>> > >> > >> >>>>>> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote:
>> > >> > >> >>>>>>> Thank you both for the great feedback.
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> I like the "fancy" proposal :), and how it removes the
>> need
>> > >> for
>> > >> > >> >>>>>>> additional API methods. And with a feature flag on
>> > >> `StateStore`,
>> > >> > >> >>>>>>> disabled by default, should no break current users.
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> The only side-effect I can think of is that: by moving
>> the
>> > >> flag
>> > >> > >> >>>>> upwards,
>> > >> > >> >>>>>>> all later operations become affected; which might be ok
>> for
>> > >> most
>> > >> > >> >>>> (all?)
>> > >> > >> >>>>>>> cases. I can't think of an scenario where this would be
>> an
>> > >> issue,
>> > >> > >> >>>> just
>> > >> > >> >>>>>>> want to point this out.
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> If moving to this approach, I'd like to check if I got
>> this
>> > >> right
>> > >> > >> >>>>> before
>> > >> > >> >>>>>>> updating the KIP:
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> - only `StateStore` will change by having a new method:
>> > >> > >> >>>>>>> `backwardIteration()`, `false` by default to keep things
>> > >> > >> compatible.
>> > >> > >> >>>>>>> - then all `*Stores` will have to update their
>> implementation
>> > >> > >> based
>> > >> > >> >>>> on
>> > >> > >> >>>>>>> this flag.
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> On 20/05/2020 21:02, Sophie Blee-Goldman wrote:
>> > >> > >> >>>>>>>>> There's no possibility that someone could be relying
>> > >> > >> >>>>>>>>> on iterating over that range in increasing order,
>> because
>> > >> that's
>> > >> > >> >>>> not
>> > >> > >> >>>>>> what
>> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on
>> getting
>> > >> an
>> > >> > >> >>>> empty
>> > >> > >> >>>>>>>> iterator
>> > >> > >> >>>>>>>>
>> > >> > >> >>>>>>>> I just meant that they might be relying on the
>> assumption
>> > >> that
>> > >> > >> the
>> > >> > >> >>>>>> range
>> > >> > >> >>>>>>>> query
>> > >> > >> >>>>>>>> will never return results with decreasing keys. The
>> empty
>> > >> > >> iterator
>> > >> > >> >>>>>> wouldn't
>> > >> > >> >>>>>>>> break that contract, but of course a surprise reverse
>> > >> iterator
>> > >> > >> >>>> would.
>> > >> > >> >>>>>>>>
>> > >> > >> >>>>>>>> FWIW I actually am in favor of automatically
>> converting to a
>> > >> > >> >>>> reverse
>> > >> > >> >>>>>>>> iterator,
>> > >> > >> >>>>>>>> I just thought we should consider whether this should
>> be
>> > >> off by
>> > >> > >> >>>>>> default or
>> > >> > >> >>>>>>>> even possible to disable at all.
>> > >> > >> >>>>>>>>
>> > >> > >> >>>>>>>> On Tue, May 19, 2020 at 7:42 PM John Roesler <
>> > >> > >> vvcep...@apache.org
>> > >> > >> >>>>>
>> > >> > >> >>>>>> wrote:
>> > >> > >> >>>>>>>>
>> > >> > >> >>>>>>>>> Thanks for the response, Sophie,
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> I wholeheartedly agree we should take as much into
>> account
>> > >> as
>> > >> > >> >>>>> possible
>> > >> > >> >>>>>>>>> up front, rather than regretting our decisions later.
>> I
>> > >> actually
>> > >> > >> >>>> do
>> > >> > >> >>>>>> share
>> > >> > >> >>>>>>>>> your vague sense of worry, which was what led me to
>> say
>> > >> > >> initially
>> > >> > >> >>>>>> that I
>> > >> > >> >>>>>>>>> thought my counterproposal might be "too fancy".
>> > >> Sometimes, it's
>> > >> > >> >>>>>> better
>> > >> > >> >>>>>>>>> to be explicit instead of "elegant", if we think more
>> > >> people
>> > >> > >> >>>> will be
>> > >> > >> >>>>>>>>> confused
>> > >> > >> >>>>>>>>> than not.
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> I really don't think that there's any danger of
>> "relying
>> > >> on a
>> > >> > >> >>>> bug"
>> > >> > >> >>>>>> here,
>> > >> > >> >>>>>>>>> although
>> > >> > >> >>>>>>>>> people certainly could be relying on current
>> behavior. One
>> > >> thing
>> > >> > >> >>>> to
>> > >> > >> >>>>> be
>> > >> > >> >>>>>>>>> clear
>> > >> > >> >>>>>>>>> about (which I just left a more detailed comment in
>> > >> KAFKA-8159
>> > >> > >> >>>>> about)
>> > >> > >> >>>>>> is
>> > >> > >> >>>>>>>>> that
>> > >> > >> >>>>>>>>> when we say something like key1 > key2, this ordering
>> is
>> > >> defined
>> > >> > >> >>>> by
>> > >> > >> >>>>>> the
>> > >> > >> >>>>>>>>> serde's output and nothing else.
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> Currently, thanks to your fix in
>> > >> > >> >>>>>> https://github.com/apache/kafka/pull/6521
>> > >> > >> >>>>>>>>> ,
>> > >> > >> >>>>>>>>> the store contract is that for range scans, if from >
>> to,
>> > >> then
>> > >> > >> >>>> the
>> > >> > >> >>>>>> store
>> > >> > >> >>>>>>>>> must
>> > >> > >> >>>>>>>>> return an empty iterator. There's no possibility that
>> > >> someone
>> > >> > >> >>>> could
>> > >> > >> >>>>> be
>> > >> > >> >>>>>>>>> relying
>> > >> > >> >>>>>>>>> on iterating over that range in increasing order,
>> because
>> > >> that's
>> > >> > >> >>>> not
>> > >> > >> >>>>>> what
>> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on
>> getting
>> > >> an
>> > >> > >> >>>> empty
>> > >> > >> >>>>>>>>> iterator.
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> My counterproposal was to actually change this
>> contract to
>> > >> say
>> > >> > >> >>>> that
>> > >> > >> >>>>>> the
>> > >> > >> >>>>>>>>> store
>> > >> > >> >>>>>>>>> must return an iterator over the keys in that range,
>> but
>> > >> in the
>> > >> > >> >>>>>> reverse
>> > >> > >> >>>>>>>>> order.
>> > >> > >> >>>>>>>>> So, in addition to considering whether this idea is
>> "too
>> > >> fancy"
>> > >> > >> >>>> (aka
>> > >> > >> >>>>>>>>> confusing),
>> > >> > >> >>>>>>>>> we should also consider the likelihood of breaking an
>> > >> existing
>> > >> > >> >>>>>> program with
>> > >> > >> >>>>>>>>> this behavior/contract change.
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> To echo your clarification, I'm also not advocating
>> > >> strongly in
>> > >> > >> >>>>> favor
>> > >> > >> >>>>>> of my
>> > >> > >> >>>>>>>>> proposal. I just wanted to present it for
>> consideration
>> > >> > >> alongside
>> > >> > >> >>>>>> Jorge's
>> > >> > >> >>>>>>>>> original one.
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> Thanks for raising these very good points,
>> > >> > >> >>>>>>>>> -John
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>>>> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman
>> wrote:
>> > >> > >> >>>>>>>>>>> Rather than working around it, I think we should
>> just
>> > >> fix it
>> > >> > >> >>>>>>>>>> Now *that's* a "fancy" idea :P
>> > >> > >> >>>>>>>>>>
>> > >> > >> >>>>>>>>>> That was my primary concern, although I do have a
>> vague
>> > >> sense
>> > >> > >> of
>> > >> > >> >>>>>> worry
>> > >> > >> >>>>>>>>>> that we might be allowing users to get into trouble
>> > >> without
>> > >> > >> >>>>>> realizing it.
>> > >> > >> >>>>>>>>>> For example if their custom serdes suffer a similar
>> bug
>> > >> as the
>> > >> > >> >>>>> above,
>> > >> > >> >>>>>>>>>> and/or
>> > >> > >> >>>>>>>>>> they rely on getting results in increasing order (of
>> the
>> > >> keys)
>> > >> > >> >>>> even
>> > >> > >> >>>>>> when
>> > >> > >> >>>>>>>>>> to < from. Maybe they're relying on the fact that the
>> > >> range
>> > >> > >> >>>> query
>> > >> > >> >>>>>> returns
>> > >> > >> >>>>>>>>>> nothing in that case.
>> > >> > >> >>>>>>>>>>
>> > >> > >> >>>>>>>>>> Not sure if that qualifies as relying on a bug or
>> not,
>> > >> but in
>> > >> > >> >>>> that
>> > >> > >> >>>>>> past
>> > >> > >> >>>>>>>>>> we've
>> > >> > >> >>>>>>>>>> taken the stance that we should not break
>> compatibility
>> > >> even if
>> > >> > >> >>>> the
>> > >> > >> >>>>>> user
>> > >> > >> >>>>>>>>>> was relying on bugs or unintentional behavior.
>> > >> > >> >>>>>>>>>>
>> > >> > >> >>>>>>>>>> Just to clarify I'm not advocating strongly against
>> this
>> > >> > >> >>>> proposal,
>> > >> > >> >>>>>> just
>> > >> > >> >>>>>>>>>> laying
>> > >> > >> >>>>>>>>>> out some considerations we should take into account.
>> At
>> > >> the end
>> > >> > >> >>>> of
>> > >> > >> >>>>>> the
>> > >> > >> >>>>>>>>> day
>> > >> > >> >>>>>>>>>> we should do what's right rather than maintain
>> > >> compatibility
>> > >> > >> >>>> with
>> > >> > >> >>>>>>>>> existing
>> > >> > >> >>>>>>>>>> bugs, but sometimes there's a reasonable middle
>> ground.
>> > >> > >> >>>>>>>>>>
>> > >> > >> >>>>>>>>>> On Tue, May 19, 2020 at 6:15 PM John Roesler <
>> > >> > >> >>>> vvcep...@apache.org>
>> > >> > >> >>>>>>>>> wrote:
>> > >> > >> >>>>>>>>>>> Thanks Sophie,
>> > >> > >> >>>>>>>>>>>
>> > >> > >> >>>>>>>>>>> Woah, that’s a nasty bug. Rather than working
>> around it,
>> > >> I
>> > >> > >> >>>> think
>> > >> > >> >>>>> we
>> > >> > >> >>>>>>>>> should
>> > >> > >> >>>>>>>>>>> just fix it. I’ll leave some comments on the Jira.
>> > >> > >> >>>>>>>>>>>
>> > >> > >> >>>>>>>>>>> It doesn’t seem like it should be this KIP’s concern
>> > >> that some
>> > >> > >> >>>>>> serdes
>> > >> > >> >>>>>>>>>>> might be incorrectly written.
>> > >> > >> >>>>>>>>>>>
>> > >> > >> >>>>>>>>>>> Were there other practical concerns that you had in
>> mind?
>> > >> > >> >>>>>>>>>>>
>> > >> > >> >>>>>>>>>>> Thanks,
>> > >> > >> >>>>>>>>>>> John
>> > >> > >> >>>>>>>>>>>
>> > >> > >> >>>>>>>>>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman
>> > >> wrote:
>> > >> > >> >>>>>>>>>>>> I like this "fancy idea" to just flip the to/from
>> bytes
>> > >> but I
>> > >> > >> >>>>> think
>> > >> > >> >>>>>>>>> there
>> > >> > >> >>>>>>>>>>>> are some practical limitations to implementing
>> this. In
>> > >> > >> >>>>> particular
>> > >> > >> >>>>>>>>>>>> I'm thinking about this issue
>> > >> > >> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-8159>
>> > >> with the
>> > >> > >> >>>>>> built-in
>> > >> > >> >>>>>>>>>>> signed
>> > >> > >> >>>>>>>>>>>> number serdes.
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> This trick would actually fix the problem for
>> > >> > >> >>>> negative-negative
>> > >> > >> >>>>>>>>> queries
>> > >> > >> >>>>>>>>>>>> (ie where to & from are negative) but would cause
>> > >> > >> undetectable
>> > >> > >> >>>>>>>>>>>> incorrect results for negative-positive queries.
>> For
>> > >> example,
>> > >> > >> >>>> say
>> > >> > >> >>>>>> you
>> > >> > >> >>>>>>>>>>>> call #range with from = -1 and to = 1, using the
>> Short
>> > >> > >> serdes.
>> > >> > >> >>>>> The
>> > >> > >> >>>>>>>>>>>> serialized bytes for that are
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> from = 1111111111111111
>> > >> > >> >>>>>>>>>>>> to = 0000000000000001
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> so we would end up flipping those and iterating
>> over
>> > >> all keys
>> > >> > >> >>>>> from
>> > >> > >> >>>>>>>>>>>> 0000000000000001 to 1111111111111111. Iterating in
>> > >> > >> >>>>> lexicographical
>> > >> > >> >>>>>>>>>>>> order means we would iterate over every key in the
>> space
>> > >> > >> >>>> *except*
>> > >> > >> >>>>>> for
>> > >> > >> >>>>>>>>>>>> 0, but 0 is actually the *only* other key we meant
>> to be
>> > >> > >> >>>> included
>> > >> > >> >>>>>> in
>> > >> > >> >>>>>>>>> the
>> > >> > >> >>>>>>>>>>>> range query.
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> Currently we just log a warning and return an empty
>> > >> iterator
>> > >> > >> >>>> when
>> > >> > >> >>>>>>>>>>>> to < from, which is obviously also incorrect but
>> feels
>> > >> > >> >>>> slightly
>> > >> > >> >>>>>> more
>> > >> > >> >>>>>>>>>>>> palatable. If we start automatically converting to
>> > >> reverse
>> > >> > >> >>>>> queries
>> > >> > >> >>>>>> we
>> > >> > >> >>>>>>>>>>>> can't even log a warning in this case unless we
>> wanted
>> > >> to log
>> > >> > >> >>>> a
>> > >> > >> >>>>>>>>> warning
>> > >> > >> >>>>>>>>>>>> every time, which would be weird to do for a valid
>> > >> usage of a
>> > >> > >> >>>> new
>> > >> > >> >>>>>>>>>>>> feature.
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> All that said, I still like the idea overall. Off
>> the
>> > >> top of
>> > >> > >> >>>> my
>> > >> > >> >>>>>> head
>> > >> > >> >>>>>>>>> I
>> > >> > >> >>>>>>>>>>> guess
>> > >> > >> >>>>>>>>>>>> we could add a store config to enable/disable
>> automatic
>> > >> > >> >>>> reverse
>> > >> > >> >>>>>>>>>>> iteration,
>> > >> > >> >>>>>>>>>>>> which is off by default?
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> Thanks for the KIP! This will be a nice addition
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> Sophie
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler <
>> > >> > >> >>>>> vvcep...@apache.org>
>> > >> > >> >>>>>>>>>>> wrote:
>> > >> > >> >>>>>>>>>>>>> Hi there Jorge,
>> > >> > >> >>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>> Thanks for the KIP!
>> > >> > >> >>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>> I think this feature sounds very reasonable.
>> > >> > >> >>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>> I'm not 100% sure if this is "too fancy", but
>> what do
>> > >> you
>> > >> > >> >>>> think
>> > >> > >> >>>>>>>>>>>>> about avoiding the enum by instead allowing
>> people to
>> > >> flip
>> > >> > >> >>>>>>>>>>>>> the "from" and "to" endpoints? I.e., reading from
>> "A"
>> > >> to "Z"
>> > >> > >> >>>>> would
>> > >> > >> >>>>>>>>>>>>> be a forward scan, and from "Z" to "A" would be a
>> > >> backward
>> > >> > >> >>>> one?
>> > >> > >> >>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>> Thanks,
>> > >> > >> >>>>>>>>>>>>> -John
>> > >> > >> >>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate
>> wrote:
>> > >> > >> >>>>>>>>>>>>>> Hi everyone,
>> > >> > >> >>>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>>> I would like to start the discussion for KIP-617:
>> > >> > >> >>>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>
>> > >> > >> >>>>>>
>> > >> > >> >>>>>
>> > >> > >> >>>>
>> > >> > >>
>> > >>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards
>> > >> > >> >>>>>>>>>>>>>> Looking forward to your feedback.
>> > >> > >> >>>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>>> Thanks!
>> > >> > >> >>>>>>>>>>>>>> Jorge.
>> > >> > >> >>>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>>>
>> > >> > >> >>>>>>>>>>>>>> Attachments:
>> > >> > >> >>>>>>>>>>>>>> * 0x5F2C6E22064982DF.asc
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>>
>> > >> > >> >>>>>>> Attachments:
>> > >> > >> >>>>>>> * 0x5F2C6E22064982DF.asc
>> > >> > >> >>>>>>
>> > >> > >> >>>>>
>> > >> > >> >>>>
>> > >> > >> >>>
>> > >> > >> >>
>> > >> > >>
>> > >> > >>
>> > >> >
>> > >>
>> > >
>> >
>>
>

Reply via email to