@John: I can see there are some deprecations in there as well. Will update the KIP.
Thanks, Jorge On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya < quilcate.jo...@gmail.com> wrote: > Thanks John. > > > it looks like there’s a revision error in which two methods are proposed > for SessionStore, but seem like they should be in ReadOnlySessionStore. Do > I read that right? > > Yes, I've opted to keep the new methods alongside the existing ones. In > the case of SessionStore, `findSessions` are in `SessionStore`, and `fetch` > are in `ReadOnlySessionStore`. If it makes more sense, I can move all of > them to ReadOnlySessionStore. > Let me know what you think. > > Thanks, > Jorge. > > On Thu, Jul 2, 2020 at 2:36 PM John Roesler <vvcep...@apache.org> wrote: > >> Hi Jorge, >> >> Thanks for the update. I think this is a good plan. >> >> I just took a look at the KIP again, and it looks like there’s a revision >> error in which two methods are proposed for SessionStore, but seem like >> they should be in ReadOnlySessionStore. Do I read that right? >> >> Otherwise, I’m happy with your proposal. >> >> Thanks, >> John >> >> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote: >> > Quick update: KIP is updated with latest changes now. >> > Will leave it open this week while working on the PR. >> > >> > Hope to open a new vote thread over the next few days if no additional >> > feedback is provided. >> > >> > Cheers, >> > Jorge. >> > >> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya < >> > quilcate.jo...@gmail.com> wrote: >> > >> > > Thanks, John! >> > > >> > > Make sense to reconsider the current approach. I was heading in a >> similar >> > > direction while drafting the implementation. Metered, Caching, and >> other >> > > layers will also have to get duplicated to build up new methods in >> `Stores` >> > > factory, and class casting issues would appear on stores created by >> DSL. >> > > >> > > I will draft a proposal with new methods (move methods from proposed >> > > interfaces to existing ones) with default implementation in a KIP >> update >> > > and wait for Matthias to chime in to validate this approach. >> > > >> > > Jorge. >> > > >> > > >> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler <vvcep...@apache.org> >> wrote: >> > > >> > >> Hi Jorge, >> > >> >> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1 >> releases. >> > >> >> > >> The idea to separate the new methods into "mixin" interfaces seems >> > >> like a good one, but as we've discovered in KIP-614, it doesn't work >> > >> out that way in practice. The problem is that the store >> implementations >> > >> are just the base layer that get composed with other layers in >> Streams >> > >> before they can be accessed in the DSL. This is extremely subtle, so >> > >> I'm going to put everyone to sleep with a detailed explanation: >> > >> >> > >> For example, this is the mechanism by which all KeyValueStore >> > >> implementations get added to Streams: >> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build >> > >> return new MeteredKeyValueStore<>( >> > >> maybeWrapCaching(maybeWrapLogging(storeSupplier.get())), >> > >> storeSupplier.metricsScope(), >> > >> time, >> > >> keySerde, >> > >> valueSerde >> > >> ); >> > >> >> > >> In the DSL, the store that a processor gets from the context would be >> > >> the result of this composition. So even if the storeSupplier.get() >> returns >> > >> a store that implements the "reverse" interface, when you try to use >> it >> > >> from a processor like: >> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init >> > >> ReadOnlyBackwardWindowStore<K, V> store = >> > >> (ReadOnlyBackwardWindowStore<K, V>) context.getStateStore(..) >> > >> >> > >> You'd just get a ClassCastException because it's actually a >> > >> MeteredKeyValueStore, which doesn't implement >> > >> ReadOnlyBackwardWindowStore. >> > >> >> > >> The only way to make this work would be to make the Metered, >> > >> Caching, and Logging layers also implement the new interfaces, >> > >> but this effectively forces implementations to also implement >> > >> the interface. Otherwise, the intermediate layers would have to >> > >> cast the store in each method, like this: >> > >> MeteredWindowStore#backwardFetch { >> > >> ((ReadOnlyBackwardWindowStore<K, V>) innerStore).backwardFetch(..) >> > >> } >> > >> >> > >> And then if the implementation doesn't "opt in" by implementing >> > >> the interface, you'd get a ClassCastException, not when you get the >> > >> store, but when you try to use it. >> > >> >> > >> The fact that we get ClassCastExceptions no matter which way we >> > >> turn here indicates that we're really not getting any benefit from >> the >> > >> type system, which makes the extra interfaces seem not worth all the >> > >> code involved. >> > >> >> > >> Where we landed in KIP-614 is that, unless we want to completely >> > >> revamp the way that StateStores work in the DSL, you might as >> > >> well just add the new methods to the existing interfaces. To prevent >> > >> compilation errors, we can add default implementations that throw >> > >> UnsupportedOperationException. If a store doesn't opt in by >> > >> implementing the methods, you'd get an UnsupportedOperationException, >> > >> which seems no worse, and maybe better, than the ClassCastException >> > >> you'd get if we go with the "mixin interface" approach. >> > >> >> > >> A quick note: This entire discussion focuses on the DSL. If you're >> just >> > >> using the Processor API by directly adding the a custom store to the >> > >> Topology: >> > >> org.apache.kafka.streams.Topology#addStateStore >> > >> and then retrieving it in the processor via: >> > >> org.apache.kafka.streams.processor.ProcessorContext#getStateStore >> > >> in >> > >> org.apache.kafka.streams.processor.Processor#init >> > >> >> > >> Then, you can both register and retrieve _any_ StateStore >> implementation. >> > >> There's no need to use KeyValueStore or any other built-in interface. >> > >> In other words, KeyValueStore and company are only part of the DSL, >> > >> not the PAPI. So, discussions about the build-in store interfaces >> are only >> > >> really relevant in the context of the DSL, Transformers, and >> Materialized. >> > >> >> > >> So, in conclusion, I'd really recommend just adding any new methods >> to >> > >> the existing store interfaces. We might be able to revamp the API in >> the >> > >> future to support mixins, but it's a much larger scope of work than >> this >> > >> KIP. >> > >> A more minor comment is that we don't need to add Deprecated variants >> > >> of new methods. >> > >> >> > >> Thanks again, and once again, I'm sorry I tuned out and didn't offer >> this >> > >> feedback before you revised the KIP. >> > >> -John >> > >> >> > >> >> > >> >> > >> >> > >> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote: >> > >> > Hi everyone, >> > >> > >> > >> > I've updated the KIP, applying Matthias' feedback regarding >> interface >> > >> > hierarchy. >> > >> > >> > >> > Also, following the last email, I think we can consider reverse >> > >> operations >> > >> > on KeyValue range as well, as implementation supports lexicographic >> > >> order. >> > >> > >> > >> > I considered different naming between Key-based ranges and >> Time-based >> > >> > ranges, and mitigate confusion when fetching keys and time ranges >> as >> > >> > WindowStore does: >> > >> > >> > >> > Key-based ranges: reverseRange(), reverseAll() >> > >> > Time-based ranges: backwardFetch() >> > >> > >> > >> > Then, key-based changes apply to KeyValueStore, and time-based >> changes >> > >> to >> > >> > Window and Session stores. >> > >> > >> > >> > Let me know if you have any questions. >> > >> > >> > >> > Thanks, >> > >> > Jorge. >> > >> > >> > >> > >> > >> > On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya < >> > >> > quilcate.jo...@gmail.com> wrote: >> > >> > >> > >> > > Hi everyone, sorry for the late reply. >> > >> > > >> > >> > > Thanks Matthias for your feedback. I think it makes sense to >> > >> reconsider >> > >> > > the current design based on your input. >> > >> > > >> > >> > > After digging deeper into the current implementation, I'd like to >> > >> bring my >> > >> > > current understanding to be double-checked as it might be >> redefining >> > >> the >> > >> > > KIP's scope: >> > >> > > >> > >> > > 1. There are 2 ranges been exposed by different stores: >> > >> > > >> > >> > > a. Key Range >> > >> > > b. Timestamp Range >> > >> > > >> > >> > > So far, we have discussed covering both. >> > >> > > >> > >> > > 2. Key Range functions do not provide ordering guarantees by >> design: >> > >> > > >> > >> > > ```ReadOnlyKeyValueStore.java >> > >> > > /** >> > >> > > * Get an iterator over a given range of keys. This iterator >> must >> > >> be >> > >> > > closed after use. >> > >> > > * The returned iterator must be safe from {@link >> > >> > > java.util.ConcurrentModificationException}s >> > >> > > * and must not return null values. No ordering guarantees >> are >> > >> > > provided. >> > >> > > * ... >> > >> > > */ >> > >> > > KeyValueIterator<K, V> range(K from, K to); >> > >> > > ``` >> > >> > > >> > >> > > Therefore, I'd propose removing Key range operations from the >> scope. >> > >> > > >> > >> > > 3. Timestamp Range operations happen at the SegmentsStore level >> > >> (internal) >> > >> > > API >> > >> > > >> > >> > > AFAICT, Segments wrappers handle all Timestamp ranges queries. >> > >> > > >> > >> > > I'd propose extending `Segments#segments(timeFrom, timeTo, >> backwards)` >> > >> > > with a flag for backwards operations. >> > >> > > >> > >> > > As segments returned will be processed backwards, I'm not >> extending >> > >> > > KeyValueStores to query each segment backwards as previous point >> 2. >> > >> > > >> > >> > > 4. Extend WindowStores implementations with a new >> > >> > > WindowBackwardStore/ReadOnlyBackwardStore: >> > >> > > >> > >> > > ```java >> > >> > > public interface ReadOnlyWindowBackwardStore<K, V> { >> > >> > > WindowStoreIterator<V> backwardFetch(K key, Instant from, >> Instant >> > >> to) >> > >> > > throws IllegalArgumentException; >> > >> > > >> > >> > > KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to, >> > >> Instant >> > >> > > fromTime, Instant toTime) >> > >> > > throws IllegalArgumentException; >> > >> > > >> > >> > > KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant >> from, >> > >> > > Instant to) throws IllegalArgumentException; >> > >> > > ``` >> > >> > > >> > >> > > 5. SessionStore is a bit different as it has fetch/find sessions >> > >> spread >> > >> > > between SessionStore and ReadOnlySessionStore. >> > >> > > >> > >> > > I'd propose a new interface `SessionBackwardStore` to expose >> backward >> > >> find >> > >> > > operations: >> > >> > > >> > >> > > ```java >> > >> > > public interface SessionBackwardStore<K, AGG> { >> > >> > > KeyValueIterator<Windowed<K>, AGG> >> backwardFindSessions(final K >> > >> key, >> > >> > > final long earliestSessionEndTime, final long >> latestSessionStartTime); >> > >> > > >> > >> > > KeyValueIterator<Windowed<K>, AGG> >> backwardFindSessions(final K >> > >> > > keyFrom, final K keyTo, final long earliestSessionEndTime, final >> long >> > >> > > latestSessionStartTime); >> > >> > > } >> > >> > > ``` >> > >> > > >> > >> > > If this understanding is correct I'll proceed to update the KIP >> based >> > >> on >> > >> > > this. >> > >> > > >> > >> > > Looking forward to your feedback. >> > >> > > >> > >> > > Thanks, >> > >> > > Jorge. >> > >> > > >> > >> > > On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax < >> mj...@apache.org> >> > >> wrote: >> > >> > > >> > >> > >> Hey, >> > >> > >> >> > >> > >> Sorry that I am late to the game. I am not 100% convinced about >> the >> > >> > >> current proposal. Using a new config as feature flag seems to be >> > >> rather >> > >> > >> "nasty" to me, and flipping from/to is a little bit too fancy >> for my >> > >> > >> personal taste. >> > >> > >> >> > >> > >> I agree, that the original proposal using a "ReadDirection" >> enum is >> > >> not >> > >> > >> ideal either. >> > >> > >> >> > >> > >> Thus, I would like to put out a new idea: We could add a new >> > >> interface >> > >> > >> that offers new methods that return revers iterators. >> > >> > >> >> > >> > >> The KIP already proposes to add `reverseAll()` and it seems >> backward >> > >> > >> incompatible to just add this method to `ReadOnlyKeyValueStore` >> and >> > >> > >> `ReadOnlyWindowStore`. I don't think we could provide a useful >> > >> default >> > >> > >> implementation for custom stores and thus either break >> compatibility >> > >> or >> > >> > >> need add a default that just throws an exception. Neither seems >> to >> > >> be a >> > >> > >> good option. >> > >> > >> >> > >> > >> Using a new interface avoid this issue and allows users >> implementing >> > >> > >> custom stores to opt-in by adding the interface to their stores. >> > >> > >> Furthermore, we don't need any config. In the end, we >> encapsulte the >> > >> > >> change into the store, and our runtime is agnostic to it (as it >> > >> should >> > >> > >> be). >> > >> > >> >> > >> > >> The hierarchy becomes a little complex (but uses would not >> really see >> > >> > >> the complexity): >> > >> > >> >> > >> > >> // exsiting >> > >> > >> ReadOnlyKeyValueStore >> > >> > >> KeyValueStore extend StateStore, ReadOnlyKeyValueStore >> > >> > >> >> > >> > >> >> > >> > >> // helper interface; users don't care >> > >> > >> // need similar ones for other stores >> > >> > >> ReverseReadOnlyKeyValueStore { >> > >> > >> KeyValueIterator<K, V> reverseRange(K from, K to); >> > >> > >> KeyValueIterator<K, V> reverseAll(); >> > >> > >> } >> > >> > >> >> > >> > >> >> > >> > >> // two new user facing interfaces for kv-store >> > >> > >> // need similar ones for other stores >> > >> > >> ReadOnlyKeyValueStoreWithReverseIterators extends >> > >> ReadOnlyKeyValueStore, >> > >> > >> ReverseReadOnlyKeyValueStore >> > >> > >> >> > >> > >> KeyValueStoreWithReverseIterators extends KeyValueStore, >> > >> > >> ReverseReadOnlyKeyValueStore >> > >> > >> >> > >> > >> >> > >> > >> // updated (also internal) >> > >> > >> // also need to update other built-in stores >> > >> > >> RocksDB implements KeyValueStoreWithReverseIterators, >> > >> BulkLoadingStore >> > >> > >> >> > >> > >> >> > >> > >> In the end, user would only care about the two (for kv-case) new >> > >> > >> interface that offer revers iterator (read only and regular) >> and can >> > >> > >> cast stores accordingly in their Processors/Transformers or via >> IQ. >> > >> > >> >> > >> > >> >> > >> > >> Btw: if we add revers iterator for KeyValue and Window store, >> should >> > >> we >> > >> > >> do the same for Session store? >> > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> This might be more code to write, but I believe it provides the >> > >> better >> > >> > >> user experience. Thoughts? >> > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> -Matthias >> > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> >> > >> > >> On 5/26/20 8:47 PM, John Roesler wrote: >> > >> > >> > Sorry for my silence, Jorge, >> > >> > >> > >> > >> > >> > I've just taken another look, and I'm personally happy with >> the >> > >> KIP. >> > >> > >> > >> > >> > >> > Thanks, >> > >> > >> > -John >> > >> > >> > >> > >> > >> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya >> wrote: >> > >> > >> >> If no additional comments, I will proceed to start the a vote >> > >> thread. >> > >> > >> >> >> > >> > >> >> Thanks a lot for your feedback! >> > >> > >> >> >> > >> > >> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya >> < >> > >> > >> >> quilcate.jo...@gmail.com> wrote: >> > >> > >> >> >> > >> > >> >>> Thanks Sophie. I like the `reverseAll()` idea. >> > >> > >> >>> >> > >> > >> >>> I updated the KIP with your feedback. >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> >> > >> > >> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman < >> > >> > >> sop...@confluent.io> >> > >> > >> >>> wrote: >> > >> > >> >>> >> > >> > >> >>>> Hm, the case of `all()` does seem to present a dilemma in >> the >> > >> case of >> > >> > >> >>>> variable-length keys. >> > >> > >> >>>> >> > >> > >> >>>> In the case of fixed-length keys, you can just compute the >> keys >> > >> that >> > >> > >> >>>> correspond >> > >> > >> >>>> to the maximum and minimum serialized bytes, then perform a >> > >> `range()` >> > >> > >> >>>> query >> > >> > >> >>>> instead of an `all()`. If your keys don't have a >> well-defined >> > >> > >> ordering >> > >> > >> >>>> such >> > >> > >> >>>> that >> > >> > >> >>>> you can't determine the MAX_KEY, then you probably don't >> care >> > >> about >> > >> > >> the >> > >> > >> >>>> iterator order anyway. >> > >> > >> >>>> >> > >> > >> >>>> But with variable-length keys, there is no MAX_KEY. If >> all your >> > >> > >> keys were >> > >> > >> >>>> just >> > >> > >> >>>> of the form 'a', 'aa', 'aaaaa', 'aaaaaaa' then in fact the >> only >> > >> way >> > >> > >> to >> > >> > >> >>>> figure out the >> > >> > >> >>>> maximum key in the store is by using `all()` -- and >> without a >> > >> reverse >> > >> > >> >>>> iterator, you're >> > >> > >> >>>> doomed to iterate through every single key just to answer >> that >> > >> simple >> > >> > >> >>>> question. >> > >> > >> >>>> >> > >> > >> >>>> That said, I still think determining the iterator order >> based >> > >> on the >> > >> > >> >>>> to/from bytes >> > >> > >> >>>> makes a lot of intuitive sense and gives the API a nice >> > >> symmetry. >> > >> > >> What if >> > >> > >> >>>> we >> > >> > >> >>>> solved the `all()` problem by just giving `all()` a reverse >> > >> form to >> > >> > >> >>>> complement it? >> > >> > >> >>>> Ie we would have `all()` and `reverseAll()`, or something >> to >> > >> that >> > >> > >> effect. >> > >> > >> >>>> >> > >> > >> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate >> Otoya < >> > >> > >> >>>> quilcate.jo...@gmail.com> wrote: >> > >> > >> >>>> >> > >> > >> >>>>> Thanks John. >> > >> > >> >>>>> >> > >> > >> >>>>> Agree. I like the first approach as well, with >> StreamsConfig >> > >> flag >> > >> > >> >>>> passing >> > >> > >> >>>>> by via ProcessorContext. >> > >> > >> >>>>> >> > >> > >> >>>>> Another positive effect with "reverse parameters" is that >> in >> > >> the >> > >> > >> case of >> > >> > >> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide >> > >> _which_ >> > >> > >> pair >> > >> > >> >>>> to >> > >> > >> >>>>> flip, whether with `ReadDirection` enum it apply to both. >> > >> > >> >>>>> >> > >> > >> >>>>> The only issue I've found while reviewing the KIP is that >> > >> `all()` >> > >> > >> won't >> > >> > >> >>>> fit >> > >> > >> >>>>> within this approach. >> > >> > >> >>>>> >> > >> > >> >>>>> We could remove it from the KIP and argue that for >> WindowStore, >> > >> > >> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in >> reverse >> > >> > >> order, >> > >> > >> >>>> and >> > >> > >> >>>>> for KeyValueStore, no ordering guarantees are provided. >> > >> > >> >>>>> >> > >> > >> >>>>> If there is consensus with this changes, I will go and >> update >> > >> the >> > >> > >> KIP. >> > >> > >> >>>>> >> > >> > >> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler < >> > >> vvcep...@apache.org> >> > >> > >> >>>> wrote: >> > >> > >> >>>>> >> > >> > >> >>>>>> Hi Jorge, >> > >> > >> >>>>>> >> > >> > >> >>>>>> Thanks for that idea. I agree, a feature flag would >> protect >> > >> anyone >> > >> > >> >>>>>> who may be depending on the current behavior. >> > >> > >> >>>>>> >> > >> > >> >>>>>> It seems better to locate the feature flag in the >> > >> initialization >> > >> > >> >>>> logic of >> > >> > >> >>>>>> the store, rather than have a method on the "live" store >> that >> > >> > >> changes >> > >> > >> >>>>>> its behavior on the fly. >> > >> > >> >>>>>> >> > >> > >> >>>>>> It seems like there are two options here, one is to add >> a new >> > >> > >> config: >> > >> > >> >>>>>> >> > >> > >> >>>>>> StreamsConfig.ENABLE_BACKWARDS_ITERATION = >> > >> > >> >>>>>> "enable.backwards.iteration >> > >> > >> >>>>>> >> > >> > >> >>>>>> Or we can add a feature flag in Materialized, like >> > >> > >> >>>>>> >> > >> > >> >>>>>> Materialized.enableBackwardsIteration() >> > >> > >> >>>>>> >> > >> > >> >>>>>> I think I'd personally lean toward the config, for the >> > >> following >> > >> > >> >>>> reason. >> > >> > >> >>>>>> The concern that Sophie raised is that someone's program >> may >> > >> depend >> > >> > >> >>>>>> on the existing contract of getting an empty iterator. We >> > >> don't >> > >> > >> want >> > >> > >> >>>> to >> > >> > >> >>>>>> switch behavior when they aren't expecting it, so we >> provide >> > >> them a >> > >> > >> >>>>>> config to assert that they _are_ expecting the new >> behavior, >> > >> which >> > >> > >> >>>>>> means they take responsibility for updating their code to >> > >> expect >> > >> > >> the >> > >> > >> >>>> new >> > >> > >> >>>>>> behavior. >> > >> > >> >>>>>> >> > >> > >> >>>>>> There doesn't seem to be a reason to offer a choice of >> > >> behaviors >> > >> > >> on a >> > >> > >> >>>>>> per-query, or per-store basis. We just want people to be >> not >> > >> > >> surprised >> > >> > >> >>>>>> by this change in general. >> > >> > >> >>>>>> >> > >> > >> >>>>>> What do you think? >> > >> > >> >>>>>> Thanks, >> > >> > >> >>>>>> -John >> > >> > >> >>>>>> >> > >> > >> >>>>>> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote: >> > >> > >> >>>>>>> Thank you both for the great feedback. >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> I like the "fancy" proposal :), and how it removes the >> need >> > >> for >> > >> > >> >>>>>>> additional API methods. And with a feature flag on >> > >> `StateStore`, >> > >> > >> >>>>>>> disabled by default, should no break current users. >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> The only side-effect I can think of is that: by moving >> the >> > >> flag >> > >> > >> >>>>> upwards, >> > >> > >> >>>>>>> all later operations become affected; which might be ok >> for >> > >> most >> > >> > >> >>>> (all?) >> > >> > >> >>>>>>> cases. I can't think of an scenario where this would be >> an >> > >> issue, >> > >> > >> >>>> just >> > >> > >> >>>>>>> want to point this out. >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> If moving to this approach, I'd like to check if I got >> this >> > >> right >> > >> > >> >>>>> before >> > >> > >> >>>>>>> updating the KIP: >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> - only `StateStore` will change by having a new method: >> > >> > >> >>>>>>> `backwardIteration()`, `false` by default to keep things >> > >> > >> compatible. >> > >> > >> >>>>>>> - then all `*Stores` will have to update their >> implementation >> > >> > >> based >> > >> > >> >>>> on >> > >> > >> >>>>>>> this flag. >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> On 20/05/2020 21:02, Sophie Blee-Goldman wrote: >> > >> > >> >>>>>>>>> There's no possibility that someone could be relying >> > >> > >> >>>>>>>>> on iterating over that range in increasing order, >> because >> > >> that's >> > >> > >> >>>> not >> > >> > >> >>>>>> what >> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on >> getting >> > >> an >> > >> > >> >>>> empty >> > >> > >> >>>>>>>> iterator >> > >> > >> >>>>>>>> >> > >> > >> >>>>>>>> I just meant that they might be relying on the >> assumption >> > >> that >> > >> > >> the >> > >> > >> >>>>>> range >> > >> > >> >>>>>>>> query >> > >> > >> >>>>>>>> will never return results with decreasing keys. The >> empty >> > >> > >> iterator >> > >> > >> >>>>>> wouldn't >> > >> > >> >>>>>>>> break that contract, but of course a surprise reverse >> > >> iterator >> > >> > >> >>>> would. >> > >> > >> >>>>>>>> >> > >> > >> >>>>>>>> FWIW I actually am in favor of automatically >> converting to a >> > >> > >> >>>> reverse >> > >> > >> >>>>>>>> iterator, >> > >> > >> >>>>>>>> I just thought we should consider whether this should >> be >> > >> off by >> > >> > >> >>>>>> default or >> > >> > >> >>>>>>>> even possible to disable at all. >> > >> > >> >>>>>>>> >> > >> > >> >>>>>>>> On Tue, May 19, 2020 at 7:42 PM John Roesler < >> > >> > >> vvcep...@apache.org >> > >> > >> >>>>> >> > >> > >> >>>>>> wrote: >> > >> > >> >>>>>>>> >> > >> > >> >>>>>>>>> Thanks for the response, Sophie, >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> I wholeheartedly agree we should take as much into >> account >> > >> as >> > >> > >> >>>>> possible >> > >> > >> >>>>>>>>> up front, rather than regretting our decisions later. >> I >> > >> actually >> > >> > >> >>>> do >> > >> > >> >>>>>> share >> > >> > >> >>>>>>>>> your vague sense of worry, which was what led me to >> say >> > >> > >> initially >> > >> > >> >>>>>> that I >> > >> > >> >>>>>>>>> thought my counterproposal might be "too fancy". >> > >> Sometimes, it's >> > >> > >> >>>>>> better >> > >> > >> >>>>>>>>> to be explicit instead of "elegant", if we think more >> > >> people >> > >> > >> >>>> will be >> > >> > >> >>>>>>>>> confused >> > >> > >> >>>>>>>>> than not. >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> I really don't think that there's any danger of >> "relying >> > >> on a >> > >> > >> >>>> bug" >> > >> > >> >>>>>> here, >> > >> > >> >>>>>>>>> although >> > >> > >> >>>>>>>>> people certainly could be relying on current >> behavior. One >> > >> thing >> > >> > >> >>>> to >> > >> > >> >>>>> be >> > >> > >> >>>>>>>>> clear >> > >> > >> >>>>>>>>> about (which I just left a more detailed comment in >> > >> KAFKA-8159 >> > >> > >> >>>>> about) >> > >> > >> >>>>>> is >> > >> > >> >>>>>>>>> that >> > >> > >> >>>>>>>>> when we say something like key1 > key2, this ordering >> is >> > >> defined >> > >> > >> >>>> by >> > >> > >> >>>>>> the >> > >> > >> >>>>>>>>> serde's output and nothing else. >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> Currently, thanks to your fix in >> > >> > >> >>>>>> https://github.com/apache/kafka/pull/6521 >> > >> > >> >>>>>>>>> , >> > >> > >> >>>>>>>>> the store contract is that for range scans, if from > >> to, >> > >> then >> > >> > >> >>>> the >> > >> > >> >>>>>> store >> > >> > >> >>>>>>>>> must >> > >> > >> >>>>>>>>> return an empty iterator. There's no possibility that >> > >> someone >> > >> > >> >>>> could >> > >> > >> >>>>> be >> > >> > >> >>>>>>>>> relying >> > >> > >> >>>>>>>>> on iterating over that range in increasing order, >> because >> > >> that's >> > >> > >> >>>> not >> > >> > >> >>>>>> what >> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on >> getting >> > >> an >> > >> > >> >>>> empty >> > >> > >> >>>>>>>>> iterator. >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> My counterproposal was to actually change this >> contract to >> > >> say >> > >> > >> >>>> that >> > >> > >> >>>>>> the >> > >> > >> >>>>>>>>> store >> > >> > >> >>>>>>>>> must return an iterator over the keys in that range, >> but >> > >> in the >> > >> > >> >>>>>> reverse >> > >> > >> >>>>>>>>> order. >> > >> > >> >>>>>>>>> So, in addition to considering whether this idea is >> "too >> > >> fancy" >> > >> > >> >>>> (aka >> > >> > >> >>>>>>>>> confusing), >> > >> > >> >>>>>>>>> we should also consider the likelihood of breaking an >> > >> existing >> > >> > >> >>>>>> program with >> > >> > >> >>>>>>>>> this behavior/contract change. >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> To echo your clarification, I'm also not advocating >> > >> strongly in >> > >> > >> >>>>> favor >> > >> > >> >>>>>> of my >> > >> > >> >>>>>>>>> proposal. I just wanted to present it for >> consideration >> > >> > >> alongside >> > >> > >> >>>>>> Jorge's >> > >> > >> >>>>>>>>> original one. >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> Thanks for raising these very good points, >> > >> > >> >>>>>>>>> -John >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>>>>> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman >> wrote: >> > >> > >> >>>>>>>>>>> Rather than working around it, I think we should >> just >> > >> fix it >> > >> > >> >>>>>>>>>> Now *that's* a "fancy" idea :P >> > >> > >> >>>>>>>>>> >> > >> > >> >>>>>>>>>> That was my primary concern, although I do have a >> vague >> > >> sense >> > >> > >> of >> > >> > >> >>>>>> worry >> > >> > >> >>>>>>>>>> that we might be allowing users to get into trouble >> > >> without >> > >> > >> >>>>>> realizing it. >> > >> > >> >>>>>>>>>> For example if their custom serdes suffer a similar >> bug >> > >> as the >> > >> > >> >>>>> above, >> > >> > >> >>>>>>>>>> and/or >> > >> > >> >>>>>>>>>> they rely on getting results in increasing order (of >> the >> > >> keys) >> > >> > >> >>>> even >> > >> > >> >>>>>> when >> > >> > >> >>>>>>>>>> to < from. Maybe they're relying on the fact that the >> > >> range >> > >> > >> >>>> query >> > >> > >> >>>>>> returns >> > >> > >> >>>>>>>>>> nothing in that case. >> > >> > >> >>>>>>>>>> >> > >> > >> >>>>>>>>>> Not sure if that qualifies as relying on a bug or >> not, >> > >> but in >> > >> > >> >>>> that >> > >> > >> >>>>>> past >> > >> > >> >>>>>>>>>> we've >> > >> > >> >>>>>>>>>> taken the stance that we should not break >> compatibility >> > >> even if >> > >> > >> >>>> the >> > >> > >> >>>>>> user >> > >> > >> >>>>>>>>>> was relying on bugs or unintentional behavior. >> > >> > >> >>>>>>>>>> >> > >> > >> >>>>>>>>>> Just to clarify I'm not advocating strongly against >> this >> > >> > >> >>>> proposal, >> > >> > >> >>>>>> just >> > >> > >> >>>>>>>>>> laying >> > >> > >> >>>>>>>>>> out some considerations we should take into account. >> At >> > >> the end >> > >> > >> >>>> of >> > >> > >> >>>>>> the >> > >> > >> >>>>>>>>> day >> > >> > >> >>>>>>>>>> we should do what's right rather than maintain >> > >> compatibility >> > >> > >> >>>> with >> > >> > >> >>>>>>>>> existing >> > >> > >> >>>>>>>>>> bugs, but sometimes there's a reasonable middle >> ground. >> > >> > >> >>>>>>>>>> >> > >> > >> >>>>>>>>>> On Tue, May 19, 2020 at 6:15 PM John Roesler < >> > >> > >> >>>> vvcep...@apache.org> >> > >> > >> >>>>>>>>> wrote: >> > >> > >> >>>>>>>>>>> Thanks Sophie, >> > >> > >> >>>>>>>>>>> >> > >> > >> >>>>>>>>>>> Woah, that’s a nasty bug. Rather than working >> around it, >> > >> I >> > >> > >> >>>> think >> > >> > >> >>>>> we >> > >> > >> >>>>>>>>> should >> > >> > >> >>>>>>>>>>> just fix it. I’ll leave some comments on the Jira. >> > >> > >> >>>>>>>>>>> >> > >> > >> >>>>>>>>>>> It doesn’t seem like it should be this KIP’s concern >> > >> that some >> > >> > >> >>>>>> serdes >> > >> > >> >>>>>>>>>>> might be incorrectly written. >> > >> > >> >>>>>>>>>>> >> > >> > >> >>>>>>>>>>> Were there other practical concerns that you had in >> mind? >> > >> > >> >>>>>>>>>>> >> > >> > >> >>>>>>>>>>> Thanks, >> > >> > >> >>>>>>>>>>> John >> > >> > >> >>>>>>>>>>> >> > >> > >> >>>>>>>>>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman >> > >> wrote: >> > >> > >> >>>>>>>>>>>> I like this "fancy idea" to just flip the to/from >> bytes >> > >> but I >> > >> > >> >>>>> think >> > >> > >> >>>>>>>>> there >> > >> > >> >>>>>>>>>>>> are some practical limitations to implementing >> this. In >> > >> > >> >>>>> particular >> > >> > >> >>>>>>>>>>>> I'm thinking about this issue >> > >> > >> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-8159> >> > >> with the >> > >> > >> >>>>>> built-in >> > >> > >> >>>>>>>>>>> signed >> > >> > >> >>>>>>>>>>>> number serdes. >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> This trick would actually fix the problem for >> > >> > >> >>>> negative-negative >> > >> > >> >>>>>>>>> queries >> > >> > >> >>>>>>>>>>>> (ie where to & from are negative) but would cause >> > >> > >> undetectable >> > >> > >> >>>>>>>>>>>> incorrect results for negative-positive queries. >> For >> > >> example, >> > >> > >> >>>> say >> > >> > >> >>>>>> you >> > >> > >> >>>>>>>>>>>> call #range with from = -1 and to = 1, using the >> Short >> > >> > >> serdes. >> > >> > >> >>>>> The >> > >> > >> >>>>>>>>>>>> serialized bytes for that are >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> from = 1111111111111111 >> > >> > >> >>>>>>>>>>>> to = 0000000000000001 >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> so we would end up flipping those and iterating >> over >> > >> all keys >> > >> > >> >>>>> from >> > >> > >> >>>>>>>>>>>> 0000000000000001 to 1111111111111111. Iterating in >> > >> > >> >>>>> lexicographical >> > >> > >> >>>>>>>>>>>> order means we would iterate over every key in the >> space >> > >> > >> >>>> *except* >> > >> > >> >>>>>> for >> > >> > >> >>>>>>>>>>>> 0, but 0 is actually the *only* other key we meant >> to be >> > >> > >> >>>> included >> > >> > >> >>>>>> in >> > >> > >> >>>>>>>>> the >> > >> > >> >>>>>>>>>>>> range query. >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> Currently we just log a warning and return an empty >> > >> iterator >> > >> > >> >>>> when >> > >> > >> >>>>>>>>>>>> to < from, which is obviously also incorrect but >> feels >> > >> > >> >>>> slightly >> > >> > >> >>>>>> more >> > >> > >> >>>>>>>>>>>> palatable. If we start automatically converting to >> > >> reverse >> > >> > >> >>>>> queries >> > >> > >> >>>>>> we >> > >> > >> >>>>>>>>>>>> can't even log a warning in this case unless we >> wanted >> > >> to log >> > >> > >> >>>> a >> > >> > >> >>>>>>>>> warning >> > >> > >> >>>>>>>>>>>> every time, which would be weird to do for a valid >> > >> usage of a >> > >> > >> >>>> new >> > >> > >> >>>>>>>>>>>> feature. >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> All that said, I still like the idea overall. Off >> the >> > >> top of >> > >> > >> >>>> my >> > >> > >> >>>>>> head >> > >> > >> >>>>>>>>> I >> > >> > >> >>>>>>>>>>> guess >> > >> > >> >>>>>>>>>>>> we could add a store config to enable/disable >> automatic >> > >> > >> >>>> reverse >> > >> > >> >>>>>>>>>>> iteration, >> > >> > >> >>>>>>>>>>>> which is off by default? >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> Thanks for the KIP! This will be a nice addition >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> Sophie >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler < >> > >> > >> >>>>> vvcep...@apache.org> >> > >> > >> >>>>>>>>>>> wrote: >> > >> > >> >>>>>>>>>>>>> Hi there Jorge, >> > >> > >> >>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>> Thanks for the KIP! >> > >> > >> >>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>> I think this feature sounds very reasonable. >> > >> > >> >>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>> I'm not 100% sure if this is "too fancy", but >> what do >> > >> you >> > >> > >> >>>> think >> > >> > >> >>>>>>>>>>>>> about avoiding the enum by instead allowing >> people to >> > >> flip >> > >> > >> >>>>>>>>>>>>> the "from" and "to" endpoints? I.e., reading from >> "A" >> > >> to "Z" >> > >> > >> >>>>> would >> > >> > >> >>>>>>>>>>>>> be a forward scan, and from "Z" to "A" would be a >> > >> backward >> > >> > >> >>>> one? >> > >> > >> >>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>> Thanks, >> > >> > >> >>>>>>>>>>>>> -John >> > >> > >> >>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate >> wrote: >> > >> > >> >>>>>>>>>>>>>> Hi everyone, >> > >> > >> >>>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>>> I would like to start the discussion for KIP-617: >> > >> > >> >>>>>>>>>>>>>> >> > >> > >> >>>>>>>>> >> > >> > >> >>>>>> >> > >> > >> >>>>> >> > >> > >> >>>> >> > >> > >> >> > >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards >> > >> > >> >>>>>>>>>>>>>> Looking forward to your feedback. >> > >> > >> >>>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>>> Thanks! >> > >> > >> >>>>>>>>>>>>>> Jorge. >> > >> > >> >>>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>>> >> > >> > >> >>>>>>>>>>>>>> Attachments: >> > >> > >> >>>>>>>>>>>>>> * 0x5F2C6E22064982DF.asc >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> >> > >> > >> >>>>>>> Attachments: >> > >> > >> >>>>>>> * 0x5F2C6E22064982DF.asc >> > >> > >> >>>>>> >> > >> > >> >>>>> >> > >> > >> >>>> >> > >> > >> >>> >> > >> > >> >> >> > >> > >> >> > >> > >> >> > >> > >> > >> >> > > >> > >> >