Hi Jorge, Thanks for the update. I think this is a good plan.
I just took a look at the KIP again, and it looks like there’s a revision error in which two methods are proposed for SessionStore, but seem like they should be in ReadOnlySessionStore. Do I read that right? Otherwise, I’m happy with your proposal. Thanks, John On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote: > Quick update: KIP is updated with latest changes now. > Will leave it open this week while working on the PR. > > Hope to open a new vote thread over the next few days if no additional > feedback is provided. > > Cheers, > Jorge. > > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya < > quilcate.jo...@gmail.com> wrote: > > > Thanks, John! > > > > Make sense to reconsider the current approach. I was heading in a similar > > direction while drafting the implementation. Metered, Caching, and other > > layers will also have to get duplicated to build up new methods in `Stores` > > factory, and class casting issues would appear on stores created by DSL. > > > > I will draft a proposal with new methods (move methods from proposed > > interfaces to existing ones) with default implementation in a KIP update > > and wait for Matthias to chime in to validate this approach. > > > > Jorge. > > > > > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler <vvcep...@apache.org> wrote: > > > >> Hi Jorge, > >> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1 releases. > >> > >> The idea to separate the new methods into "mixin" interfaces seems > >> like a good one, but as we've discovered in KIP-614, it doesn't work > >> out that way in practice. The problem is that the store implementations > >> are just the base layer that get composed with other layers in Streams > >> before they can be accessed in the DSL. This is extremely subtle, so > >> I'm going to put everyone to sleep with a detailed explanation: > >> > >> For example, this is the mechanism by which all KeyValueStore > >> implementations get added to Streams: > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build > >> return new MeteredKeyValueStore<>( > >> maybeWrapCaching(maybeWrapLogging(storeSupplier.get())), > >> storeSupplier.metricsScope(), > >> time, > >> keySerde, > >> valueSerde > >> ); > >> > >> In the DSL, the store that a processor gets from the context would be > >> the result of this composition. So even if the storeSupplier.get() returns > >> a store that implements the "reverse" interface, when you try to use it > >> from a processor like: > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init > >> ReadOnlyBackwardWindowStore<K, V> store = > >> (ReadOnlyBackwardWindowStore<K, V>) context.getStateStore(..) > >> > >> You'd just get a ClassCastException because it's actually a > >> MeteredKeyValueStore, which doesn't implement > >> ReadOnlyBackwardWindowStore. > >> > >> The only way to make this work would be to make the Metered, > >> Caching, and Logging layers also implement the new interfaces, > >> but this effectively forces implementations to also implement > >> the interface. Otherwise, the intermediate layers would have to > >> cast the store in each method, like this: > >> MeteredWindowStore#backwardFetch { > >> ((ReadOnlyBackwardWindowStore<K, V>) innerStore).backwardFetch(..) > >> } > >> > >> And then if the implementation doesn't "opt in" by implementing > >> the interface, you'd get a ClassCastException, not when you get the > >> store, but when you try to use it. > >> > >> The fact that we get ClassCastExceptions no matter which way we > >> turn here indicates that we're really not getting any benefit from the > >> type system, which makes the extra interfaces seem not worth all the > >> code involved. > >> > >> Where we landed in KIP-614 is that, unless we want to completely > >> revamp the way that StateStores work in the DSL, you might as > >> well just add the new methods to the existing interfaces. To prevent > >> compilation errors, we can add default implementations that throw > >> UnsupportedOperationException. If a store doesn't opt in by > >> implementing the methods, you'd get an UnsupportedOperationException, > >> which seems no worse, and maybe better, than the ClassCastException > >> you'd get if we go with the "mixin interface" approach. > >> > >> A quick note: This entire discussion focuses on the DSL. If you're just > >> using the Processor API by directly adding the a custom store to the > >> Topology: > >> org.apache.kafka.streams.Topology#addStateStore > >> and then retrieving it in the processor via: > >> org.apache.kafka.streams.processor.ProcessorContext#getStateStore > >> in > >> org.apache.kafka.streams.processor.Processor#init > >> > >> Then, you can both register and retrieve _any_ StateStore implementation. > >> There's no need to use KeyValueStore or any other built-in interface. > >> In other words, KeyValueStore and company are only part of the DSL, > >> not the PAPI. So, discussions about the build-in store interfaces are only > >> really relevant in the context of the DSL, Transformers, and Materialized. > >> > >> So, in conclusion, I'd really recommend just adding any new methods to > >> the existing store interfaces. We might be able to revamp the API in the > >> future to support mixins, but it's a much larger scope of work than this > >> KIP. > >> A more minor comment is that we don't need to add Deprecated variants > >> of new methods. > >> > >> Thanks again, and once again, I'm sorry I tuned out and didn't offer this > >> feedback before you revised the KIP. > >> -John > >> > >> > >> > >> > >> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote: > >> > Hi everyone, > >> > > >> > I've updated the KIP, applying Matthias' feedback regarding interface > >> > hierarchy. > >> > > >> > Also, following the last email, I think we can consider reverse > >> operations > >> > on KeyValue range as well, as implementation supports lexicographic > >> order. > >> > > >> > I considered different naming between Key-based ranges and Time-based > >> > ranges, and mitigate confusion when fetching keys and time ranges as > >> > WindowStore does: > >> > > >> > Key-based ranges: reverseRange(), reverseAll() > >> > Time-based ranges: backwardFetch() > >> > > >> > Then, key-based changes apply to KeyValueStore, and time-based changes > >> to > >> > Window and Session stores. > >> > > >> > Let me know if you have any questions. > >> > > >> > Thanks, > >> > Jorge. > >> > > >> > > >> > On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya < > >> > quilcate.jo...@gmail.com> wrote: > >> > > >> > > Hi everyone, sorry for the late reply. > >> > > > >> > > Thanks Matthias for your feedback. I think it makes sense to > >> reconsider > >> > > the current design based on your input. > >> > > > >> > > After digging deeper into the current implementation, I'd like to > >> bring my > >> > > current understanding to be double-checked as it might be redefining > >> the > >> > > KIP's scope: > >> > > > >> > > 1. There are 2 ranges been exposed by different stores: > >> > > > >> > > a. Key Range > >> > > b. Timestamp Range > >> > > > >> > > So far, we have discussed covering both. > >> > > > >> > > 2. Key Range functions do not provide ordering guarantees by design: > >> > > > >> > > ```ReadOnlyKeyValueStore.java > >> > > /** > >> > > * Get an iterator over a given range of keys. This iterator must > >> be > >> > > closed after use. > >> > > * The returned iterator must be safe from {@link > >> > > java.util.ConcurrentModificationException}s > >> > > * and must not return null values. No ordering guarantees are > >> > > provided. > >> > > * ... > >> > > */ > >> > > KeyValueIterator<K, V> range(K from, K to); > >> > > ``` > >> > > > >> > > Therefore, I'd propose removing Key range operations from the scope. > >> > > > >> > > 3. Timestamp Range operations happen at the SegmentsStore level > >> (internal) > >> > > API > >> > > > >> > > AFAICT, Segments wrappers handle all Timestamp ranges queries. > >> > > > >> > > I'd propose extending `Segments#segments(timeFrom, timeTo, backwards)` > >> > > with a flag for backwards operations. > >> > > > >> > > As segments returned will be processed backwards, I'm not extending > >> > > KeyValueStores to query each segment backwards as previous point 2. > >> > > > >> > > 4. Extend WindowStores implementations with a new > >> > > WindowBackwardStore/ReadOnlyBackwardStore: > >> > > > >> > > ```java > >> > > public interface ReadOnlyWindowBackwardStore<K, V> { > >> > > WindowStoreIterator<V> backwardFetch(K key, Instant from, Instant > >> to) > >> > > throws IllegalArgumentException; > >> > > > >> > > KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to, > >> Instant > >> > > fromTime, Instant toTime) > >> > > throws IllegalArgumentException; > >> > > > >> > > KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant from, > >> > > Instant to) throws IllegalArgumentException; > >> > > ``` > >> > > > >> > > 5. SessionStore is a bit different as it has fetch/find sessions > >> spread > >> > > between SessionStore and ReadOnlySessionStore. > >> > > > >> > > I'd propose a new interface `SessionBackwardStore` to expose backward > >> find > >> > > operations: > >> > > > >> > > ```java > >> > > public interface SessionBackwardStore<K, AGG> { > >> > > KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final K > >> key, > >> > > final long earliestSessionEndTime, final long latestSessionStartTime); > >> > > > >> > > KeyValueIterator<Windowed<K>, AGG> backwardFindSessions(final K > >> > > keyFrom, final K keyTo, final long earliestSessionEndTime, final long > >> > > latestSessionStartTime); > >> > > } > >> > > ``` > >> > > > >> > > If this understanding is correct I'll proceed to update the KIP based > >> on > >> > > this. > >> > > > >> > > Looking forward to your feedback. > >> > > > >> > > Thanks, > >> > > Jorge. > >> > > > >> > > On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax <mj...@apache.org> > >> wrote: > >> > > > >> > >> Hey, > >> > >> > >> > >> Sorry that I am late to the game. I am not 100% convinced about the > >> > >> current proposal. Using a new config as feature flag seems to be > >> rather > >> > >> "nasty" to me, and flipping from/to is a little bit too fancy for my > >> > >> personal taste. > >> > >> > >> > >> I agree, that the original proposal using a "ReadDirection" enum is > >> not > >> > >> ideal either. > >> > >> > >> > >> Thus, I would like to put out a new idea: We could add a new > >> interface > >> > >> that offers new methods that return revers iterators. > >> > >> > >> > >> The KIP already proposes to add `reverseAll()` and it seems backward > >> > >> incompatible to just add this method to `ReadOnlyKeyValueStore` and > >> > >> `ReadOnlyWindowStore`. I don't think we could provide a useful > >> default > >> > >> implementation for custom stores and thus either break compatibility > >> or > >> > >> need add a default that just throws an exception. Neither seems to > >> be a > >> > >> good option. > >> > >> > >> > >> Using a new interface avoid this issue and allows users implementing > >> > >> custom stores to opt-in by adding the interface to their stores. > >> > >> Furthermore, we don't need any config. In the end, we encapsulte the > >> > >> change into the store, and our runtime is agnostic to it (as it > >> should > >> > >> be). > >> > >> > >> > >> The hierarchy becomes a little complex (but uses would not really see > >> > >> the complexity): > >> > >> > >> > >> // exsiting > >> > >> ReadOnlyKeyValueStore > >> > >> KeyValueStore extend StateStore, ReadOnlyKeyValueStore > >> > >> > >> > >> > >> > >> // helper interface; users don't care > >> > >> // need similar ones for other stores > >> > >> ReverseReadOnlyKeyValueStore { > >> > >> KeyValueIterator<K, V> reverseRange(K from, K to); > >> > >> KeyValueIterator<K, V> reverseAll(); > >> > >> } > >> > >> > >> > >> > >> > >> // two new user facing interfaces for kv-store > >> > >> // need similar ones for other stores > >> > >> ReadOnlyKeyValueStoreWithReverseIterators extends > >> ReadOnlyKeyValueStore, > >> > >> ReverseReadOnlyKeyValueStore > >> > >> > >> > >> KeyValueStoreWithReverseIterators extends KeyValueStore, > >> > >> ReverseReadOnlyKeyValueStore > >> > >> > >> > >> > >> > >> // updated (also internal) > >> > >> // also need to update other built-in stores > >> > >> RocksDB implements KeyValueStoreWithReverseIterators, > >> BulkLoadingStore > >> > >> > >> > >> > >> > >> In the end, user would only care about the two (for kv-case) new > >> > >> interface that offer revers iterator (read only and regular) and can > >> > >> cast stores accordingly in their Processors/Transformers or via IQ. > >> > >> > >> > >> > >> > >> Btw: if we add revers iterator for KeyValue and Window store, should > >> we > >> > >> do the same for Session store? > >> > >> > >> > >> > >> > >> > >> > >> This might be more code to write, but I believe it provides the > >> better > >> > >> user experience. Thoughts? > >> > >> > >> > >> > >> > >> > >> > >> -Matthias > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> On 5/26/20 8:47 PM, John Roesler wrote: > >> > >> > Sorry for my silence, Jorge, > >> > >> > > >> > >> > I've just taken another look, and I'm personally happy with the > >> KIP. > >> > >> > > >> > >> > Thanks, > >> > >> > -John > >> > >> > > >> > >> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya wrote: > >> > >> >> If no additional comments, I will proceed to start the a vote > >> thread. > >> > >> >> > >> > >> >> Thanks a lot for your feedback! > >> > >> >> > >> > >> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate Otoya < > >> > >> >> quilcate.jo...@gmail.com> wrote: > >> > >> >> > >> > >> >>> Thanks Sophie. I like the `reverseAll()` idea. > >> > >> >>> > >> > >> >>> I updated the KIP with your feedback. > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman < > >> > >> sop...@confluent.io> > >> > >> >>> wrote: > >> > >> >>> > >> > >> >>>> Hm, the case of `all()` does seem to present a dilemma in the > >> case of > >> > >> >>>> variable-length keys. > >> > >> >>>> > >> > >> >>>> In the case of fixed-length keys, you can just compute the keys > >> that > >> > >> >>>> correspond > >> > >> >>>> to the maximum and minimum serialized bytes, then perform a > >> `range()` > >> > >> >>>> query > >> > >> >>>> instead of an `all()`. If your keys don't have a well-defined > >> > >> ordering > >> > >> >>>> such > >> > >> >>>> that > >> > >> >>>> you can't determine the MAX_KEY, then you probably don't care > >> about > >> > >> the > >> > >> >>>> iterator order anyway. > >> > >> >>>> > >> > >> >>>> But with variable-length keys, there is no MAX_KEY. If all your > >> > >> keys were > >> > >> >>>> just > >> > >> >>>> of the form 'a', 'aa', 'aaaaa', 'aaaaaaa' then in fact the only > >> way > >> > >> to > >> > >> >>>> figure out the > >> > >> >>>> maximum key in the store is by using `all()` -- and without a > >> reverse > >> > >> >>>> iterator, you're > >> > >> >>>> doomed to iterate through every single key just to answer that > >> simple > >> > >> >>>> question. > >> > >> >>>> > >> > >> >>>> That said, I still think determining the iterator order based > >> on the > >> > >> >>>> to/from bytes > >> > >> >>>> makes a lot of intuitive sense and gives the API a nice > >> symmetry. > >> > >> What if > >> > >> >>>> we > >> > >> >>>> solved the `all()` problem by just giving `all()` a reverse > >> form to > >> > >> >>>> complement it? > >> > >> >>>> Ie we would have `all()` and `reverseAll()`, or something to > >> that > >> > >> effect. > >> > >> >>>> > >> > >> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate Otoya < > >> > >> >>>> quilcate.jo...@gmail.com> wrote: > >> > >> >>>> > >> > >> >>>>> Thanks John. > >> > >> >>>>> > >> > >> >>>>> Agree. I like the first approach as well, with StreamsConfig > >> flag > >> > >> >>>> passing > >> > >> >>>>> by via ProcessorContext. > >> > >> >>>>> > >> > >> >>>>> Another positive effect with "reverse parameters" is that in > >> the > >> > >> case of > >> > >> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can decide > >> _which_ > >> > >> pair > >> > >> >>>> to > >> > >> >>>>> flip, whether with `ReadDirection` enum it apply to both. > >> > >> >>>>> > >> > >> >>>>> The only issue I've found while reviewing the KIP is that > >> `all()` > >> > >> won't > >> > >> >>>> fit > >> > >> >>>>> within this approach. > >> > >> >>>>> > >> > >> >>>>> We could remove it from the KIP and argue that for WindowStore, > >> > >> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in reverse > >> > >> order, > >> > >> >>>> and > >> > >> >>>>> for KeyValueStore, no ordering guarantees are provided. > >> > >> >>>>> > >> > >> >>>>> If there is consensus with this changes, I will go and update > >> the > >> > >> KIP. > >> > >> >>>>> > >> > >> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler < > >> vvcep...@apache.org> > >> > >> >>>> wrote: > >> > >> >>>>> > >> > >> >>>>>> Hi Jorge, > >> > >> >>>>>> > >> > >> >>>>>> Thanks for that idea. I agree, a feature flag would protect > >> anyone > >> > >> >>>>>> who may be depending on the current behavior. > >> > >> >>>>>> > >> > >> >>>>>> It seems better to locate the feature flag in the > >> initialization > >> > >> >>>> logic of > >> > >> >>>>>> the store, rather than have a method on the "live" store that > >> > >> changes > >> > >> >>>>>> its behavior on the fly. > >> > >> >>>>>> > >> > >> >>>>>> It seems like there are two options here, one is to add a new > >> > >> config: > >> > >> >>>>>> > >> > >> >>>>>> StreamsConfig.ENABLE_BACKWARDS_ITERATION = > >> > >> >>>>>> "enable.backwards.iteration > >> > >> >>>>>> > >> > >> >>>>>> Or we can add a feature flag in Materialized, like > >> > >> >>>>>> > >> > >> >>>>>> Materialized.enableBackwardsIteration() > >> > >> >>>>>> > >> > >> >>>>>> I think I'd personally lean toward the config, for the > >> following > >> > >> >>>> reason. > >> > >> >>>>>> The concern that Sophie raised is that someone's program may > >> depend > >> > >> >>>>>> on the existing contract of getting an empty iterator. We > >> don't > >> > >> want > >> > >> >>>> to > >> > >> >>>>>> switch behavior when they aren't expecting it, so we provide > >> them a > >> > >> >>>>>> config to assert that they _are_ expecting the new behavior, > >> which > >> > >> >>>>>> means they take responsibility for updating their code to > >> expect > >> > >> the > >> > >> >>>> new > >> > >> >>>>>> behavior. > >> > >> >>>>>> > >> > >> >>>>>> There doesn't seem to be a reason to offer a choice of > >> behaviors > >> > >> on a > >> > >> >>>>>> per-query, or per-store basis. We just want people to be not > >> > >> surprised > >> > >> >>>>>> by this change in general. > >> > >> >>>>>> > >> > >> >>>>>> What do you think? > >> > >> >>>>>> Thanks, > >> > >> >>>>>> -John > >> > >> >>>>>> > >> > >> >>>>>> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote: > >> > >> >>>>>>> Thank you both for the great feedback. > >> > >> >>>>>>> > >> > >> >>>>>>> I like the "fancy" proposal :), and how it removes the need > >> for > >> > >> >>>>>>> additional API methods. And with a feature flag on > >> `StateStore`, > >> > >> >>>>>>> disabled by default, should no break current users. > >> > >> >>>>>>> > >> > >> >>>>>>> The only side-effect I can think of is that: by moving the > >> flag > >> > >> >>>>> upwards, > >> > >> >>>>>>> all later operations become affected; which might be ok for > >> most > >> > >> >>>> (all?) > >> > >> >>>>>>> cases. I can't think of an scenario where this would be an > >> issue, > >> > >> >>>> just > >> > >> >>>>>>> want to point this out. > >> > >> >>>>>>> > >> > >> >>>>>>> If moving to this approach, I'd like to check if I got this > >> right > >> > >> >>>>> before > >> > >> >>>>>>> updating the KIP: > >> > >> >>>>>>> > >> > >> >>>>>>> - only `StateStore` will change by having a new method: > >> > >> >>>>>>> `backwardIteration()`, `false` by default to keep things > >> > >> compatible. > >> > >> >>>>>>> - then all `*Stores` will have to update their implementation > >> > >> based > >> > >> >>>> on > >> > >> >>>>>>> this flag. > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> On 20/05/2020 21:02, Sophie Blee-Goldman wrote: > >> > >> >>>>>>>>> There's no possibility that someone could be relying > >> > >> >>>>>>>>> on iterating over that range in increasing order, because > >> that's > >> > >> >>>> not > >> > >> >>>>>> what > >> > >> >>>>>>>>> happens. However, they could indeed be relying on getting > >> an > >> > >> >>>> empty > >> > >> >>>>>>>> iterator > >> > >> >>>>>>>> > >> > >> >>>>>>>> I just meant that they might be relying on the assumption > >> that > >> > >> the > >> > >> >>>>>> range > >> > >> >>>>>>>> query > >> > >> >>>>>>>> will never return results with decreasing keys. The empty > >> > >> iterator > >> > >> >>>>>> wouldn't > >> > >> >>>>>>>> break that contract, but of course a surprise reverse > >> iterator > >> > >> >>>> would. > >> > >> >>>>>>>> > >> > >> >>>>>>>> FWIW I actually am in favor of automatically converting to a > >> > >> >>>> reverse > >> > >> >>>>>>>> iterator, > >> > >> >>>>>>>> I just thought we should consider whether this should be > >> off by > >> > >> >>>>>> default or > >> > >> >>>>>>>> even possible to disable at all. > >> > >> >>>>>>>> > >> > >> >>>>>>>> On Tue, May 19, 2020 at 7:42 PM John Roesler < > >> > >> vvcep...@apache.org > >> > >> >>>>> > >> > >> >>>>>> wrote: > >> > >> >>>>>>>> > >> > >> >>>>>>>>> Thanks for the response, Sophie, > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> I wholeheartedly agree we should take as much into account > >> as > >> > >> >>>>> possible > >> > >> >>>>>>>>> up front, rather than regretting our decisions later. I > >> actually > >> > >> >>>> do > >> > >> >>>>>> share > >> > >> >>>>>>>>> your vague sense of worry, which was what led me to say > >> > >> initially > >> > >> >>>>>> that I > >> > >> >>>>>>>>> thought my counterproposal might be "too fancy". > >> Sometimes, it's > >> > >> >>>>>> better > >> > >> >>>>>>>>> to be explicit instead of "elegant", if we think more > >> people > >> > >> >>>> will be > >> > >> >>>>>>>>> confused > >> > >> >>>>>>>>> than not. > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> I really don't think that there's any danger of "relying > >> on a > >> > >> >>>> bug" > >> > >> >>>>>> here, > >> > >> >>>>>>>>> although > >> > >> >>>>>>>>> people certainly could be relying on current behavior. One > >> thing > >> > >> >>>> to > >> > >> >>>>> be > >> > >> >>>>>>>>> clear > >> > >> >>>>>>>>> about (which I just left a more detailed comment in > >> KAFKA-8159 > >> > >> >>>>> about) > >> > >> >>>>>> is > >> > >> >>>>>>>>> that > >> > >> >>>>>>>>> when we say something like key1 > key2, this ordering is > >> defined > >> > >> >>>> by > >> > >> >>>>>> the > >> > >> >>>>>>>>> serde's output and nothing else. > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> Currently, thanks to your fix in > >> > >> >>>>>> https://github.com/apache/kafka/pull/6521 > >> > >> >>>>>>>>> , > >> > >> >>>>>>>>> the store contract is that for range scans, if from > to, > >> then > >> > >> >>>> the > >> > >> >>>>>> store > >> > >> >>>>>>>>> must > >> > >> >>>>>>>>> return an empty iterator. There's no possibility that > >> someone > >> > >> >>>> could > >> > >> >>>>> be > >> > >> >>>>>>>>> relying > >> > >> >>>>>>>>> on iterating over that range in increasing order, because > >> that's > >> > >> >>>> not > >> > >> >>>>>> what > >> > >> >>>>>>>>> happens. However, they could indeed be relying on getting > >> an > >> > >> >>>> empty > >> > >> >>>>>>>>> iterator. > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> My counterproposal was to actually change this contract to > >> say > >> > >> >>>> that > >> > >> >>>>>> the > >> > >> >>>>>>>>> store > >> > >> >>>>>>>>> must return an iterator over the keys in that range, but > >> in the > >> > >> >>>>>> reverse > >> > >> >>>>>>>>> order. > >> > >> >>>>>>>>> So, in addition to considering whether this idea is "too > >> fancy" > >> > >> >>>> (aka > >> > >> >>>>>>>>> confusing), > >> > >> >>>>>>>>> we should also consider the likelihood of breaking an > >> existing > >> > >> >>>>>> program with > >> > >> >>>>>>>>> this behavior/contract change. > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> To echo your clarification, I'm also not advocating > >> strongly in > >> > >> >>>>> favor > >> > >> >>>>>> of my > >> > >> >>>>>>>>> proposal. I just wanted to present it for consideration > >> > >> alongside > >> > >> >>>>>> Jorge's > >> > >> >>>>>>>>> original one. > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> Thanks for raising these very good points, > >> > >> >>>>>>>>> -John > >> > >> >>>>>>>>> > >> > >> >>>>>>>>> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman wrote: > >> > >> >>>>>>>>>>> Rather than working around it, I think we should just > >> fix it > >> > >> >>>>>>>>>> Now *that's* a "fancy" idea :P > >> > >> >>>>>>>>>> > >> > >> >>>>>>>>>> That was my primary concern, although I do have a vague > >> sense > >> > >> of > >> > >> >>>>>> worry > >> > >> >>>>>>>>>> that we might be allowing users to get into trouble > >> without > >> > >> >>>>>> realizing it. > >> > >> >>>>>>>>>> For example if their custom serdes suffer a similar bug > >> as the > >> > >> >>>>> above, > >> > >> >>>>>>>>>> and/or > >> > >> >>>>>>>>>> they rely on getting results in increasing order (of the > >> keys) > >> > >> >>>> even > >> > >> >>>>>> when > >> > >> >>>>>>>>>> to < from. Maybe they're relying on the fact that the > >> range > >> > >> >>>> query > >> > >> >>>>>> returns > >> > >> >>>>>>>>>> nothing in that case. > >> > >> >>>>>>>>>> > >> > >> >>>>>>>>>> Not sure if that qualifies as relying on a bug or not, > >> but in > >> > >> >>>> that > >> > >> >>>>>> past > >> > >> >>>>>>>>>> we've > >> > >> >>>>>>>>>> taken the stance that we should not break compatibility > >> even if > >> > >> >>>> the > >> > >> >>>>>> user > >> > >> >>>>>>>>>> was relying on bugs or unintentional behavior. > >> > >> >>>>>>>>>> > >> > >> >>>>>>>>>> Just to clarify I'm not advocating strongly against this > >> > >> >>>> proposal, > >> > >> >>>>>> just > >> > >> >>>>>>>>>> laying > >> > >> >>>>>>>>>> out some considerations we should take into account. At > >> the end > >> > >> >>>> of > >> > >> >>>>>> the > >> > >> >>>>>>>>> day > >> > >> >>>>>>>>>> we should do what's right rather than maintain > >> compatibility > >> > >> >>>> with > >> > >> >>>>>>>>> existing > >> > >> >>>>>>>>>> bugs, but sometimes there's a reasonable middle ground. > >> > >> >>>>>>>>>> > >> > >> >>>>>>>>>> On Tue, May 19, 2020 at 6:15 PM John Roesler < > >> > >> >>>> vvcep...@apache.org> > >> > >> >>>>>>>>> wrote: > >> > >> >>>>>>>>>>> Thanks Sophie, > >> > >> >>>>>>>>>>> > >> > >> >>>>>>>>>>> Woah, that’s a nasty bug. Rather than working around it, > >> I > >> > >> >>>> think > >> > >> >>>>> we > >> > >> >>>>>>>>> should > >> > >> >>>>>>>>>>> just fix it. I’ll leave some comments on the Jira. > >> > >> >>>>>>>>>>> > >> > >> >>>>>>>>>>> It doesn’t seem like it should be this KIP’s concern > >> that some > >> > >> >>>>>> serdes > >> > >> >>>>>>>>>>> might be incorrectly written. > >> > >> >>>>>>>>>>> > >> > >> >>>>>>>>>>> Were there other practical concerns that you had in mind? > >> > >> >>>>>>>>>>> > >> > >> >>>>>>>>>>> Thanks, > >> > >> >>>>>>>>>>> John > >> > >> >>>>>>>>>>> > >> > >> >>>>>>>>>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman > >> wrote: > >> > >> >>>>>>>>>>>> I like this "fancy idea" to just flip the to/from bytes > >> but I > >> > >> >>>>> think > >> > >> >>>>>>>>> there > >> > >> >>>>>>>>>>>> are some practical limitations to implementing this. In > >> > >> >>>>> particular > >> > >> >>>>>>>>>>>> I'm thinking about this issue > >> > >> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-8159> > >> with the > >> > >> >>>>>> built-in > >> > >> >>>>>>>>>>> signed > >> > >> >>>>>>>>>>>> number serdes. > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> This trick would actually fix the problem for > >> > >> >>>> negative-negative > >> > >> >>>>>>>>> queries > >> > >> >>>>>>>>>>>> (ie where to & from are negative) but would cause > >> > >> undetectable > >> > >> >>>>>>>>>>>> incorrect results for negative-positive queries. For > >> example, > >> > >> >>>> say > >> > >> >>>>>> you > >> > >> >>>>>>>>>>>> call #range with from = -1 and to = 1, using the Short > >> > >> serdes. > >> > >> >>>>> The > >> > >> >>>>>>>>>>>> serialized bytes for that are > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> from = 1111111111111111 > >> > >> >>>>>>>>>>>> to = 0000000000000001 > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> so we would end up flipping those and iterating over > >> all keys > >> > >> >>>>> from > >> > >> >>>>>>>>>>>> 0000000000000001 to 1111111111111111. Iterating in > >> > >> >>>>> lexicographical > >> > >> >>>>>>>>>>>> order means we would iterate over every key in the space > >> > >> >>>> *except* > >> > >> >>>>>> for > >> > >> >>>>>>>>>>>> 0, but 0 is actually the *only* other key we meant to be > >> > >> >>>> included > >> > >> >>>>>> in > >> > >> >>>>>>>>> the > >> > >> >>>>>>>>>>>> range query. > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> Currently we just log a warning and return an empty > >> iterator > >> > >> >>>> when > >> > >> >>>>>>>>>>>> to < from, which is obviously also incorrect but feels > >> > >> >>>> slightly > >> > >> >>>>>> more > >> > >> >>>>>>>>>>>> palatable. If we start automatically converting to > >> reverse > >> > >> >>>>> queries > >> > >> >>>>>> we > >> > >> >>>>>>>>>>>> can't even log a warning in this case unless we wanted > >> to log > >> > >> >>>> a > >> > >> >>>>>>>>> warning > >> > >> >>>>>>>>>>>> every time, which would be weird to do for a valid > >> usage of a > >> > >> >>>> new > >> > >> >>>>>>>>>>>> feature. > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> All that said, I still like the idea overall. Off the > >> top of > >> > >> >>>> my > >> > >> >>>>>> head > >> > >> >>>>>>>>> I > >> > >> >>>>>>>>>>> guess > >> > >> >>>>>>>>>>>> we could add a store config to enable/disable automatic > >> > >> >>>> reverse > >> > >> >>>>>>>>>>> iteration, > >> > >> >>>>>>>>>>>> which is off by default? > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> Thanks for the KIP! This will be a nice addition > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> Sophie > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> > >> > >> >>>>>>>>>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler < > >> > >> >>>>> vvcep...@apache.org> > >> > >> >>>>>>>>>>> wrote: > >> > >> >>>>>>>>>>>>> Hi there Jorge, > >> > >> >>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>> Thanks for the KIP! > >> > >> >>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>> I think this feature sounds very reasonable. > >> > >> >>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>> I'm not 100% sure if this is "too fancy", but what do > >> you > >> > >> >>>> think > >> > >> >>>>>>>>>>>>> about avoiding the enum by instead allowing people to > >> flip > >> > >> >>>>>>>>>>>>> the "from" and "to" endpoints? I.e., reading from "A" > >> to "Z" > >> > >> >>>>> would > >> > >> >>>>>>>>>>>>> be a forward scan, and from "Z" to "A" would be a > >> backward > >> > >> >>>> one? > >> > >> >>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>> Thanks, > >> > >> >>>>>>>>>>>>> -John > >> > >> >>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate wrote: > >> > >> >>>>>>>>>>>>>> Hi everyone, > >> > >> >>>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>>> I would like to start the discussion for KIP-617: > >> > >> >>>>>>>>>>>>>> > >> > >> >>>>>>>>> > >> > >> >>>>>> > >> > >> >>>>> > >> > >> >>>> > >> > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards > >> > >> >>>>>>>>>>>>>> Looking forward to your feedback. > >> > >> >>>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>>> Thanks! > >> > >> >>>>>>>>>>>>>> Jorge. > >> > >> >>>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>>> > >> > >> >>>>>>>>>>>>>> Attachments: > >> > >> >>>>>>>>>>>>>> * 0x5F2C6E22064982DF.asc > >> > >> >>>>>>> > >> > >> >>>>>>> > >> > >> >>>>>>> Attachments: > >> > >> >>>>>>> * 0x5F2C6E22064982DF.asc > >> > >> >>>>>> > >> > >> >>>>> > >> > >> >>>> > >> > >> >>> > >> > >> >> > >> > >> > >> > >> > >> > > >> > > >