(sorry for the spam) Actually `findSessions` are only deprecated on `InMemorySessionStore`, which seems strange as RocksDB and interfaces haven't marked these methods as deprecated.
Any hint on how to handle this? Cheers, On Thu, Jul 2, 2020 at 4:57 PM Jorge Esteban Quilcate Otoya < quilcate.jo...@gmail.com> wrote: > @John: I can see there are some deprecations in there as well. Will update > the KIP. > > Thanks, > Jorge > > > On Thu, Jul 2, 2020 at 3:29 PM Jorge Esteban Quilcate Otoya < > quilcate.jo...@gmail.com> wrote: > >> Thanks John. >> >> > it looks like there’s a revision error in which two methods are >> proposed for SessionStore, but seem like they should be in >> ReadOnlySessionStore. Do I read that right? >> >> Yes, I've opted to keep the new methods alongside the existing ones. In >> the case of SessionStore, `findSessions` are in `SessionStore`, and `fetch` >> are in `ReadOnlySessionStore`. If it makes more sense, I can move all of >> them to ReadOnlySessionStore. >> Let me know what you think. >> >> Thanks, >> Jorge. >> >> On Thu, Jul 2, 2020 at 2:36 PM John Roesler <vvcep...@apache.org> wrote: >> >>> Hi Jorge, >>> >>> Thanks for the update. I think this is a good plan. >>> >>> I just took a look at the KIP again, and it looks like there’s a >>> revision error in which two methods are proposed for SessionStore, but seem >>> like they should be in ReadOnlySessionStore. Do I read that right? >>> >>> Otherwise, I’m happy with your proposal. >>> >>> Thanks, >>> John >>> >>> On Wed, Jul 1, 2020, at 17:01, Jorge Esteban Quilcate Otoya wrote: >>> > Quick update: KIP is updated with latest changes now. >>> > Will leave it open this week while working on the PR. >>> > >>> > Hope to open a new vote thread over the next few days if no additional >>> > feedback is provided. >>> > >>> > Cheers, >>> > Jorge. >>> > >>> > On Mon, Jun 29, 2020 at 11:30 AM Jorge Esteban Quilcate Otoya < >>> > quilcate.jo...@gmail.com> wrote: >>> > >>> > > Thanks, John! >>> > > >>> > > Make sense to reconsider the current approach. I was heading in a >>> similar >>> > > direction while drafting the implementation. Metered, Caching, and >>> other >>> > > layers will also have to get duplicated to build up new methods in >>> `Stores` >>> > > factory, and class casting issues would appear on stores created by >>> DSL. >>> > > >>> > > I will draft a proposal with new methods (move methods from proposed >>> > > interfaces to existing ones) with default implementation in a KIP >>> update >>> > > and wait for Matthias to chime in to validate this approach. >>> > > >>> > > Jorge. >>> > > >>> > > >>> > > On Sat, Jun 27, 2020 at 4:01 PM John Roesler <vvcep...@apache.org> >>> wrote: >>> > > >>> > >> Hi Jorge, >>> > >> >>> > >> Sorry for my silence, I've been absorbed with the 2.6 and 2.5.1 >>> releases. >>> > >> >>> > >> The idea to separate the new methods into "mixin" interfaces seems >>> > >> like a good one, but as we've discovered in KIP-614, it doesn't work >>> > >> out that way in practice. The problem is that the store >>> implementations >>> > >> are just the base layer that get composed with other layers in >>> Streams >>> > >> before they can be accessed in the DSL. This is extremely subtle, so >>> > >> I'm going to put everyone to sleep with a detailed explanation: >>> > >> >>> > >> For example, this is the mechanism by which all KeyValueStore >>> > >> implementations get added to Streams: >>> > >> org.apache.kafka.streams.state.internals.KeyValueStoreBuilder#build >>> > >> return new MeteredKeyValueStore<>( >>> > >> maybeWrapCaching(maybeWrapLogging(storeSupplier.get())), >>> > >> storeSupplier.metricsScope(), >>> > >> time, >>> > >> keySerde, >>> > >> valueSerde >>> > >> ); >>> > >> >>> > >> In the DSL, the store that a processor gets from the context would >>> be >>> > >> the result of this composition. So even if the storeSupplier.get() >>> returns >>> > >> a store that implements the "reverse" interface, when you try to >>> use it >>> > >> from a processor like: >>> > >> org.apache.kafka.streams.kstream.ValueTransformerWithKey#init >>> > >> ReadOnlyBackwardWindowStore<K, V> store = >>> > >> (ReadOnlyBackwardWindowStore<K, V>) context.getStateStore(..) >>> > >> >>> > >> You'd just get a ClassCastException because it's actually a >>> > >> MeteredKeyValueStore, which doesn't implement >>> > >> ReadOnlyBackwardWindowStore. >>> > >> >>> > >> The only way to make this work would be to make the Metered, >>> > >> Caching, and Logging layers also implement the new interfaces, >>> > >> but this effectively forces implementations to also implement >>> > >> the interface. Otherwise, the intermediate layers would have to >>> > >> cast the store in each method, like this: >>> > >> MeteredWindowStore#backwardFetch { >>> > >> ((ReadOnlyBackwardWindowStore<K, V>) innerStore).backwardFetch(..) >>> > >> } >>> > >> >>> > >> And then if the implementation doesn't "opt in" by implementing >>> > >> the interface, you'd get a ClassCastException, not when you get the >>> > >> store, but when you try to use it. >>> > >> >>> > >> The fact that we get ClassCastExceptions no matter which way we >>> > >> turn here indicates that we're really not getting any benefit from >>> the >>> > >> type system, which makes the extra interfaces seem not worth all the >>> > >> code involved. >>> > >> >>> > >> Where we landed in KIP-614 is that, unless we want to completely >>> > >> revamp the way that StateStores work in the DSL, you might as >>> > >> well just add the new methods to the existing interfaces. To prevent >>> > >> compilation errors, we can add default implementations that throw >>> > >> UnsupportedOperationException. If a store doesn't opt in by >>> > >> implementing the methods, you'd get an >>> UnsupportedOperationException, >>> > >> which seems no worse, and maybe better, than the ClassCastException >>> > >> you'd get if we go with the "mixin interface" approach. >>> > >> >>> > >> A quick note: This entire discussion focuses on the DSL. If you're >>> just >>> > >> using the Processor API by directly adding the a custom store to the >>> > >> Topology: >>> > >> org.apache.kafka.streams.Topology#addStateStore >>> > >> and then retrieving it in the processor via: >>> > >> org.apache.kafka.streams.processor.ProcessorContext#getStateStore >>> > >> in >>> > >> org.apache.kafka.streams.processor.Processor#init >>> > >> >>> > >> Then, you can both register and retrieve _any_ StateStore >>> implementation. >>> > >> There's no need to use KeyValueStore or any other built-in >>> interface. >>> > >> In other words, KeyValueStore and company are only part of the DSL, >>> > >> not the PAPI. So, discussions about the build-in store interfaces >>> are only >>> > >> really relevant in the context of the DSL, Transformers, and >>> Materialized. >>> > >> >>> > >> So, in conclusion, I'd really recommend just adding any new methods >>> to >>> > >> the existing store interfaces. We might be able to revamp the API >>> in the >>> > >> future to support mixins, but it's a much larger scope of work than >>> this >>> > >> KIP. >>> > >> A more minor comment is that we don't need to add Deprecated >>> variants >>> > >> of new methods. >>> > >> >>> > >> Thanks again, and once again, I'm sorry I tuned out and didn't >>> offer this >>> > >> feedback before you revised the KIP. >>> > >> -John >>> > >> >>> > >> >>> > >> >>> > >> >>> > >> On Mon, Jun 22, 2020, at 06:11, Jorge Esteban Quilcate Otoya wrote: >>> > >> > Hi everyone, >>> > >> > >>> > >> > I've updated the KIP, applying Matthias' feedback regarding >>> interface >>> > >> > hierarchy. >>> > >> > >>> > >> > Also, following the last email, I think we can consider reverse >>> > >> operations >>> > >> > on KeyValue range as well, as implementation supports >>> lexicographic >>> > >> order. >>> > >> > >>> > >> > I considered different naming between Key-based ranges and >>> Time-based >>> > >> > ranges, and mitigate confusion when fetching keys and time ranges >>> as >>> > >> > WindowStore does: >>> > >> > >>> > >> > Key-based ranges: reverseRange(), reverseAll() >>> > >> > Time-based ranges: backwardFetch() >>> > >> > >>> > >> > Then, key-based changes apply to KeyValueStore, and time-based >>> changes >>> > >> to >>> > >> > Window and Session stores. >>> > >> > >>> > >> > Let me know if you have any questions. >>> > >> > >>> > >> > Thanks, >>> > >> > Jorge. >>> > >> > >>> > >> > >>> > >> > On Tue, Jun 16, 2020 at 12:47 AM Jorge Esteban Quilcate Otoya < >>> > >> > quilcate.jo...@gmail.com> wrote: >>> > >> > >>> > >> > > Hi everyone, sorry for the late reply. >>> > >> > > >>> > >> > > Thanks Matthias for your feedback. I think it makes sense to >>> > >> reconsider >>> > >> > > the current design based on your input. >>> > >> > > >>> > >> > > After digging deeper into the current implementation, I'd like >>> to >>> > >> bring my >>> > >> > > current understanding to be double-checked as it might be >>> redefining >>> > >> the >>> > >> > > KIP's scope: >>> > >> > > >>> > >> > > 1. There are 2 ranges been exposed by different stores: >>> > >> > > >>> > >> > > a. Key Range >>> > >> > > b. Timestamp Range >>> > >> > > >>> > >> > > So far, we have discussed covering both. >>> > >> > > >>> > >> > > 2. Key Range functions do not provide ordering guarantees by >>> design: >>> > >> > > >>> > >> > > ```ReadOnlyKeyValueStore.java >>> > >> > > /** >>> > >> > > * Get an iterator over a given range of keys. This >>> iterator must >>> > >> be >>> > >> > > closed after use. >>> > >> > > * The returned iterator must be safe from {@link >>> > >> > > java.util.ConcurrentModificationException}s >>> > >> > > * and must not return null values. No ordering guarantees >>> are >>> > >> > > provided. >>> > >> > > * ... >>> > >> > > */ >>> > >> > > KeyValueIterator<K, V> range(K from, K to); >>> > >> > > ``` >>> > >> > > >>> > >> > > Therefore, I'd propose removing Key range operations from the >>> scope. >>> > >> > > >>> > >> > > 3. Timestamp Range operations happen at the SegmentsStore level >>> > >> (internal) >>> > >> > > API >>> > >> > > >>> > >> > > AFAICT, Segments wrappers handle all Timestamp ranges queries. >>> > >> > > >>> > >> > > I'd propose extending `Segments#segments(timeFrom, timeTo, >>> backwards)` >>> > >> > > with a flag for backwards operations. >>> > >> > > >>> > >> > > As segments returned will be processed backwards, I'm not >>> extending >>> > >> > > KeyValueStores to query each segment backwards as previous >>> point 2. >>> > >> > > >>> > >> > > 4. Extend WindowStores implementations with a new >>> > >> > > WindowBackwardStore/ReadOnlyBackwardStore: >>> > >> > > >>> > >> > > ```java >>> > >> > > public interface ReadOnlyWindowBackwardStore<K, V> { >>> > >> > > WindowStoreIterator<V> backwardFetch(K key, Instant from, >>> Instant >>> > >> to) >>> > >> > > throws IllegalArgumentException; >>> > >> > > >>> > >> > > KeyValueIterator<Windowed<K>, V> backwardFetch(K from, K to, >>> > >> Instant >>> > >> > > fromTime, Instant toTime) >>> > >> > > throws IllegalArgumentException; >>> > >> > > >>> > >> > > KeyValueIterator<Windowed<K>, V> backwardFetchAll(Instant >>> from, >>> > >> > > Instant to) throws IllegalArgumentException; >>> > >> > > ``` >>> > >> > > >>> > >> > > 5. SessionStore is a bit different as it has fetch/find sessions >>> > >> spread >>> > >> > > between SessionStore and ReadOnlySessionStore. >>> > >> > > >>> > >> > > I'd propose a new interface `SessionBackwardStore` to expose >>> backward >>> > >> find >>> > >> > > operations: >>> > >> > > >>> > >> > > ```java >>> > >> > > public interface SessionBackwardStore<K, AGG> { >>> > >> > > KeyValueIterator<Windowed<K>, AGG> >>> backwardFindSessions(final K >>> > >> key, >>> > >> > > final long earliestSessionEndTime, final long >>> latestSessionStartTime); >>> > >> > > >>> > >> > > KeyValueIterator<Windowed<K>, AGG> >>> backwardFindSessions(final K >>> > >> > > keyFrom, final K keyTo, final long earliestSessionEndTime, >>> final long >>> > >> > > latestSessionStartTime); >>> > >> > > } >>> > >> > > ``` >>> > >> > > >>> > >> > > If this understanding is correct I'll proceed to update the KIP >>> based >>> > >> on >>> > >> > > this. >>> > >> > > >>> > >> > > Looking forward to your feedback. >>> > >> > > >>> > >> > > Thanks, >>> > >> > > Jorge. >>> > >> > > >>> > >> > > On Fri, May 29, 2020 at 3:32 AM Matthias J. Sax < >>> mj...@apache.org> >>> > >> wrote: >>> > >> > > >>> > >> > >> Hey, >>> > >> > >> >>> > >> > >> Sorry that I am late to the game. I am not 100% convinced >>> about the >>> > >> > >> current proposal. Using a new config as feature flag seems to >>> be >>> > >> rather >>> > >> > >> "nasty" to me, and flipping from/to is a little bit too fancy >>> for my >>> > >> > >> personal taste. >>> > >> > >> >>> > >> > >> I agree, that the original proposal using a "ReadDirection" >>> enum is >>> > >> not >>> > >> > >> ideal either. >>> > >> > >> >>> > >> > >> Thus, I would like to put out a new idea: We could add a new >>> > >> interface >>> > >> > >> that offers new methods that return revers iterators. >>> > >> > >> >>> > >> > >> The KIP already proposes to add `reverseAll()` and it seems >>> backward >>> > >> > >> incompatible to just add this method to >>> `ReadOnlyKeyValueStore` and >>> > >> > >> `ReadOnlyWindowStore`. I don't think we could provide a useful >>> > >> default >>> > >> > >> implementation for custom stores and thus either break >>> compatibility >>> > >> or >>> > >> > >> need add a default that just throws an exception. Neither >>> seems to >>> > >> be a >>> > >> > >> good option. >>> > >> > >> >>> > >> > >> Using a new interface avoid this issue and allows users >>> implementing >>> > >> > >> custom stores to opt-in by adding the interface to their >>> stores. >>> > >> > >> Furthermore, we don't need any config. In the end, we >>> encapsulte the >>> > >> > >> change into the store, and our runtime is agnostic to it (as it >>> > >> should >>> > >> > >> be). >>> > >> > >> >>> > >> > >> The hierarchy becomes a little complex (but uses would not >>> really see >>> > >> > >> the complexity): >>> > >> > >> >>> > >> > >> // exsiting >>> > >> > >> ReadOnlyKeyValueStore >>> > >> > >> KeyValueStore extend StateStore, ReadOnlyKeyValueStore >>> > >> > >> >>> > >> > >> >>> > >> > >> // helper interface; users don't care >>> > >> > >> // need similar ones for other stores >>> > >> > >> ReverseReadOnlyKeyValueStore { >>> > >> > >> KeyValueIterator<K, V> reverseRange(K from, K to); >>> > >> > >> KeyValueIterator<K, V> reverseAll(); >>> > >> > >> } >>> > >> > >> >>> > >> > >> >>> > >> > >> // two new user facing interfaces for kv-store >>> > >> > >> // need similar ones for other stores >>> > >> > >> ReadOnlyKeyValueStoreWithReverseIterators extends >>> > >> ReadOnlyKeyValueStore, >>> > >> > >> ReverseReadOnlyKeyValueStore >>> > >> > >> >>> > >> > >> KeyValueStoreWithReverseIterators extends KeyValueStore, >>> > >> > >> ReverseReadOnlyKeyValueStore >>> > >> > >> >>> > >> > >> >>> > >> > >> // updated (also internal) >>> > >> > >> // also need to update other built-in stores >>> > >> > >> RocksDB implements KeyValueStoreWithReverseIterators, >>> > >> BulkLoadingStore >>> > >> > >> >>> > >> > >> >>> > >> > >> In the end, user would only care about the two (for kv-case) >>> new >>> > >> > >> interface that offer revers iterator (read only and regular) >>> and can >>> > >> > >> cast stores accordingly in their Processors/Transformers or >>> via IQ. >>> > >> > >> >>> > >> > >> >>> > >> > >> Btw: if we add revers iterator for KeyValue and Window store, >>> should >>> > >> we >>> > >> > >> do the same for Session store? >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> This might be more code to write, but I believe it provides the >>> > >> better >>> > >> > >> user experience. Thoughts? >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> -Matthias >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> >>> > >> > >> On 5/26/20 8:47 PM, John Roesler wrote: >>> > >> > >> > Sorry for my silence, Jorge, >>> > >> > >> > >>> > >> > >> > I've just taken another look, and I'm personally happy with >>> the >>> > >> KIP. >>> > >> > >> > >>> > >> > >> > Thanks, >>> > >> > >> > -John >>> > >> > >> > >>> > >> > >> > On Tue, May 26, 2020, at 16:17, Jorge Esteban Quilcate Otoya >>> wrote: >>> > >> > >> >> If no additional comments, I will proceed to start the a >>> vote >>> > >> thread. >>> > >> > >> >> >>> > >> > >> >> Thanks a lot for your feedback! >>> > >> > >> >> >>> > >> > >> >> On Fri, May 22, 2020 at 9:25 AM Jorge Esteban Quilcate >>> Otoya < >>> > >> > >> >> quilcate.jo...@gmail.com> wrote: >>> > >> > >> >> >>> > >> > >> >>> Thanks Sophie. I like the `reverseAll()` idea. >>> > >> > >> >>> >>> > >> > >> >>> I updated the KIP with your feedback. >>> > >> > >> >>> >>> > >> > >> >>> >>> > >> > >> >>> >>> > >> > >> >>> On Fri, May 22, 2020 at 4:22 AM Sophie Blee-Goldman < >>> > >> > >> sop...@confluent.io> >>> > >> > >> >>> wrote: >>> > >> > >> >>> >>> > >> > >> >>>> Hm, the case of `all()` does seem to present a dilemma in >>> the >>> > >> case of >>> > >> > >> >>>> variable-length keys. >>> > >> > >> >>>> >>> > >> > >> >>>> In the case of fixed-length keys, you can just compute >>> the keys >>> > >> that >>> > >> > >> >>>> correspond >>> > >> > >> >>>> to the maximum and minimum serialized bytes, then perform >>> a >>> > >> `range()` >>> > >> > >> >>>> query >>> > >> > >> >>>> instead of an `all()`. If your keys don't have a >>> well-defined >>> > >> > >> ordering >>> > >> > >> >>>> such >>> > >> > >> >>>> that >>> > >> > >> >>>> you can't determine the MAX_KEY, then you probably don't >>> care >>> > >> about >>> > >> > >> the >>> > >> > >> >>>> iterator order anyway. >>> > >> > >> >>>> >>> > >> > >> >>>> But with variable-length keys, there is no MAX_KEY. If >>> all your >>> > >> > >> keys were >>> > >> > >> >>>> just >>> > >> > >> >>>> of the form 'a', 'aa', 'aaaaa', 'aaaaaaa' then in fact >>> the only >>> > >> way >>> > >> > >> to >>> > >> > >> >>>> figure out the >>> > >> > >> >>>> maximum key in the store is by using `all()` -- and >>> without a >>> > >> reverse >>> > >> > >> >>>> iterator, you're >>> > >> > >> >>>> doomed to iterate through every single key just to answer >>> that >>> > >> simple >>> > >> > >> >>>> question. >>> > >> > >> >>>> >>> > >> > >> >>>> That said, I still think determining the iterator order >>> based >>> > >> on the >>> > >> > >> >>>> to/from bytes >>> > >> > >> >>>> makes a lot of intuitive sense and gives the API a nice >>> > >> symmetry. >>> > >> > >> What if >>> > >> > >> >>>> we >>> > >> > >> >>>> solved the `all()` problem by just giving `all()` a >>> reverse >>> > >> form to >>> > >> > >> >>>> complement it? >>> > >> > >> >>>> Ie we would have `all()` and `reverseAll()`, or something >>> to >>> > >> that >>> > >> > >> effect. >>> > >> > >> >>>> >>> > >> > >> >>>> On Thu, May 21, 2020 at 3:41 PM Jorge Esteban Quilcate >>> Otoya < >>> > >> > >> >>>> quilcate.jo...@gmail.com> wrote: >>> > >> > >> >>>> >>> > >> > >> >>>>> Thanks John. >>> > >> > >> >>>>> >>> > >> > >> >>>>> Agree. I like the first approach as well, with >>> StreamsConfig >>> > >> flag >>> > >> > >> >>>> passing >>> > >> > >> >>>>> by via ProcessorContext. >>> > >> > >> >>>>> >>> > >> > >> >>>>> Another positive effect with "reverse parameters" is >>> that in >>> > >> the >>> > >> > >> case of >>> > >> > >> >>>>> `fetch(keyFrom, keyTo, timeFrom, timeTo)` users can >>> decide >>> > >> _which_ >>> > >> > >> pair >>> > >> > >> >>>> to >>> > >> > >> >>>>> flip, whether with `ReadDirection` enum it apply to both. >>> > >> > >> >>>>> >>> > >> > >> >>>>> The only issue I've found while reviewing the KIP is that >>> > >> `all()` >>> > >> > >> won't >>> > >> > >> >>>> fit >>> > >> > >> >>>>> within this approach. >>> > >> > >> >>>>> >>> > >> > >> >>>>> We could remove it from the KIP and argue that for >>> WindowStore, >>> > >> > >> >>>>> `fetchAll(0, Long.MAX_VALUE)` can be used to get all in >>> reverse >>> > >> > >> order, >>> > >> > >> >>>> and >>> > >> > >> >>>>> for KeyValueStore, no ordering guarantees are provided. >>> > >> > >> >>>>> >>> > >> > >> >>>>> If there is consensus with this changes, I will go and >>> update >>> > >> the >>> > >> > >> KIP. >>> > >> > >> >>>>> >>> > >> > >> >>>>> On Thu, May 21, 2020 at 3:33 PM John Roesler < >>> > >> vvcep...@apache.org> >>> > >> > >> >>>> wrote: >>> > >> > >> >>>>> >>> > >> > >> >>>>>> Hi Jorge, >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> Thanks for that idea. I agree, a feature flag would >>> protect >>> > >> anyone >>> > >> > >> >>>>>> who may be depending on the current behavior. >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> It seems better to locate the feature flag in the >>> > >> initialization >>> > >> > >> >>>> logic of >>> > >> > >> >>>>>> the store, rather than have a method on the "live" >>> store that >>> > >> > >> changes >>> > >> > >> >>>>>> its behavior on the fly. >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> It seems like there are two options here, one is to add >>> a new >>> > >> > >> config: >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> StreamsConfig.ENABLE_BACKWARDS_ITERATION = >>> > >> > >> >>>>>> "enable.backwards.iteration >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> Or we can add a feature flag in Materialized, like >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> Materialized.enableBackwardsIteration() >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> I think I'd personally lean toward the config, for the >>> > >> following >>> > >> > >> >>>> reason. >>> > >> > >> >>>>>> The concern that Sophie raised is that someone's >>> program may >>> > >> depend >>> > >> > >> >>>>>> on the existing contract of getting an empty iterator. >>> We >>> > >> don't >>> > >> > >> want >>> > >> > >> >>>> to >>> > >> > >> >>>>>> switch behavior when they aren't expecting it, so we >>> provide >>> > >> them a >>> > >> > >> >>>>>> config to assert that they _are_ expecting the new >>> behavior, >>> > >> which >>> > >> > >> >>>>>> means they take responsibility for updating their code >>> to >>> > >> expect >>> > >> > >> the >>> > >> > >> >>>> new >>> > >> > >> >>>>>> behavior. >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> There doesn't seem to be a reason to offer a choice of >>> > >> behaviors >>> > >> > >> on a >>> > >> > >> >>>>>> per-query, or per-store basis. We just want people to >>> be not >>> > >> > >> surprised >>> > >> > >> >>>>>> by this change in general. >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> What do you think? >>> > >> > >> >>>>>> Thanks, >>> > >> > >> >>>>>> -John >>> > >> > >> >>>>>> >>> > >> > >> >>>>>> On Wed, May 20, 2020, at 17:37, Jorge Quilcate wrote: >>> > >> > >> >>>>>>> Thank you both for the great feedback. >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> I like the "fancy" proposal :), and how it removes the >>> need >>> > >> for >>> > >> > >> >>>>>>> additional API methods. And with a feature flag on >>> > >> `StateStore`, >>> > >> > >> >>>>>>> disabled by default, should no break current users. >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> The only side-effect I can think of is that: by moving >>> the >>> > >> flag >>> > >> > >> >>>>> upwards, >>> > >> > >> >>>>>>> all later operations become affected; which might be >>> ok for >>> > >> most >>> > >> > >> >>>> (all?) >>> > >> > >> >>>>>>> cases. I can't think of an scenario where this would >>> be an >>> > >> issue, >>> > >> > >> >>>> just >>> > >> > >> >>>>>>> want to point this out. >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> If moving to this approach, I'd like to check if I got >>> this >>> > >> right >>> > >> > >> >>>>> before >>> > >> > >> >>>>>>> updating the KIP: >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> - only `StateStore` will change by having a new method: >>> > >> > >> >>>>>>> `backwardIteration()`, `false` by default to keep >>> things >>> > >> > >> compatible. >>> > >> > >> >>>>>>> - then all `*Stores` will have to update their >>> implementation >>> > >> > >> based >>> > >> > >> >>>> on >>> > >> > >> >>>>>>> this flag. >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> On 20/05/2020 21:02, Sophie Blee-Goldman wrote: >>> > >> > >> >>>>>>>>> There's no possibility that someone could be relying >>> > >> > >> >>>>>>>>> on iterating over that range in increasing order, >>> because >>> > >> that's >>> > >> > >> >>>> not >>> > >> > >> >>>>>> what >>> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on >>> getting >>> > >> an >>> > >> > >> >>>> empty >>> > >> > >> >>>>>>>> iterator >>> > >> > >> >>>>>>>> >>> > >> > >> >>>>>>>> I just meant that they might be relying on the >>> assumption >>> > >> that >>> > >> > >> the >>> > >> > >> >>>>>> range >>> > >> > >> >>>>>>>> query >>> > >> > >> >>>>>>>> will never return results with decreasing keys. The >>> empty >>> > >> > >> iterator >>> > >> > >> >>>>>> wouldn't >>> > >> > >> >>>>>>>> break that contract, but of course a surprise reverse >>> > >> iterator >>> > >> > >> >>>> would. >>> > >> > >> >>>>>>>> >>> > >> > >> >>>>>>>> FWIW I actually am in favor of automatically >>> converting to a >>> > >> > >> >>>> reverse >>> > >> > >> >>>>>>>> iterator, >>> > >> > >> >>>>>>>> I just thought we should consider whether this should >>> be >>> > >> off by >>> > >> > >> >>>>>> default or >>> > >> > >> >>>>>>>> even possible to disable at all. >>> > >> > >> >>>>>>>> >>> > >> > >> >>>>>>>> On Tue, May 19, 2020 at 7:42 PM John Roesler < >>> > >> > >> vvcep...@apache.org >>> > >> > >> >>>>> >>> > >> > >> >>>>>> wrote: >>> > >> > >> >>>>>>>> >>> > >> > >> >>>>>>>>> Thanks for the response, Sophie, >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> I wholeheartedly agree we should take as much into >>> account >>> > >> as >>> > >> > >> >>>>> possible >>> > >> > >> >>>>>>>>> up front, rather than regretting our decisions >>> later. I >>> > >> actually >>> > >> > >> >>>> do >>> > >> > >> >>>>>> share >>> > >> > >> >>>>>>>>> your vague sense of worry, which was what led me to >>> say >>> > >> > >> initially >>> > >> > >> >>>>>> that I >>> > >> > >> >>>>>>>>> thought my counterproposal might be "too fancy". >>> > >> Sometimes, it's >>> > >> > >> >>>>>> better >>> > >> > >> >>>>>>>>> to be explicit instead of "elegant", if we think more >>> > >> people >>> > >> > >> >>>> will be >>> > >> > >> >>>>>>>>> confused >>> > >> > >> >>>>>>>>> than not. >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> I really don't think that there's any danger of >>> "relying >>> > >> on a >>> > >> > >> >>>> bug" >>> > >> > >> >>>>>> here, >>> > >> > >> >>>>>>>>> although >>> > >> > >> >>>>>>>>> people certainly could be relying on current >>> behavior. One >>> > >> thing >>> > >> > >> >>>> to >>> > >> > >> >>>>> be >>> > >> > >> >>>>>>>>> clear >>> > >> > >> >>>>>>>>> about (which I just left a more detailed comment in >>> > >> KAFKA-8159 >>> > >> > >> >>>>> about) >>> > >> > >> >>>>>> is >>> > >> > >> >>>>>>>>> that >>> > >> > >> >>>>>>>>> when we say something like key1 > key2, this >>> ordering is >>> > >> defined >>> > >> > >> >>>> by >>> > >> > >> >>>>>> the >>> > >> > >> >>>>>>>>> serde's output and nothing else. >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> Currently, thanks to your fix in >>> > >> > >> >>>>>> https://github.com/apache/kafka/pull/6521 >>> > >> > >> >>>>>>>>> , >>> > >> > >> >>>>>>>>> the store contract is that for range scans, if from >>> > to, >>> > >> then >>> > >> > >> >>>> the >>> > >> > >> >>>>>> store >>> > >> > >> >>>>>>>>> must >>> > >> > >> >>>>>>>>> return an empty iterator. There's no possibility that >>> > >> someone >>> > >> > >> >>>> could >>> > >> > >> >>>>> be >>> > >> > >> >>>>>>>>> relying >>> > >> > >> >>>>>>>>> on iterating over that range in increasing order, >>> because >>> > >> that's >>> > >> > >> >>>> not >>> > >> > >> >>>>>> what >>> > >> > >> >>>>>>>>> happens. However, they could indeed be relying on >>> getting >>> > >> an >>> > >> > >> >>>> empty >>> > >> > >> >>>>>>>>> iterator. >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> My counterproposal was to actually change this >>> contract to >>> > >> say >>> > >> > >> >>>> that >>> > >> > >> >>>>>> the >>> > >> > >> >>>>>>>>> store >>> > >> > >> >>>>>>>>> must return an iterator over the keys in that range, >>> but >>> > >> in the >>> > >> > >> >>>>>> reverse >>> > >> > >> >>>>>>>>> order. >>> > >> > >> >>>>>>>>> So, in addition to considering whether this idea is >>> "too >>> > >> fancy" >>> > >> > >> >>>> (aka >>> > >> > >> >>>>>>>>> confusing), >>> > >> > >> >>>>>>>>> we should also consider the likelihood of breaking an >>> > >> existing >>> > >> > >> >>>>>> program with >>> > >> > >> >>>>>>>>> this behavior/contract change. >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> To echo your clarification, I'm also not advocating >>> > >> strongly in >>> > >> > >> >>>>> favor >>> > >> > >> >>>>>> of my >>> > >> > >> >>>>>>>>> proposal. I just wanted to present it for >>> consideration >>> > >> > >> alongside >>> > >> > >> >>>>>> Jorge's >>> > >> > >> >>>>>>>>> original one. >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> Thanks for raising these very good points, >>> > >> > >> >>>>>>>>> -John >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>>>>> On Tue, May 19, 2020, at 20:49, Sophie Blee-Goldman >>> wrote: >>> > >> > >> >>>>>>>>>>> Rather than working around it, I think we should >>> just >>> > >> fix it >>> > >> > >> >>>>>>>>>> Now *that's* a "fancy" idea :P >>> > >> > >> >>>>>>>>>> >>> > >> > >> >>>>>>>>>> That was my primary concern, although I do have a >>> vague >>> > >> sense >>> > >> > >> of >>> > >> > >> >>>>>> worry >>> > >> > >> >>>>>>>>>> that we might be allowing users to get into trouble >>> > >> without >>> > >> > >> >>>>>> realizing it. >>> > >> > >> >>>>>>>>>> For example if their custom serdes suffer a similar >>> bug >>> > >> as the >>> > >> > >> >>>>> above, >>> > >> > >> >>>>>>>>>> and/or >>> > >> > >> >>>>>>>>>> they rely on getting results in increasing order >>> (of the >>> > >> keys) >>> > >> > >> >>>> even >>> > >> > >> >>>>>> when >>> > >> > >> >>>>>>>>>> to < from. Maybe they're relying on the fact that >>> the >>> > >> range >>> > >> > >> >>>> query >>> > >> > >> >>>>>> returns >>> > >> > >> >>>>>>>>>> nothing in that case. >>> > >> > >> >>>>>>>>>> >>> > >> > >> >>>>>>>>>> Not sure if that qualifies as relying on a bug or >>> not, >>> > >> but in >>> > >> > >> >>>> that >>> > >> > >> >>>>>> past >>> > >> > >> >>>>>>>>>> we've >>> > >> > >> >>>>>>>>>> taken the stance that we should not break >>> compatibility >>> > >> even if >>> > >> > >> >>>> the >>> > >> > >> >>>>>> user >>> > >> > >> >>>>>>>>>> was relying on bugs or unintentional behavior. >>> > >> > >> >>>>>>>>>> >>> > >> > >> >>>>>>>>>> Just to clarify I'm not advocating strongly against >>> this >>> > >> > >> >>>> proposal, >>> > >> > >> >>>>>> just >>> > >> > >> >>>>>>>>>> laying >>> > >> > >> >>>>>>>>>> out some considerations we should take into >>> account. At >>> > >> the end >>> > >> > >> >>>> of >>> > >> > >> >>>>>> the >>> > >> > >> >>>>>>>>> day >>> > >> > >> >>>>>>>>>> we should do what's right rather than maintain >>> > >> compatibility >>> > >> > >> >>>> with >>> > >> > >> >>>>>>>>> existing >>> > >> > >> >>>>>>>>>> bugs, but sometimes there's a reasonable middle >>> ground. >>> > >> > >> >>>>>>>>>> >>> > >> > >> >>>>>>>>>> On Tue, May 19, 2020 at 6:15 PM John Roesler < >>> > >> > >> >>>> vvcep...@apache.org> >>> > >> > >> >>>>>>>>> wrote: >>> > >> > >> >>>>>>>>>>> Thanks Sophie, >>> > >> > >> >>>>>>>>>>> >>> > >> > >> >>>>>>>>>>> Woah, that’s a nasty bug. Rather than working >>> around it, >>> > >> I >>> > >> > >> >>>> think >>> > >> > >> >>>>> we >>> > >> > >> >>>>>>>>> should >>> > >> > >> >>>>>>>>>>> just fix it. I’ll leave some comments on the Jira. >>> > >> > >> >>>>>>>>>>> >>> > >> > >> >>>>>>>>>>> It doesn’t seem like it should be this KIP’s >>> concern >>> > >> that some >>> > >> > >> >>>>>> serdes >>> > >> > >> >>>>>>>>>>> might be incorrectly written. >>> > >> > >> >>>>>>>>>>> >>> > >> > >> >>>>>>>>>>> Were there other practical concerns that you had >>> in mind? >>> > >> > >> >>>>>>>>>>> >>> > >> > >> >>>>>>>>>>> Thanks, >>> > >> > >> >>>>>>>>>>> John >>> > >> > >> >>>>>>>>>>> >>> > >> > >> >>>>>>>>>>> On Tue, May 19, 2020, at 19:10, Sophie Blee-Goldman >>> > >> wrote: >>> > >> > >> >>>>>>>>>>>> I like this "fancy idea" to just flip the to/from >>> bytes >>> > >> but I >>> > >> > >> >>>>> think >>> > >> > >> >>>>>>>>> there >>> > >> > >> >>>>>>>>>>>> are some practical limitations to implementing >>> this. In >>> > >> > >> >>>>> particular >>> > >> > >> >>>>>>>>>>>> I'm thinking about this issue >>> > >> > >> >>>>>>>>>>>> <https://issues.apache.org/jira/browse/KAFKA-8159 >>> > >>> > >> with the >>> > >> > >> >>>>>> built-in >>> > >> > >> >>>>>>>>>>> signed >>> > >> > >> >>>>>>>>>>>> number serdes. >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> This trick would actually fix the problem for >>> > >> > >> >>>> negative-negative >>> > >> > >> >>>>>>>>> queries >>> > >> > >> >>>>>>>>>>>> (ie where to & from are negative) but would cause >>> > >> > >> undetectable >>> > >> > >> >>>>>>>>>>>> incorrect results for negative-positive queries. >>> For >>> > >> example, >>> > >> > >> >>>> say >>> > >> > >> >>>>>> you >>> > >> > >> >>>>>>>>>>>> call #range with from = -1 and to = 1, using the >>> Short >>> > >> > >> serdes. >>> > >> > >> >>>>> The >>> > >> > >> >>>>>>>>>>>> serialized bytes for that are >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> from = 1111111111111111 >>> > >> > >> >>>>>>>>>>>> to = 0000000000000001 >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> so we would end up flipping those and iterating >>> over >>> > >> all keys >>> > >> > >> >>>>> from >>> > >> > >> >>>>>>>>>>>> 0000000000000001 to 1111111111111111. Iterating in >>> > >> > >> >>>>> lexicographical >>> > >> > >> >>>>>>>>>>>> order means we would iterate over every key in >>> the space >>> > >> > >> >>>> *except* >>> > >> > >> >>>>>> for >>> > >> > >> >>>>>>>>>>>> 0, but 0 is actually the *only* other key we >>> meant to be >>> > >> > >> >>>> included >>> > >> > >> >>>>>> in >>> > >> > >> >>>>>>>>> the >>> > >> > >> >>>>>>>>>>>> range query. >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> Currently we just log a warning and return an >>> empty >>> > >> iterator >>> > >> > >> >>>> when >>> > >> > >> >>>>>>>>>>>> to < from, which is obviously also incorrect but >>> feels >>> > >> > >> >>>> slightly >>> > >> > >> >>>>>> more >>> > >> > >> >>>>>>>>>>>> palatable. If we start automatically converting to >>> > >> reverse >>> > >> > >> >>>>> queries >>> > >> > >> >>>>>> we >>> > >> > >> >>>>>>>>>>>> can't even log a warning in this case unless we >>> wanted >>> > >> to log >>> > >> > >> >>>> a >>> > >> > >> >>>>>>>>> warning >>> > >> > >> >>>>>>>>>>>> every time, which would be weird to do for a valid >>> > >> usage of a >>> > >> > >> >>>> new >>> > >> > >> >>>>>>>>>>>> feature. >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> All that said, I still like the idea overall. Off >>> the >>> > >> top of >>> > >> > >> >>>> my >>> > >> > >> >>>>>> head >>> > >> > >> >>>>>>>>> I >>> > >> > >> >>>>>>>>>>> guess >>> > >> > >> >>>>>>>>>>>> we could add a store config to enable/disable >>> automatic >>> > >> > >> >>>> reverse >>> > >> > >> >>>>>>>>>>> iteration, >>> > >> > >> >>>>>>>>>>>> which is off by default? >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> Thanks for the KIP! This will be a nice addition >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> Sophie >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>> On Tue, May 19, 2020 at 3:21 PM John Roesler < >>> > >> > >> >>>>> vvcep...@apache.org> >>> > >> > >> >>>>>>>>>>> wrote: >>> > >> > >> >>>>>>>>>>>>> Hi there Jorge, >>> > >> > >> >>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>> Thanks for the KIP! >>> > >> > >> >>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>> I think this feature sounds very reasonable. >>> > >> > >> >>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>> I'm not 100% sure if this is "too fancy", but >>> what do >>> > >> you >>> > >> > >> >>>> think >>> > >> > >> >>>>>>>>>>>>> about avoiding the enum by instead allowing >>> people to >>> > >> flip >>> > >> > >> >>>>>>>>>>>>> the "from" and "to" endpoints? I.e., reading >>> from "A" >>> > >> to "Z" >>> > >> > >> >>>>> would >>> > >> > >> >>>>>>>>>>>>> be a forward scan, and from "Z" to "A" would be a >>> > >> backward >>> > >> > >> >>>> one? >>> > >> > >> >>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>> Thanks, >>> > >> > >> >>>>>>>>>>>>> -John >>> > >> > >> >>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>> On Tue, May 19, 2020, at 16:20, Jorge Quilcate >>> wrote: >>> > >> > >> >>>>>>>>>>>>>> Hi everyone, >>> > >> > >> >>>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>>> I would like to start the discussion for >>> KIP-617: >>> > >> > >> >>>>>>>>>>>>>> >>> > >> > >> >>>>>>>>> >>> > >> > >> >>>>>> >>> > >> > >> >>>>> >>> > >> > >> >>>> >>> > >> > >> >>> > >> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-617%3A+Allow+Kafka+Streams+State+Stores+to+be+iterated+backwards >>> > >> > >> >>>>>>>>>>>>>> Looking forward to your feedback. >>> > >> > >> >>>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>>> Thanks! >>> > >> > >> >>>>>>>>>>>>>> Jorge. >>> > >> > >> >>>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>>> >>> > >> > >> >>>>>>>>>>>>>> Attachments: >>> > >> > >> >>>>>>>>>>>>>> * 0x5F2C6E22064982DF.asc >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> >>> > >> > >> >>>>>>> Attachments: >>> > >> > >> >>>>>>> * 0x5F2C6E22064982DF.asc >>> > >> > >> >>>>>> >>> > >> > >> >>>>> >>> > >> > >> >>>> >>> > >> > >> >>> >>> > >> > >> >> >>> > >> > >> >>> > >> > >> >>> > >> > >>> > >> >>> > > >>> > >>> >>