Thank you, Bruno and Matthias, for keeping the discussion going and for reviewing the PR.
Here are the KIP updates: - I removed the `peek()` from the `ValueIterator` interface since we do not need it. - Yes, Bruno, the `validTo` field in the `VersionedRecod` class is exclusive. I updated the javadocs for that. Very important critical open questions. I list them here based on priority (descendingly). - I implemented the `get(key, fromtime, totime)` method here <https://github.com/aliehsaeedii/kafka/blob/9578b7cb7cdade22cc63f671693f5aeb993937ca/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBVersionedStore.java#L262>: the problem is that this implementation does not guarantee consistency because processing might continue interleaved (no snapshot semantics is implemented). More over, it materializes all results in memory. - Solution 1: Use a lock and release it after retrieving all desired records from all segments. - positive point: snapshot semantics is implemented - negative points: 1) It is expensive since iterating over all segments may take a long time. 2) It still requires materializing results on memory - Solution 2: use `RocksDbIterator`. - positive points: 1) It guarantees snapshot segments. 2) It does not require materializing results in memory. - negative points: it is expensive because, anyway, we need to iterate over all (many) segments. Do you have any thoughts on this issue? (ref: Matthias's comment <https://github.com/apache/kafka/pull/14626#pullrequestreview-1709280589>) - I added the field `validTo` in `VersionedRecord`. Its default value is MAX. But as Matthias mentioned, for the single-key single-ts (`VersionedKeyQuery` in KIP-960), it may not always be true. If the returned record belongs to an old segment, maybe it is not valid any more. So MAX is not the correct value for `ValidTo`. Two solutions come to mind: - Solution 1: make the `ValidTo` as an `Optional` and set it to `empty` for the retuned result of `get(key, asOfTimestamp)`. - Solution 2: change the implementation of `get(key, asOfTimestamp)` so that it finds the correct `validTo` for the returned versionedRecord. - In this KIP and the next one, even though the default ordering is with ascending ts, I added the method `withAscendingTimestamps()` to have more user readibility (as Bruno suggested), while Hanyu did only add `withDescending...` methods (he did not need ascending because that's the default anyway). Matthias believes that we should not have inconsistencies (he actually hates them:D). Shall I change my KIP or Hanyu? Thoughts? That would be maybe helpful to look into the PR <https://github.com/apache/kafka/pull/14626> for more clarity and even review that ;-) Cheers, Alieh On Thu, Nov 2, 2023 at 7:13 PM Bruno Cadonna <cado...@apache.org> wrote: > Hi Alieh, > > First of all, I like the examples. > > Is validTo in VersionedRecord exclusive or inclusive? > In the javadocs you write: > > "@param validTo the latest timestamp that value is valid" > > I think that is not true because the validity is defined by the start > time of the new version. The new and the old version cannot both be > valid at that same time. > > Theoretically, you could set validTo to the start time of the new > version - 1. However, what is the unit of the 1? Is it nanoseconds? > Milliseconds? Seconds? Sure we could agree on one, but I think it is > more elegant to just make the validTo exclusive. Actually, you used it > as exclusive in your examples. > > > Thanks for the KIP! > > Best, > Bruno > > On 11/1/23 9:01 PM, Alieh Saeedi wrote: > > Hi all, > > @Matthias: I think Victoria was right. I must add the method `get(key, > > fromTime, toTime)` to the interface `VersionedKeyValueStore`. Right now, > > having the method only in `RocksDBVersionedStore`, made me to have an > > instance of `RocksDBVersionedStore` (instead of `VersionedKeyValueStore`) > > in `StoreQueryUtils.runMultiVersionedKeyQuery()` method. In future, we > are > > going to use the same method for In-memory/SPDB/blaBla versioned stores. > > Then either this method won't work any more, or we have to add code (if > > clauses) for each type of versioned stores. What do you think about that? > > > > Bests, > > Alieh > > > > On Tue, Oct 24, 2023 at 10:01 PM Alieh Saeedi <asae...@confluent.io> > wrote: > > > >> Thank you, Matthias, Bruno, and Guozhang for keeping the discussion > going. > >> > >> Here is the list of changes I made: > >> > >> 1. I enriched the "Example" section as Bruno suggested. Do you > please > >> have a look at that section? I think I devised an interesting one > ;-) > >> 2. As Matthias and Guozhang suggested, I renamed variables and > methods > >> as follows: > >> - "fromTimestamp" -> "fromTime" > >> - "asOfTimestamp" -> "toTime" > >> - "from(Instant fromTime)" -> "fromTime(Instant fromTime)" > >> - "asOf(Instant toTime)" -> "toTime(Instant toTime)" > >> 3. > >> > >> About throwing an NPE when time bounds are `null`: Ok, Matthias, I > am > >> convinced by your idea. I consider a null timestamp as "no bound". > But I > >> had to change KIP-960 and the corresponding PR to be consistent as > well. > >> > >> Answering questions and some more discussion points: > >> > >> 1. > >> > >> For now, I keep the class names as they are. > >> 2. > >> > >> About the new field "validTo" in VersionedRecord. Yes Matthias I > keep > >> the old constructor, which does not have `validTo` as an input > parameter. > >> But in the body of the old constructor, I consider the default > value MAX > >> for the validTo field. I think MAX must be the default value for > `validTo` > >> since before inserting a tombstone or a new value for the same key, > the > >> value must be valid till iternity. > >> 3. > >> > >> Regarding changing the ValueIterator interface from `public > interface > >> ValueIterator<V> extends Iterator<V>` to `public interface > >> ValueIterator<V> extends Iterator<VersionedRecord<V>>`: Matthias, I > do not > >> know how it improves type safety because the MultiVersionedKeyQuery > class > >> returns a ValueIterator of VersionedRecord any way. But if we want > to be > >> consistent with KeyValueIterator, we must apply the changes you > suggested. > >> 4. > >> > >> Regarding adding the new `get(key, fromTime, toTime)` method to the > >> public interface `VersionedKeyValueStore` or only adding it to the > >> class `RocksDBVersionedStore`: In the KIP, I changed the interface > as > >> Victoria suggested. But still, I am not convinced why we do that. > @Victoria: > >> Do you please clarify why we have to define it in the interface? > More > >> specifically, why do we need to use generic VersionedKeyValueStore > >> instead of simply using the implementing classes? > >> > >> Cheers, > >> Alieh > >> > >> On Sat, Oct 14, 2023 at 3:35 AM Guozhang Wang < > guozhang.wang...@gmail.com> > >> wrote: > >> > >>> Thanks Alieh for the KIP, as well as a nice summary of all the > >>> discussions! Just my 2c regarding your open questions: > >>> > >>> 1. I just checked KIP-889 > >>> ( > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores > >>> ) > >>> and we used "VersionedRecord<V> get(K key, long asOfTimestamp);", so I > >>> feel to be consistent with this API is better compared with being > >>> consistent with "WindowKeyQuery"? > >>> > >>> 3. I agree with Matthias that naming is always tricky, and I also tend > >>> to be consistent with what we already have (even if in retro it may > >>> not be the best idea :P and if that was really becoming a complaint, > >>> we would change it across the board in one shot as well later). > >>> > >>> Guozhang > >>> > >>> On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax <mj...@apache.org> > wrote: > >>>> > >>>> Thanks for the update! > >>>> > >>>> > >>>> > >>>> Some thoughts about changes you made and open questions you raised: > >>>> > >>>> > >>>> 10) About asOf vs until: I was just looking into `WindowKeyQuery`, > >>>> `WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. For > those, > >>>> we use "timeFrom" and "timeTo", so it seems best to actually use > >>>> `to(Instant toTime)` to keep the naming consistent across the board? > >>>> > >>>> If yes, we should also do `from (Instant fromTime)` and use getters > >>>> `fromTime()` and `toTime()` -- given that it's range bounds it seems > >>>> acceptable to me, to diverge a little bit from KIP-960 > `asOfTimestamp()` > >>>> -- but we could also rename it to `asOfTime()`? -- Given that we > >>>> strongly type with `Instant` I am not worried about semantic > ambiguity. > >>>> > >>>> > >>>> > >>>> 20) About throwing a NPE when time bounds are `null` -- why? (For the > >>>> key it makes sense as is mandatory to have a key.) Could we not > >>>> interpret `null` as "no bound". We did KIP-941 to add `null` for > >>>> open-ended `RangeQueries`, so I am wondering if we should just stick > to > >>>> the same semantics? > >>>> > >>>> Cf > >>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-941%3A+Range+queries+to+accept+null+lower+and+upper+bounds > >>>> > >>>> > >>>> > >>>> 30) About the class naming. That's always tricky, and I am not married > >>>> to my proposal. I agree with Bruno that the other suggested names are > >>>> not really better. -- The underlying idea was, to get some > "consistent" > >>>> naming across the board. > >>>> > >>>> Existing `KeyQuery` > >>>> New `VersionedKeyQuery` (KIP-960; we add a prefix) > >>>> New `MultiVersionKeyQuery` (this KIP; extend the prefix with a > >>> pre-prefix) > >>>> > >>>> Existing `RangeQuery` > >>>> New `MultiVersionRangeQuery` (KIP-969; add same prefix as above) > >>>> > >>>> > >>>> > >>>> 40) I am fine with not adding `range(from, to)` -- it was just an > idea. > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> Some more follow up question: > >>>> > >>>> 50) You propose to add a new constructor and getter to > `VersionedRecord` > >>>> -- I am wondering if this implies that `validTo` is optional because > the > >>>> existing constructor is not deprecated? -- Also, what happens if > >>>> `validTo` is not set and `valueTo()` is called? Or do we intent to > make > >>>> `validTo` mandatory? > >>>> > >>>> Maybe this question can only be answered when working on the code, > but I > >>>> am wondering if we should make `validTo` mandatory or not... And what > >>>> the "blast radius" of changing `VersionedRecord` will be in general. > Do > >>>> you have already some POC PR that we could look at to get some signals > >>>> about this? > >>>> > >>>> > >>>> > >>>> 60) The new query class is defined to return > >>>> `ValueIterator<VersionedRecord<V>>` -- while I like the idea to add > >>>> `ValueIterator<V>` in a generic way on the one hand, I am wondering if > >>>> it might be better to change it, and enforce its usage (ie, return > type) > >>>> of `VersionedRecord` to improve type safety (type erasure is often a > >>>> pain, and we could mitigate it this way). > >>>> > >>>> Btw: We actually do a similar thing for `KeyValueIterator`. > >>>> > >>>> Ie, > >>>> > >>>> public interface ValueIterator<V> extends > Iterator<ValueAndTimestamp<V>> > >>>> > >>>> and > >>>> > >>>> ValueAndTimestamp<V> peek(); > >>>> > >>>> This would imply that the return type of the new query is > >>>> `ValueIterator<V>` on the interface what seems simpler and more > elegant? > >>>> > >>>> If we go with the change, I am also wondering if we need to find a > >>>> better name for the new iterator class? Maybe `VersionIterator` or > >>>> something like this? > >>>> > >>>> Of course it might limit the use of `ValueIterator` for other value > >>>> types -- not sure if this a limitation that is prohibitive? My gut > >>>> feeling is, that is should not be too limiting. > >>>> > >>>> > >>>> > >>>> > >>>> 70) Do we really need the change in `VersionedKeyValueStore` and add a > >>>> new method? In the end, the idea of IQv2 is to avoid exactly this... > It > >>>> was the main issue for IQv1, that the base interface of the store > needed > >>>> an update and thus all classed implementing the base interface, making > >>>> it very cumbersome to add new query types. -- Of course, we need this > >>>> new method on the actually implementation (as private method) that can > >>>> be called from `query()` method, but adding it to the interface seems > to > >>>> defeat the purpose of IQv2. > >>>> > >>>> Note, for existing IQv2 queries types that go against others stores, > the > >>>> public methods already existed when IQv2 was introduces, and thus the > >>>> implementation of these query types just pragmatically re-used > existing > >>>> methods -- but it does not imply that new public method should be > added. > >>>> > >>>> > >>>> > >>>> > >>>> -Matthias > >>>> > >>>> > >>>> On 10/11/23 5:11 AM, Bruno Cadonna wrote: > >>>>> Thanks for the updates, Alieh! > >>>>> > >>>>> The example in the KIP uses the allVersions() method which we agreed > >>> to > >>>>> remove. > >>>>> > >>>>> Regarding your questions: > >>>>> 1. asOf vs. until: I am fine with both but slightly prefer until. > >>>>> 2. What about KeyMultiVersionsQuery, KeyVersionsQuery (KIP-960 would > >>>>> then be KeyVersionQuery). However, I am also fine with > >>>>> MultiVersionedKeyQuery since none of the names sounds better or worse > >>> to > >>>>> me. > >>>>> 3. I agree with you not to introduce the method with the two bounds > to > >>>>> keep things simple. > >>>>> 4. Forget about fromTime() an asOfTime(), from() and asOf() is fine. > >>>>> 5. The main purpose is to show how to use the API. Maybe make an > >>> example > >>>>> with just the key to distinguish this query from the single value > >>> query > >>>>> of KIP-960 and then one with a key and a time range. When you iterate > >>>>> over the results you could also call validTo(). Maybe add some actual > >>>>> records in the comments to show what the result might look like. > >>>>> > >>>>> Regarding the test plan, I hope you also plan to add unit tests in > all > >>>>> of your KIPs. Maybe you could also explain why system tests are not > >>>>> needed here. > >>>>> > >>>>> Best, > >>>>> Bruno > >>>>> > >>>>> On 10/10/23 5:36 PM, Alieh Saeedi wrote: > >>>>>> Thank you all for the very exact and constructive comments. I really > >>>>>> enjoyed reading your ideas and all the important points you made me > >>> aware > >>>>>> of. I updated KIP-968 as follows: > >>>>>> > >>>>>> 1. If the key or time bounds are null, the method returns NPE. > >>>>>> 2. The "valid" word: I removed the sentence "all the records > >>> that are > >>>>>> valid..." and replaced it with an exact explanation. More > over, I > >>>>>> explained > >>>>>> it with an example in the KIP but not in the javadocs. Do I > need > >>>>>> to add the > >>>>>> example to the javadocs as well? > >>>>>> 3. Since I followed Bruno's suggestion and removed the > >>> allVersions() > >>>>>> method, the problem of meaningless combinations is solved, and > I > >>>>>> do not > >>>>>> need any IllegalArgumentException or something like that. > >>>>>> Therefore, the > >>>>>> change is that if no time bound is specified, the query returns > >>>>>> the records > >>>>>> with the specified key for all timestamps (all versions). > >>>>>> 4. As Victoria suggested, adding a method to the > >>>>>> *VersionedKeyValueStore > >>>>>> *interface is essential. So I did that. I had this method only > >>> in the > >>>>>> RocksDBVersionedStore class, which was not enough. > >>>>>> 5. I added the *validTo* field to the VersionedRecord class to > be > >>>>>> able > >>>>>> to represent the tombstones. As you suggested, we postpone > >>> solving > >>>>>> the > >>>>>> problem of retrieving consecutive tombstones for later. > >>>>>> 6. I added the "Test Plan" section to all KIPs. I hope what I > >>>>>> wrote is > >>>>>> convincing. > >>>>>> 7. I added the *withAscendingTimestamp()* method to provide > more > >>>>>> code readability > >>>>>> for the user. > >>>>>> 8. I removed the evil word "get" from all getter methods. > >>>>>> > >>>>>> There have also been some more suggestions which I am still not > >>> convinced > >>>>>> or clear about them: > >>>>>> > >>>>>> 1. Regarding asOf vs until: reading all comments, my conclusion > >>>>>> was that > >>>>>> I keep it as "asOf" (following Walker's idea as the native > >>> speaker > >>>>>> as well > >>>>>> as Bruno's suggestion to be consistent with > single-key_single_ts > >>>>>> queries). > >>>>>> But I do not have a personal preference. If you insist on > >>> "until", > >>>>>> I change > >>>>>> it. > >>>>>> 2. Bruno suggested renaming the class "MultiVersionedKeyQuery" > >>> to sth > >>>>>> else. We already had a long discussion about the name with > >>>>>> Matthias. I am > >>>>>> open to renaming it to something else, but do you have any > ideas? > >>>>>> 3. Matthias suggested having a method with two input parameters > >>> that > >>>>>> enables the user to specify both time bounds in the same > method. > >>>>>> Isn't it > >>>>>> introducing redundancy? It is somehow disrespectful to the idea > >>> of > >>>>>> having > >>>>>> composable methods. > >>>>>> 4. Bruno suggested renaming the methods "asOf" and "from" to > >>>>>> "asOfTime" > >>>>>> and "fromTime". If I do that, then it is not consistent with > >>> KIP-960. > >>>>>> Moreover, the input parameter is clearly a timestamp, which > >>> explains > >>>>>> enough. What do you think about that? > >>>>>> 5. I was asked to add more examples to the example section. My > >>>>>> question > >>>>>> is, what is the main purpose of that? If I know it clearly, > then > >>> I > >>>>>> can add > >>>>>> what you mean. > >>>>>> > >>>>>> > >>>>>> > >>>>>> Cheers, > >>>>>> Alieh > >>>>>> > >>>>>> On Tue, Oct 10, 2023 at 1:13 AM Matthias J. Sax <mj...@apache.org> > >>> wrote: > >>>>>> > >>>>>>> Bruno and I had some background conversation about the `get` prefix > >>>>>>> question including a few other committers. > >>>>>>> > >>>>>>> The official policy was never changed, and we should not add the > >>>>>>> `get`-prefix. It's a slip on our side in previous KIPs to add the > >>>>>>> `get`-prefix and we should actually clean it up doing a follow up > >>> KIP. > >>>>>>> > >>>>>>> > >>>>>>> -Matthias > >>>>>>> > >>>>>>> > >>>>>>> On 10/5/23 5:26 AM, Bruno Cadonna wrote: > >>>>>>>> Hi Matthias, > >>>>>>>> > >>>>>>>> Given all the IQv2 KIPs that use getX and given recent PRs > >>> (internal > >>>>>>>> interfaces mainly) that got merged, I was under the impression > >>> that we > >>>>>>>> moved away from the strict no-getX policy. > >>>>>>>> > >>>>>>>> I do not think it was an accident using getX in the IQv2 KIPs > since > >>>>>>>> somebody would have brought it up, otherwise. > >>>>>>>> > >>>>>>>> I am fine with both types of getters. > >>>>>>>> > >>>>>>>> If we think, we need to discuss this in a broader context, let's > >>>>>>>> start a > >>>>>>>> separate thread. > >>>>>>>> > >>>>>>>> > >>>>>>>> Best, > >>>>>>>> Bruno > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On 10/5/23 7:44 AM, Matthias J. Sax wrote: > >>>>>>>>> I agree to (almost) everything what Bruno said. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> In general, we tend to move away from using getters without > >>> "get", > >>>>>>>>>> recently. So I would keep the "get". > >>>>>>>>> > >>>>>>>>> This is new to me? Can you elaborate on this point? Why do you > >>> think > >>>>>>>>> that's the case? > >>>>>>>>> > >>>>>>>>> I actually did realize (after Walker mentioned it) that existing > >>> query > >>>>>>>>> types use `get` prefix, but to me it seems that it was by > >>> accident and > >>>>>>>>> we should consider correcting it? Thus, I would actually prefer > >>> to not > >>>>>>>>> add the `get` prefix for new methods query types. > >>>>>>>>> > >>>>>>>>> IMHO, we should do a follow up KIP to deprecate all methods with > >>> `get` > >>>>>>>>> prefix and replace them with new ones without `get` -- it's of > >>> course > >>>>>>>>> always kinda "unnecessary" noise, but if we don't do it, we might > >>> get > >>>>>>>>> into more and more inconsistent naming what would result in a > >>> "bad" > >>>>>>>>> API. > >>>>>>>>> > >>>>>>>>> If we indeed want to change the convention and use the `get` > >>> prefix, I > >>>>>>>>> would strongly advocate to bit the bullet and do KIP to > >>> pro-actively > >>>>>>>>> add the `get` "everywhere" it's missing... But overall, it seems > >>> to be > >>>>>>>>> a much broader decision, and we should get buy in from many > >>> committers > >>>>>>>>> about it -- as long as there is no broad consensus to add `get` > >>>>>>>>> everywhere, I would strongly prefer not to diverge from the > >>> current > >>>>>>>>> agreement to omit `get`. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -Matthias > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 10/4/23 2:36 AM, Bruno Cadonna wrote: > >>>>>>>>>> Hi, > >>>>>>>>>> > >>>>>>>>>> Regarding tombstones: > >>>>>>>>>> As far as I understand, we need to add either a validTo field to > >>>>>>>>>> VersionedRecord or we need to return tombstones, otherwise the > >>> result > >>>>>>>>>> is not complete, because users could never know a record was > >>> deleted > >>>>>>>>>> at some point before the second non-null value was put. > >>>>>>>>>> I like more adding the validTo field since it makes the result > >>> more > >>>>>>>>>> concise and easier interpretable. > >>>>>>>>>> > >>>>>>>>>> Extending on Victoria's example, with the following puts > >>>>>>>>>> > >>>>>>>>>> put(k, v1, time=0) > >>>>>>>>>> put(k, null, time=5) > >>>>>>>>>> put(k, null, time=10) > >>>>>>>>>> put(k, null, time=15) > >>>>>>>>>> put(k, v2, time=20) > >>>>>>>>>> > >>>>>>>>>> the result with tombstones would be > >>>>>>>>>> > >>>>>>>>>> value, timestamp > >>>>>>>>>> (v1, 0) > >>>>>>>>>> (null, 5) > >>>>>>>>>> (null, 10) > >>>>>>>>>> (null, 15) > >>>>>>>>>> (v2, 20) > >>>>>>>>>> > >>>>>>>>>> instead of > >>>>>>>>>> > >>>>>>>>>> value, timestamp, validTo > >>>>>>>>>> (v1, 0, 5) > >>>>>>>>>> (v2, 20, null) > >>>>>>>>>> > >>>>>>>>>> The benefit of conciseness would already apply to one single > >>>>>>>>>> tombstone. > >>>>>>>>>> > >>>>>>>>>> On the other hand, why would somebody write consecutive > >>> tombstones > >>>>>>>>>> into a versioned state store? I guess if somebody does that on > >>>>>>>>>> purpose, then there should be a way to retrieve each of those > >>>>>>>>>> tombstones, right? > >>>>>>>>>> So maybe we need both -- validTo field and the option to return > >>>>>>>>>> tombstones. The latter might be moved to a future KIP in case we > >>> see > >>>>>>>>>> the need. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Regarding .within(fromTs, toTs): > >>>>>>>>>> I would keep it simple with .from() and .asOfTimestamp() (or > >>>>>>>>>> .until()). If we go with .within(), I would opt for > >>>>>>>>>> .withinTimeRange(fromTs, toTs), because the query becomes more > >>>>>>> readable: > >>>>>>>>>> > >>>>>>>>>> MultiVersionedKeyQuery > >>>>>>>>>> .withKey(1) > >>>>>>>>>> .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), > >>>>>>>>>> Instant.parse(2023-08-04T10:37:30.00Z)) > >>>>>>>>>> > >>>>>>>>>> If we stay with .from() and .until(), we should consider > >>> .fromTime() > >>>>>>>>>> and .untilTime() (or .toTime()): > >>>>>>>>>> > >>>>>>>>>> MultiVersionedKeyQuery > >>>>>>>>>> .withKey(1) > >>>>>>>>>> .fromTime(Instant.parse(2023-08-03T10:37:30.00Z)) > >>>>>>>>>> .untilTime(Instant.parse(2023-08-04T10:37:30.00Z)) > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Regarding asOf vs. until: > >>>>>>>>>> I think asOf() is more used in point in time queries as Walker > >>>>>>>>>> mentioned where this KIP specifies a time range. IMO asOf() fits > >>> very > >>>>>>>>>> well with KIP-960 where one version is queried, but here I think > >>>>>>>>>> .until() fits better. That might just be a matter of taste and > >>> in the > >>>>>>>>>> end I am fine with both as long as it is well documented. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Regarding getters without "get": > >>>>>>>>>> In the other IQv2 classes we used getters with "get". In > >>> general, we > >>>>>>>>>> tend to move away from using getters without "get", recently. So > >>> I > >>>>>>>>>> would keep the "get". > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Best, > >>>>>>>>>> Bruno > >>>>>>>>>> > >>>>>>>>>> On 10/3/23 7:49 PM, Walker Carlson wrote: > >>>>>>>>>>> Hey Alieh thanks for the KIP, > >>>>>>>>>>> > >>>>>>>>>>> Weighing in on the AsOf vs Until debate I think either is fine > >>>>>>>>>>> from a > >>>>>>>>>>> natural language perspective. Personally AsOf makes more sense > >>> to me > >>>>>>>>>>> where > >>>>>>>>>>> until gives me the idea that the query is making a change. It's > >>>>>>>>>>> totally a > >>>>>>>>>>> connotative difference and not that important. I think as of is > >>>>>>>>>>> pretty > >>>>>>>>>>> frequently used in point of time queries. > >>>>>>>>>>> > >>>>>>>>>>> Also for these methods it makes sense to drop the "get" We > don't > >>>>>>>>>>> normally use that in getters > >>>>>>>>>>> > >>>>>>>>>>> * The key that was specified for this query. > >>>>>>>>>>> */ > >>>>>>>>>>> public K getKey(); > >>>>>>>>>>> > >>>>>>>>>>> /** > >>>>>>>>>>> * The starting time point of the query, if specified > >>>>>>>>>>> */ > >>>>>>>>>>> public Optional<Instant> getFromTimestamp(); > >>>>>>>>>>> > >>>>>>>>>>> /** > >>>>>>>>>>> * The ending time point of the query, if specified > >>>>>>>>>>> */ > >>>>>>>>>>> public Optional<Instant> getAsOfTimestamp(); > >>>>>>>>>>> > >>>>>>>>>>> Other than that I didn't have too much to add. Overall I like > >>> the > >>>>>>>>>>> direction > >>>>>>>>>>> of the KIP and think the funcatinlyt is all there! > >>>>>>>>>>> best, > >>>>>>>>>>> Walker > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax < > >>> mj...@apache.org> > >>>>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> Thanks for the updated KIP. Overall I like it. > >>>>>>>>>>>> > >>>>>>>>>>>> Victoria raises a very good point, and I personally tend to > >>>>>>>>>>>> prefer (I > >>>>>>>>>>>> believe so does Victoria, but it's not totally clear from her > >>>>>>>>>>>> email) if > >>>>>>>>>>>> a range query would not return any tombstones, ie, only two > >>> records > >>>>>>> in > >>>>>>>>>>>> Victoria's example. Thus, it seems best to include a `validTo` > >>>>>>>>>>>> ts-field > >>>>>>>>>>>> to `VersionedRecord` -- otherwise, the retrieved result cannot > >>> be > >>>>>>>>>>>> interpreted correctly. > >>>>>>>>>>>> > >>>>>>>>>>>> Not sure what others think about it. > >>>>>>>>>>>> > >>>>>>>>>>>> I would also be open to actually add a `includeDeletes()` (or > >>>>>>>>>>>> `includeTombstones()`) method/flag (disabled by default) to > >>> allow > >>>>>>>>>>>> users > >>>>>>>>>>>> to get all tombstone: this would only be helpful if there are > >>> two > >>>>>>>>>>>> consecutive tombstone though (if I got it right), so not sure > >>> if we > >>>>>>>>>>>> want > >>>>>>>>>>>> to add it or not -- it seems also possible to add it later if > >>> there > >>>>>>> is > >>>>>>>>>>>> user demand for it, so it might be a premature addition as > this > >>>>>>> point? > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Nit: > >>>>>>>>>>>> > >>>>>>>>>>>>> the public interface ValueIterator is used > >>>>>>>>>>>> > >>>>>>>>>>>> "is used" -> "is added" (otherwise it sounds like as if > >>>>>>>>>>>> `ValueIterator` > >>>>>>>>>>>> exist already) > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Should we also add a `.within(fromTs, toTs)` (or maybe some > >>> better > >>>>>>>>>>>> name?) to allow specifying both bounds at once? The existing > >>>>>>>>>>>> `RangeQuery` does the same for specifying the key-range, so > >>>>>>>>>>>> might be > >>>>>>>>>>>> good to add for time-range too? > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> -Matthias > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> On 9/6/23 5:01 AM, Bruno Cadonna wrote: > >>>>>>>>>>>>> In my last e-mail I missed to finish a sentence. > >>>>>>>>>>>>> > >>>>>>>>>>>>> "I think from a KIP" > >>>>>>>>>>>>> > >>>>>>>>>>>>> should be > >>>>>>>>>>>>> > >>>>>>>>>>>>> "I think the KIP looks good!" > >>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 9/6/23 1:59 PM, Bruno Cadonna wrote: > >>>>>>>>>>>>>> Hi Alieh, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks for the KIP! > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I think from a KIP > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 1. > >>>>>>>>>>>>>> I propose to throw an IllegalArgumentException or an > >>>>>>>>>>>>>> IllegalStateException for meaningless combinations. In any > >>> case, > >>>>>>> the > >>>>>>>>>>>>>> KIP should specify what exception is thrown. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 2. > >>>>>>>>>>>>>> Why does not specifying a range return the latest version? I > >>>>>>>>>>>>>> would > >>>>>>>>>>>>>> expect that it returns all versions since an empty lower or > >>> upper > >>>>>>>>>>>>>> limit is interpreted as no limit. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 3. > >>>>>>>>>>>>>> I second Matthias comment about replacing "asOf" with > >>> "until" or > >>>>>>>>>>>>>> "to". > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 4. > >>>>>>>>>>>>>> Do we need "allVersions()"? As I said above I would return > >>> all > >>>>>>>>>>>>>> versions if no limits are specified. I think if we get rid > of > >>>>>>>>>>>>>> allVersions() there might not be any meaningless > combinations > >>>>>>>>>>>>>> anymore. > >>>>>>>>>>>>>> If a user applies twice the same limit like for example > >>>>>>>>>>>>>> MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last > >>> one > >>>>>>>>>>>>>> wins. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 5. > >>>>>>>>>>>>>> Could you add some more examples with time ranges to the > >>> example > >>>>>>>>>>>> section? > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 6. > >>>>>>>>>>>>>> The KIP misses the test plan section. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 7. > >>>>>>>>>>>>>> I propose to rename the class to "MultiVersionKeyQuery" > >>> since we > >>>>>>> are > >>>>>>>>>>>>>> querying multiple versions of the same key. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 8. > >>>>>>>>>>>>>> Could you also add withAscendingTimestamps()? IMO it gives > >>> users > >>>>>>> the > >>>>>>>>>>>>>> possibility to make their code more readable instead of only > >>>>>>> relying > >>>>>>>>>>>>>> on the default. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>> Bruno > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On 8/17/23 4:13 AM, Matthias J. Sax wrote: > >>>>>>>>>>>>>>> Thanks for splitting this part into a separate KIP! > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> For `withKey()` we should be explicit that `null` is not > >>>>>>>>>>>>>>> allowed. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> (Looking into existing `KeyQuery` it seems the JavaDocs > >>> don't > >>>>>>> cover > >>>>>>>>>>>>>>> this either -- would you like to do a tiny cleanup PR for > >>>>>>>>>>>>>>> this, or > >>>>>>>>>>>>>>> fix on-the-side in one of your PRs?) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> The key query returns all the records that are valid in > the > >>>>>>>>>>>>>>>> time > >>>>>>>>>>>>>>>> range starting from the timestamp {@code fromTimestamp}. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> In the JavaDocs you use the phrase `are valid` -- I think > we > >>>>>>>>>>>>>>> need to > >>>>>>>>>>>>>>> explain what "valid" means? It might even be worth to add > >>> some > >>>>>>>>>>>>>>> examples. It's annoying, but being precise if kinda > >>> important. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> With regard to KIP-962, should we allow `null` for time > >>> bounds ? > >>>>>>>>>>>>>>> The > >>>>>>>>>>>>>>> JavaDocs should also be explicit if `null` is allowed or > >>> not and > >>>>>>>>>>>>>>> what > >>>>>>>>>>>>>>> the semantics are if allowed. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> You are using `asOf()` however, because we are doing > >>> time-range > >>>>>>>>>>>>>>> queries, to me using `until()` to describe the upper bound > >>> would > >>>>>>>>>>>>>>> sound better (I am not a native speaker though, so maybe I > >>> am > >>>>>>> off?) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> The key query returns all the records that have timestamp > >>> <= > >>>>>>>>>>>>>>>> {@code > >>>>>>>>>>>>>>>> asOfTimestamp}. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> This is only correct if not lower-bound is set, right? > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> In your reply to KIP-960 you mentioned: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> the meaningless combinations are prevented by throwing > >>>>>>> exceptions. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> We should add corresponding JavaDocs like: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> @throws IllegalArgumentException if {@code > >>>>>>>>>>>>>>> fromTimestamp} is > >>>>>>>>>>>>>>> equal or > >>>>>>>>>>>>>>> larger than {@code > >>>>>>>>>>>>>>> untilTimestamp} > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Or something similar. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> With regard to KIP-960: if we need to introduce a > >>>>>>>>>>>>>>> `VersionedKeyQuery` > >>>>>>>>>>>>>>> class for single-key-single-ts lookup, would we need to > find > >>>>>>>>>>>>>>> a new > >>>>>>>>>>>>>>> name for the query class of this KIP, given that the return > >>> type > >>>>>>> is > >>>>>>>>>>>>>>> different? > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -Matthias > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On 8/16/23 10:57 AM, Alieh Saeedi wrote: > >>>>>>>>>>>>>>>> Hi all, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I splitted KIP-960 > >>>>>>>>>>>>>>>> < > >>>>>>>>>>>> > >>>>>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-960%3A+Support+single-key_single-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores > >>>>>>>>>>>>> > >>>>>>>>>>>>>>>> into three separate KIPs. Therefore, please continue > >>>>>>>>>>>>>>>> discussions > >>>>>>>>>>>>>>>> about single-key, multi-timestamp interactive queries > >>> here. You > >>>>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>> see all > >>>>>>>>>>>>>>>> the addressed reviews on the following page. Thanks in > >>> advance. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> KIP-968: Support single-key_multi-timestamp interactive > >>> queries > >>>>>>>>>>>>>>>> (IQv2) for > >>>>>>>>>>>>>>>> versioned state stores > >>>>>>>>>>>>>>>> < > >>>>>>>>>>>> > >>>>>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-968%3A+Support+single-key_multi-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores > >>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I look forward to your feedback! > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>> Alieh > >>>>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>> > >>>>>> > >>> > >> > > >