Hi Alieh,

thanks a lot for the KIP. IQ with time semantics is going to be
another great improvement towards having crystal clear streaming
semantics!

1. I agree with Bruno and Matthias, to remove the 'bound' term for the
timestamps. It's confusing that we have bounds for both timestamps and
keys. In particular, `withNoBoundWithTimestampBound` seems to
contradict itself.

2. I would personally prefer having composable construction of the
query, instead of defining a separate method for each combination. So
for example:
- `keyRangeLatestValue(l,u)` ->  `withBounds(l, u).latest()`
- `withNoBoundsWithTimestampRange(t1,t2)` ->
`withNoBounds().fromTime(t1).untilTime(t2)`
- etc.pp.
This would have the advantage, that the interface would be very
similar to `RangeQuery` and we'd need a lot fewer methods, so it will
make the API reference a much quicker read. We already use this style
to define `skipCache` in `KeyQuery`. I guess that diverges quite a bit
from the current proposal, but I'll leave it here anyways for you to
consider it (even if you decide to stick with the current model).

4. Please make sure to specify in every range-based method whether the
bounds are inclusive or exclusive. I see it being mentioned for some
methods, but for others, this is omitted. As I understand, 'until' is
usually used to mean exclusive, and 'from' is usually used to mean
inclusive, but it's better to specify this in the javadoc.

5. Similarly, as Matthias says, specify what happens if the "validity
range" of a value overlaps with the query range. So, to clarify his
remark, what happens if the value v1 is inserted at time 1 and value
v2 is inserted at time 3, and I query for the range `[2,4]` - will the
result include v1 or not? It's the valid value at time 2. For
inspiration, in `WindowRangeQuery`, this important semantic detail is
even clear from the method name `withWindowStartRange`.

6. For iterators, it is convention to call the method `peek` and this
convention followed by e.g. `AbstractIterator` in Kafka, but also
Guava, Apache Commons etc. So I would also call it `peek`, not
`peekNextValue` here. It's clear what we are peeking at.

Cheers,
Lucas

On Thu, Jul 27, 2023 at 3:07 PM Alieh Saeedi
<asae...@confluent.io.invalid> wrote:
>
> Thanks, Bruno, for the feedback.
>
>
>    - I agree with both points 2 and 3. About 3: Having "VersionsQualifier"
>    reduces the number of methods and makes everything less confusing. At the
>    end, that will be easier to use for the developers.
>    - About point 4: I renamed all the properties and parameters from
>    "asOfTimestamp" to "fromTimestamp". That was my misunderstanding. So Now we
>    have these two timestamp bounds: "fromTimestamp" and "untilTimestamp".
>    - About point 5: Do we need system tests here? I assumed just
>    integration tests were enough.
>    - Regarding long vs timestamp instance: I think yes, that 's why I used
>    Long as timestamp.
>
> Bests,
> Alieh
>
>
>
>
>
>
> On Thu, Jul 27, 2023 at 2:28 PM Bruno Cadonna <cado...@apache.org> wrote:
>
> > Hi Alieh,
> >
> > Thanks for the KIP!
> >
> >
> > Here my feedback.
> >
> > 1.
> > You can remove the private fields and constructors from the KIP. Those
> > are implementation details.
> >
> >
> > 2.
> > Some proposals for renamings
> >
> > in VersionedKeyQuery
> >
> > withKeyWithTimestampBound()
> >    -> withKeyAndAsOf()
> >
> > withKeyWithTimestampRange()
> >    -> withKeyAndTimeRange()
> >
> > in VersionedRangeQuery
> >
> > KeyRangeWithTimestampBound()
> >    -> withKeyRangeAndAsOf()
> >
> > withLowerBoundWithTimestampBound()
> >    -> withLowerBoundAndAsOf()
> >
> > withUpperBoundWithTimestampBound()
> >    -> withUpperBoundAndAsOf()
> >
> > withNoBoundWithTimestampBound()
> >    -> withNoBoundsAndAsOf
> >
> > keyRangeWithTimestampRange()
> >    -> withKeyRangeAndTimeRange()
> >
> > withLowerBoundWithTimestampRange()
> >    -> withLowerBoundAndTimeRange()
> >
> > withUpperBoundWithTimestampRange()
> >    -> withUpperBounfAndTimeRange()
> >
> > withNoBoundWithTimestampRange()
> >    -> withNoBoundsAndTimeRange()
> >
> >
> > 3.
> > Would it make sense to merge
> > withKeyLatestValue(final K key)
> > and
> > withKeyAllVersions(final K key)
> > into
> > withKey(final K key, final VersionsQualifier versionsQualifier)
> > where VersionsQualifier is an enum with values (ALL, LATEST). We could
> > also add EARLIEST if we feel it might be useful.
> > Same applies to all methods that end in LatestValue or AllVersions
> >
> >
> > 4.
> > I think getAsOfTimestamp() should not return the lower bound. If I query
> > a version as of a timestamp then the query should return the latest
> > version less than the timestamp.
> > I propose to rename the getters to getTimeFrom() and getTimeTo() as in
> > WindowRangeQuery.
> >
> >
> > 5.
> > Please add the Test Plan section.
> >
> >
> > Regarding long vs Instant: Did we miss to use Instant instead of long
> > for all interfaces of the versioned state stores?
> >
> >
> > Best,
> > Bruno
> >
> >
> >
> >
> >
> >
> >
> >
> > On 7/26/23 11:40 PM, Matthias J. Sax wrote:
> > > Thanks for the KIP Alieh. Glad to see that we can add IQ to the new
> > > versioned stores!
> > >
> > >
> > >
> > > Couple of questions:
> > >
> > >> single-key lookup with timestamp (upper) bound
> > >
> > > Not sure if "bound" is the right term? In the end, it's a point lookup
> > > for a key plus timestamps, so it's an as-of timestamp (not a bound)? Of
> > > course, the returned record would most likely have a different (smaller)
> > > timestamp, but that's expected but does not make the passed in timestamp
> > > a "bound" IMHO?
> > >
> > >> single-key query with timestamp range
> > >> single-key all versions query
> > >
> > > Should we also add `withLowerTimeBound` and `withUpperTimeBound`
> > > (similar to what `RangeQuery` has)?
> > >
> > > Btw: I think we should not pass `long` for timestamps, but `Instance`
> > > types.
> > >
> > > For time-range queries, do we iterate over the values in timestamp
> > > ascending order? If yes, the interface should specify it? Also, would it
> > > make sense to add reverse order (also ok to exclude and only do if there
> > > is demand in a follow up KIP; if not, please add to "Rejected
> > > alternatives" section).
> > >
> > > Also, for time-range query, what are the exact bound for stuff we
> > > include? In the end, a value was a "valid range" (conceptually), so do
> > > we include a record if it's valid range overlaps the search time-range,
> > > or must it be fully included? Or would we only say, that the `validFrom`
> > > timestamp that is stored must be in the search range (what implies that
> > > the lower end would be a non-overlapping but "fully included" bound,
> > > while the upper end would be a overlapping bound).
> > >
> > > For key-range / time-range queries: do we return the result in `<k,ts>`
> > > order or `<ts,k>` order? Also, what about reverse iterators?
> > >
> > > About ` ValueIterator` -- think the JavaDocs have c&p error in it for
> > > `peekNextRecord` (also, should it be called `peekNextValue`? (Also some
> > > other JavaDocs seem to be incomplete and not describe all parameters?)
> > >
> > >
> > > Thanks.
> > >
> > >
> > >
> > > -Matthias
> > >
> > >
> > >
> > > On 7/26/23 7:24 AM, Alieh Saeedi wrote:
> > >> Hi all,
> > >>
> > >> I would like to propose a KIP to support IQv2 for versioned state
> > stores.
> > >>
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-960%3A+Support+interactive+queries+%28IQv2%29+for+versioned+state+stores
> > >>
> > >> Looking forward to your feedback!
> > >>
> > >> Cheers,
> > >> Alieh
> > >>
> >

Reply via email to