Thanks for the update, Patrick!

Tl;dr: I'm +1 (binding)

I just reviewed the KIP again (I hope you don't mind, I
fixed a couple of missed renames in the text and examples).

One of the design of IQv2 is to make proposing and evolving
these queries much less onerous. Unlike adding new methods
to all StateStore interfaces and implementations, if we want
to make (eg) the WindowRangeQuery more flexible in the
future, we can easily do so by just adding some builder
methods to set the bounds independently, or by adding new
static methods to provide different parameterizations.

Therefore, even though this is not the ultimate expression
of what we think the range queries should be, it's a
perfectly fine starting point. The most pressing concerns to
me were the cases where Luke and Guozhang pointed out that
some parts of the proposal or interfaces were ambiguous,
which looks like it's fixed now.

My preference would be to go ahead and accept this proposed
scope so that we have at least some basic key-value, window,
and session store queries in the code base. Until we have
those MVP queries in place, we can't really start to address
higher-level follow-up items like:
https://issues.apache.org/jira/browse/KAFKA-13541 (to refine
how the serdes interplay with the queries) or
https://issues.apache.org/jira/browse/KAFKA-13541 (to
consider alternative ways to pin down the generic type
parameters better).

Anyway, for the current version of the KIP, my vote is +1
(binding).

Thanks again!
-John

On Wed, 2021-12-15 at 12:53 +0100, Patrick Stuedi wrote:
> Thanks everyone for the sharing comments and suggestions, this is
> very helpful!
> 
> I have updated the KIP based on the suggestions, please let me know what
> you think. Here are the main changes:
> - Removed constructor in WindowRangeQuery (Guozhang)
> - Clearly specified how each of the instantiation of the different query
> types maps to specific operations in WindowStore and SessionStore (Guozhang)
> - Renamed the window start end time parameters and getter methods (Luke)
> - Added some comments on the use of WindowRangeQuery.fromKey (Guozhang,
> John)
> 
> The KIP currently takes a minimalist approach to move things forward. There
> are two follow ups I can think of from this KIP:
> * Add additional parameter combinations to WindowRangeQuery, e.g., support
> for key range with and without window ranges.
> * Remove WindowRangeQuery.fromKey and either use
> WindowKeyQuery.withKeyAndWindowBounds or add a new call like
> WindowKeyQuery.withKeyAndNoWindowBounds(): either way that would then raise
> the question which operation in WindowStore this query should map to, given
> that WindowKeyQuery is templated against WindowStoreIterator and the
> current use of WindowRangeQuery.fromKey is to call SessionStore.fetch which
> returns a KeyValueIterator.
> * Combine WindowKeyQuery and WindowRangeQuery, e.g., by exposing the
> template type
> * Create a "Builder" interface for the query types to avoid blowing up the
> static methods.
> 
> Since I think any of those tasks will require some more discussion, and in
> the interest of being pragmatic here, I suggest we defer those
> optimizations to future KIPs.
> 
> Best,
>   Patrick
> 
> On Tue, Dec 14, 2021 at 1:02 AM Guozhang Wang <wangg...@gmail.com> wrote:
> 
> > Hi John,
> > 
> > Please see my follow-up comments inlined below.
> > 
> > On Mon, Dec 13, 2021 at 2:26 PM John Roesler <vvcep...@apache.org> wrote:
> > 
> > > Hi Patrick, thanks for the KIP!
> > > 
> > > I hope you, Guozhang, and Luke don't mind if I share some
> > > thoughts:
> > > 
> > > 1. I think you just meant to remove that private constructor
> > > from the KIP, right?
> > > 
> > 
> > Yes.
> > 
> > 
> > > 
> > > 2. I think WindowRangeQuery#withWindowRage(windowLower,
> > > windowUpper) is the version that yields an iterator over
> > > both keys and windows, so it's not totally redundant.
> > > 
> > > However, it does seem like WindowRangeQuery#withKey is
> > > equivalent to WindowKeyQuery#withKey . I overlooked it
> > > before, but we probably don't need both.
> > > 
> > > Stepping back, it seems like we could either just de-
> > > duplicate that specific withKey query or we could even just
> > > merge both use cases into the WindowRangeQuery. I've
> > > personally never been convinced that the extra complexity of
> > > having the WindowStoreIterator is worth the savings of not
> > > duplicating the key on each record. Maybe it would be if we
> > > separately deserialized the same key over and over again,
> > > but that seems optimizable.
> > > 
> > > On the other hand, that's more work, and the focus for IQv2
> > > right now is to just get some MVP of queries implemented to
> > > cover basic use cases, so it might be worthwhile to keep it
> > > simple (by just dropping `WindowRangeQuery#withKey`). I'm
> > > happy with whatever call Patrick wants to make here, since
> > > it's his work to do.
> > > 
> > 
> > My original comments was actually referring to that `WindowRangeQuery` does
> > not allow users to specify a range of [keyFrom: windowFrom, keyTo:
> > windowTo], but only [key: windowFrom, key: windowTo], or [-INF: windowFrom,
> > INF: windowTo] if we do not specify the key but only do `withWindowRage`. I
> > was not sure if this is intentional by design, i.e. that the only
> > difference between the two classes is that the latter allows [-INF:
> > windowFrom, INF: windowTo], and and I was assuming it's now, but otherwise
> > these two are then very much alike.
> > 
> > So I guess my question is that, is it by-design to not allow users do
> > something like `query.withWindowRange(..).withKeyRange(..)`?
> > 
> > 
> > > 
> > > 3. I got the impression that WindowRangeQuery was intended
> > > for use with both Window and Session stores, but that might
> > > be my assumption.
> > > 
> > > 
> > Today the session store's query API is at least different on the names,
> > e.g. "findSessions", so I'm not sure if the proposal intend to replace
> > these functions with the WindowRangeQuery such that:
> > 
> > sessionStore.findSessions(K, Ins, Ins) -> query.withKey().withWindowRange()
> > sessionStore.findSessions(K, K, Ins, Ins) ->
> > query.withKeyRange().withWindowRange()   // again, assuming we would want
> > `withKeyRange` as in 2) above.
> > sessionStore.fetch(K) -> query.withKey()
> > sessionStore.fetch(K, K) -> query.withKeyRange()
> > 
> > Also, sessionStore.fetchSession(K, Ins, Ins) seems would not be supported
> > with the new API. I'm not enforcing that we have the complete coverage in
> > this KIP, and I'm with you that we would focus on getting some MVP out
> > sooner, but I'd like our KIP to clarify on what's really covered comparing
> > against the old API and what's not.
> > 
> > 
> > 
> > > 4. That's good feedback. I think those names were inspired
> > > by the WindowStore methods that just say stuff like
> > > `timeFrom` and `timeTo`. The SessionStore does a better job
> > > of being unambiguous, since it says stuff like
> > > `earliestSessionEndTime` and `latestSessionStartTime`.
> > > 
> > > And, actually, that brings up a point that I think we
> > > overlooked before. While the method signatures of the
> > > WindowStore and SessionStore methods look the same, the
> > > meanings of their arguments are not the same. In particular,
> > > the WindowStore ranges are always in reference to the window
> > > start time, while the SessionStore ranges may be in
> > > reference to the window start time or end time (or different
> > > combinations of both).
> > > 
> > > I suspect that your instinct to cover both stores with the
> > > same Query is correct, and we just need to be specific about
> > > the kinds of ranges we're talking about. For example,
> > > instead of withWindowRange, maybe we would have:
> > > 
> > > withWindowStartRange(
> > >   Instant earliestWindowStartTime,
> > >   Instant latestWindowStartTime
> > > );
> > > 
> > > Then in the future, we could add stuff like:
> > > 
> > > withWindowStartAndEndRange(
> > >   Instant earliestWindowStartTime,
> > >   Instant latestWindowEndTime
> > > );
> > > 
> > > Etc.
> > > 
> > > 5. I think Luke's question reveals how weirdly similar but
> > > not the same these two queries are. It's not your fault,
> > > this is inherited from the existing set of weirdly-similar-
> > > but-not-the-same store methods. The thing that distinguishes
> > > the WindowKeyQuery from the WindowRangeQuery is:
> > > * WindowKeyQuery: all results share the same K key (cf point
> > > 2: I don't think we need `WindowRangeQuery#withKey(k)`).
> > > Therefore, the iterator is just (windowStartTime, value)
> > > pairs.
> > > * WindowRangeQuery: the results may have different keys, so
> > > the iterator is (Windowed<K>, value) pairs, where
> > > Windowed<K> is basically a (windowStartTime, K key) pair.
> > > 
> > > Therefore, we could support a WindowRangeQuery with a fixed
> > > key, but it would just be another way to express the same
> > > thing as the WindowKeyQuery with the same parameters.
> > > 
> > > 
> > As I suggested, we do not necessarily need to cover all existing cases
> > compared with the old API, but it's better to be very clear on what's
> > actually covered v.s. what's not. For example in WindowRangeQuery, I'd like
> > to clarify if all the following are intended in this KIP's scope:
> > 
> > * query.withKey().withWindowRange()
> > * query.withKey() // range all window
> > * query.withWindowRange() // range all keys.
> > 
> > 
> > > As in point 2, I do think it might be worth converging all
> > > use cases into the WindowRangeQuery, but we can also just
> > > shoot for parity now and defer elegance for the future.
> > > 
> > > 6. I'll just add one question of my own here: If we do keep
> > > both query classes, then I think we would drop `withKey` and
> > > `getKey` from the WindowRangeQuery. But if we do wind up
> > > keeping the WindowRangeQuery class, I think we should
> > > reconsider the name of the getter, since there are other
> > > range queries that support a range of keys. Is there a
> > > getter name we can use that would still work if we come back
> > > later and add lower and upper key bounds?
> > > 
> > > Thanks again for the KIP!
> > > John
> > > 
> > > On Mon, 2021-12-13 at 16:35 +0800, Luke Chen wrote:
> > > > Hi Patrick,
> > > > 
> > > > Thanks for the KIP!
> > > > 
> > > > I have some comments, in addition to Guozhang's comments:
> > > > 4. The parameter names `windowLower` and `windowUpper` are kind of
> > > > ambiguous to me. Could we come up a better name for it, like
> > > > `windowStartTime`, `windowEndTime`, or even we don't need the "window"
> > > > name, just `startTime` and `endTime`?
> > > > 5. Why can't we support window range query with a key within a time
> > > range?
> > > > You might need to explain in the KIP.
> > > > 
> > > > Thank you.
> > > > Luke
> > > > 
> > > > 
> > > > On Sat, Dec 11, 2021 at 7:54 AM Guozhang Wang <wangg...@gmail.com>
> > > wrote:
> > > > 
> > > > > Hi Patrick,
> > > > > 
> > > > > I made a pass on the KIP and have a few comments below:
> > > > > 
> > > > > 1. The `WindowRangeQuery` has a private constructor while the
> > > > > `WindowKeyQuery` has not, is that intentional?
> > > > > 
> > > > > 2. The `WindowRangeQuery` seems not allowing to range over both
> > window
> > > and
> > > > > key, but only window with a fixed key, in that case it seems pretty
> > > much
> > > > > the same as the other (ignoring the constructor), since we know we
> > > would
> > > > > only have a single `key` value in the returned iterator, and hence it
> > > seems
> > > > > returning in the form of `WindowStoreIterator<V>` is also fine as the
> > > key
> > > > > is fixed and hence no need to maintain it in the returned iterator.
> > I'm
> > > > > wondering should we actually support ranging over keys as well in
> > > > > `WindowRangeQuery`?
> > > > > 
> > > > > 3. The KIP title mentioned both session and window, but the APIs only
> > > > > involves window stores; However the return type `WindowStoreIterator`
> > > is
> > > > > only for window stores not session stores, so I feel we would still
> > > have
> > > > > some differences for session window query interface?
> > > > > 
> > > > > 
> > > > > Guozhang
> > > > > 
> > > > > On Fri, Dec 10, 2021 at 1:32 PM Patrick Stuedi
> > > > > <pstu...@confluent.io.invalid>
> > > > > wrote:
> > > > > 
> > > > > > Hi everyone,
> > > > > > 
> > > > > > I would like to start the vote for KIP-806 that adds window and
> > > session
> > > > > > query support to query KV-stores using IQv2.
> > > > > > 
> > > > > > The KIP can be found here:
> > > > > > https://cwiki.apache.org/confluence/x/LJaqCw
> > > > > > 
> > > > > > Skipping the discussion phase as this KIP is following the same
> > > pattern
> > > > > as
> > > > > > the previously submitted KIP-805 (KIP:
> > > > > > https://cwiki.apache.org/confluence/x/85OqCw, Discussion:
> > > > > > https://tinyurl.com/msp5mcb2). Of course concerns/comments can
> > > still be
> > > > > > brought up in this thread.
> > > > > > 
> > > > > > -Patrick
> > > > > > 
> > > > > 
> > > > > 
> > > > > --
> > > > > -- Guozhang
> > > > > 
> > > 
> > > 
> > 
> > --
> > -- Guozhang
> > 

Reply via email to