Awesome summary (seriously) -- would you kindly offer your organizational
skills to every ongoing KIP from henceforth? We need you :P

A few answers/comments:

A2: I think there is a 3rd sub-option here, which is to leave
versioned-ness out of this KIP entirely, return only the non-versioned
stores for now, and then switch over to the versioned stores (only) when
the time comes to flip the switch on making them the default across the
DSL. This has the advantage of retaining the current behavior/semantics and
provides a clear way to transition smoothly in the future, since it seems
we will want to cut to all versioned state stores rather than offer users a
choice between versioned or non-versioned stores going forward (similar to
how we only offer timestamped stores presently, and have completely
replaced non-timestamped stores in the DSL.) . In both the timestamped and
versioned cases, the old stores are/will still be available or accessible
to users via the bare StoreSuppliers, should they somehow desire or require
the old store type. Ultimately, I think either this or option (1) would be
preferable, though I think it should be up to Matthias or anyone else
involved in the versioned stores feature to decide which approach makes
sense in the context of that feature's future plans.

A3: sounds reasonable to me

A5: Also sounds fine to me, though I'll let others chime in with/if they
have an alternative suggestion/preference. I guess the other contender
would be something like DSLStoreImpl or something like that?



On Mon, Jul 24, 2023 at 9:36 AM Almog Gavra <almog.ga...@gmail.com> wrote:

> Lots of thoughts! Happy to see the thriving discussion on this post - lots
> going on so I'm trying to enumerate them to keep things organized (prefix
> "A" for "Almog" so we can use numbers in responses for other things ;P).
>
> A1. Question around closing implementation gaps (e.g. no rocks based
> suppression store)
> A2. Specifically how to handle Versioned stores
> A3. Configuration (new config/reuse old one + new one and ordering of
> config resolution)
> A4. Drawing a line between what is implementation detail (not exposed in
> API) and what is customizable (exposed in API)
> A5. Naming of StoreTypeSpec
> A6. Param classes in StoreBuilders
>
> ------------------------------
>
> Here are summaries for where it seems each of these stands (trying not to
> add any additional opinion yet):
>
> A1. Sophie/Guozhang/Me (if I count hah!) seem to agree that it is worth
> pushing this KIP through independently of the implementation gaps as it
> doesn't seem to move the intermediate state further from the end state.
> Matthias originally had some concerns.
>
> A2. There's questions around whether versioned stores should be their own
> configurable option or whether they are an implementation detail that the
> StoreSpec should decide. It seems like the discussion is converging here,
> this should be an implementation detail.
>
> A3. Matthias/Guozhang prefer adding CUSTOM and then having an additional
> config to determine the implementation. Sophie prefers deprecating the old
> config. Guozhang additional suggested flipping the resolution order such
> that the old config is only respected in a DefaultStoreSpec implementation.
>
> A4. This KIP (or rather, the discussion on the KIP) blurs the lines between
> top level store types (KV, windowed, session) and the underlying
> implementation of them (timestamped, versioned, kv-list). It seems everyone
> is in alignment to ensure that we keep these two things separate and that
> the line is clearly delineated in the text of the KIP.
>
> A5. Guozhang and Sophie agree that the current name StoreTypeSpec is
> misleading, as it's really an implementation spec, not a type
> specification.
>
> A6. Agreement that this is an improvement, Sophie believes this can be done
> in a follow up but we should ensure our naming is good here so there's no
> conflicts down the line.
>
> ---------------------------------
>
> Ok, phew! Hopefully that covers it all! Now for my thoughts, hopefully
> wrapping up some of these discussions:
>
> A1.  @Matthias - are you still hesitant here? What would you need to be
> convinced here?
>
> A2. Since we are all in agreement that versioned stores should be an
> implementation detail, we have two options:
>
> (1) we can extend the KVParams to include a parameter that indicates
> whether or not the store should be versioned
> (2) we can introduce a configuration for whether or not to use a versioned
> store, and each implementation can choose to read/ignore that config
>
> Any preferences? (1) would align more closely with what we are doing today
> (they are a top-level concept in the Stores API).
>
> A3. I like Guozhang's suggestion of making the "primary" configuration to
> be the new one, and then having a "DefaultStoreTypeSpec" (using the old
> naming) which respects the old configuration. That seems to solve nearly
> all the concerns (e.g. it'd be easy to see where the enum is used because
> it would only be used within that one class instead of littered throughout
> the code base).
>
> @Sophie, unless you have objections here I will update the KIP to do that.
>
> A4. I will make these changes to the KIP to make it clear.
>
> A5. I will rename it `DslStorePlugin` - any objections to this name?
>
> A6. Let's punt this ;) I agree with everyone that this would be a welcome
> improvement and that this KIP is aligned with moving in that direction.
> Given how much discussion there was on this KIP, which is minor relative to
> making the changes to StoreBuilder API, I'd rather not tie the two
> together.
>
> Cheers & Thanks everyone for the thoughts!
> - Almog
>
> On Sun, Jul 23, 2023 at 5:15 PM Sophie Blee-Goldman <
> ableegold...@gmail.com>
> wrote:
>
> > Guozhang:
> >
> > On your 2nd point:
> >
> > > "impl types" (in hindsight it may not be a good name) for rocksdb /
> > memory / custom, and we used "store types" for kv / windowed / sessioned
> > etc,
> > First off, thanks so much for this clarification -- using "store type"
> here
> > was definitely making me uncomfortable as this usually refers to KV vs
> > window, etc -- but I just couldn't for the life of me think of the right
> > term for rocks vs IM. We should 100% change to something like
> StoreImplSpec
> > for this kind of interface.
> >
> > > As for list-value store (for stream-stream Join)
> > Again, glad you mentioned this -- I forgot how the extra stream-stream
> join
> > store is not a "regular" KV Store but rather this special list-value
> store.
> > If we proceed with something like the current approach, perhaps that
> should
> > be a boolean (or enum) parameter in the KVConfig, similar to the
> > EmitStrategy? After all, the high-level goal of this KIP is to be able to
> > fully customize all DSL state stores, and this is currently not possible
> > due to KAFKA-14976 <https://issues.apache.org/jira/browse/KAFKA-14976>.
> >
> > If we expect there to be further customizations like this going forward,
> > perhaps we could instead have each of the three StoreConfig classes
> accept
> > a single enum parameter for the "sub-type" (or whatever you want to call
> > it), which would encompass (and replace) things like the EmitStrategy as
> > well as the list-value type (we could define one enum for each Config
> class
> > so there is no accidentally requesting a LIST_VALUE subtype on a
> > WindowStore). Thoughts?
> >
> > Lastly, regarding 3.b:
> >
> > I love that you brought this up because that is actually what I first
> > proposed to Almog, ie introducing a param class to clean up the
> > StoreBuilder API, during our chat that led to this KIP. He pushed back,
> > claiming (rightly so) that this change would be huge in scope for a
> purely
> > aesthetic/API change that doesn't add any functionality, and that it
> makes
> > more sense to start with the DSL config since there is a clear gap in
> > functionality there, particularly when it comes to custom state stores
> > (reasons 1 & 3 in the Motivation section). He did agree that the param
> > classes were a better API, which is why you see them in this KIP. In
> other
> > words: I fully agree that we should apply this improvement to the
> > PAPI/StoreBuilder interface as well, but I think that's a distinct
> concept
> > for the time-being and should not block the DSL improvement. Rather, I
> > consider this KIP as setting the stage for a followup KIP down the line
> to
> > clean up the StoreBuilders and bring them in line with the new
> param/config
> > class approach.
> >
> > That said,  A) we should definitely make sure whatever we introduce here
> > can be extended to the PAPI StoreBuilders in a natural way, and B) I
> should
> > clarify that I personally would be happy to see this included in the
> > current KIP, but as Almog's KIP it's up to him to decide whether he's
> > comfortable expanding the scope like this. If you can convince him where
> I
> > could not, more power to you! :P
> >
> > On Sun, Jul 23, 2023 at 4:48 PM Sophie Blee-Goldman <
> > ableegold...@gmail.com>
> > wrote:
> >
> > > Matthias:
> > >
> > > I'm not sure I agree with (or maybe don't follow) this take:
> > >>
> > >> we need all kind of `StoreTypeSpec` implementations,
> > >> and it might also imply that we need follow up KIPs for new feature
> > >> (like in-memory versioned store) that might not need a KIP otherwise.
> > >>
> > > I see this feature as being a nice add-on/convenience API for any store
> > > types which have a full DSL implementation. I don't think it's
> > unreasonable
> > > to just say that this feature is only going to be available for store
> > types
> > > that have KV, Window, and Session implementations. I can't think of any
> > > case besides versioned stores where this would force a KIP for a new
> > > feature that would not otherwise have to go through a KIP, and even for
> > > versioned state stores, the only issue is that the KIP for that was
> > already
> > > accepted.
> > >
> > > However, I think I agree on your main point -- that things like
> "regular"
> > > vs timestamped vs versioned are/should be an implementation detail
> that's
> > > hidden from the user. As I noted previously, the current KIP actually
> > > greatly improves the situation for timestamped stores, as this would be
> > > handled completely transparently by the OOTB RocksDBStoreSpec. To me,
> > this
> > > provides a very natural way to let the DSL operators using the default
> > > store type/spec to specify which kind of store (eg
> > > versioned/timestamped/etc) it wants, and choose the correct default. If
> > the
> > > eventual intention is to have versioned state stores replace
> timestamped
> > > stores as the default in the DSL, then we can simply swap out the
> > versioned
> > > stores for the timestamped stores in the RocksDBStoreTypeSpec, when
> that
> > > time comes. Until then, users who want to use the versioned store will
> > have
> > > to do what they do today, which is individually override operators via
> > > Materialized/StoreSuppliers.
> > >
> > > All in all, it sounds like we should not offer a versioned store type
> > > spec, as "versioned" is more akin to "timestamped" than to a true
> > > difference in underlying store implementation type (eg rocks vs
> > in-memory).
> > > W.r.t whether to deprecate the old config or introduce a new CUSTOM
> enum
> > > type, either seems fine to me, and we can go with that alternative
> > instead.
> > > The only other con to this approach that I can think of, and I'm
> honestly
> > > not sure if this is something users would care about or only devs, is
> > that
> > > the advantage to moving rocks and IM to the store type spec interface
> is
> > > that it helps to keep the relevant logic encapsulated in one easy place
> > you
> > > can quickly check to tell what kind of state store is used where. In
> the
> > > current code, I found it extremely annoying and difficult to track down
> > all
> > > usages of the StoreType enum to see which actual rocksdb store was
> being
> > > used where (for example some stores using the TimeOrderedBuffer
> variants
> > in
> > > some special cases, or to understand whether the DSL was defaulting to
> > > plain, timestamped, or versioned stores for RocksDB vs InMemory -- both
> > of
> > > which seem like they could be of interest to a user). This would be
> much
> > > easier if everything was handled in one place, and you can just go to
> the
> > > (eg) RocksDBStoreTypeSpec and see what it's doing, or find usages of
> the
> > > methods to understand what stores are being handed to which DSL
> > operators.
> > >
> > > I suppose we could still clean up the API and solve this problem by
> > having
> > > the old (and new) config delegate to a StoreTypeSpec no matter what,
> but
> > > make RocksDBStoreTypeSpec and InMemoryStoreTypeSpec internal classes
> that
> > > are simply implementation details of the ROCKSDB vs IN_MEMORY enums.
> > WDYT?
> > >
> > >
> > > On Sun, Jul 23, 2023 at 11:14 AM Guozhang Wang <
> > guozhang.wang...@gmail.com>
> > > wrote:
> > >
> > >> Thanks everyone for the great discussions so far! I first saw the JIRA
> > >> and left some quick thoughts without being aware of the
> > >> already-written KIP (kudos to Almog, very great one) and the DISCUSS
> > >> thread here. And I happily find some of my initial thoughts align with
> > >> the KIP already :)
> > >>
> > >> Would like to add a bit more of my 2c after reading through the KIP
> > >> and the thread here:
> > >>
> > >> 1. On the high level, I'm in favor of pushing this KIP through without
> > >> waiting on the other gaps to be closed. In my back pocket's
> > >> "dependency graph" of Kafka Streams roadmap of large changes or
> > >> feature gaps, the edges of dependencies are defined based on my
> > >> understanding of whether doing one first would largely complicate /
> > >> negate the effort of the other but not vice versa, in which case we
> > >> should consider getting the other done first. In this case, I feel
> > >> such a dependency is not strong enough, so encouraging the KIP
> > >> contributor to finish what he/she would love to do to close some gaps
> > >> early would be higher priorities. I did not see by just doing this we
> > >> could end up in a worse intermediate stage yet, but I could be
> > >> corrected.
> > >>
> > >> 2. Regarding the store types --- gain here I'd like to just clarify
> > >> the terms a bit since in the past it has some confusions: we used
> > >> "impl types" (in hindsight it may not be a good name) for rocksdb /
> > >> memory / custom, and we used "store types" for kv / windowed /
> > >> sessioned etc, as I said in the JIRA I think the current proposal also
> > >> have a good side effect as quality bar to really enforce us think
> > >> twice when trying to add more store types in the future as it will
> > >> impact API instantiations. In the ideal world, I would consider:
> > >>
> > >> * We have (timestamped) kv store, versioned kv store, window store,
> > >> session store as first-class DSL store types. Some DSL operators could
> > >> accept multiple store types (e.g. versioned and non versioned
> > >> kv-store) for semantics / efficiency trade-offs. But I think we would
> > >> remove un-timestamped kv stores eventually since that efficiency
> > >> trade-off is so minimal compared to its usage limitations.
> > >> * As for list-value store (for stream-stream Join), memory-lru-cache
> > >> (for PAPI use only), memory-time-ordered-buffer (for suppression),
> > >> they would not be exposed as DSL first-class store types in the
> > >> future. Instead, they would be treated as internal used stores (e.g.
> > >> list-value store is built on key-value store with specialized serde
> > >> and putInternal), or continue to be just convenient OOTB PAPI used
> > >> stores only.
> > >> * As we move on, we will continue to be very, very strict on what
> > >> would be added as DSL store types (and hence requires changes to the
> > >> proposed APIs), what to be added as convenient OOTB PAPI store impls
> > >> only, what to be added as internal used store types that should not be
> > >> exposed to users nor customizable at all.
> > >>
> > >> 3. Some more detailed thoughts below:
> > >>
> > >> 3.a) I originally also think that we can extend the existing config,
> > >> rather than replacing it. The difference was that I was thinking that
> > >> order-wise, the runtime would look at the API first, and then the
> > >> config, whereas in your rejected alternative it was looking at the
> > >> config first, and then the API --- that I think is a minor thing and
> > >> either is fine. I'm in agreement that having two configs would be more
> > >> confusing to users to learn about their precedence rather than
> > >> helpful, but if we keep both a config and a public API, then the
> > >> precedence ordering would not be so confusing as long as we state them
> > >> clearly. For example:
> > >>
> > >> * We have DefaultStoreTypeSpec OOTB, in that impl we look at the
> > >> config only, and would only expect either ROCKS or MEMORY, and return
> > >> corresponding OOTB store impls; if any other values configured, we
> > >> error out.
> > >> * Users extend that by having MyStoreTypeSpec, in which they could do
> > >> arbituray things without respecting the config at all, but our
> > >> recommended pattern in docs would still say "look into the config, if
> > >> it is ROCKS or MEMORY just return fall back to DefaultStoreTypeSepc;
> > >> otherwise if it's some String you recognize, then return your
> > >> customized store based on the string value, otherwise error out".
> > >>
> > >> 3.b) About the struct-like Params classes, I like the idea a lot and
> > >> wished we would pursue this in the first place, but if we only do this
> > >> in Spec it would leave some inconsistencies with the StoreBuilders
> > >> though arguably the latter is only for PAPI. I'm wondering if we
> > >> should consider including the changes in StoreBuilders (e.g.
> > >> WindowStoreBuilder(WindowSupplierParams)), and if yes, maybe we should
> > >> also consider renaming that e.g. `WindowSupplierParams` to
> > >> `WindowStoreSpecParams` too? For this one I only have a "weak feeling"
> > >> so I can be convinced otherwise :P
> > >>
> > >> Thanks,
> > >> Guozhang
> > >>
> > >>
> > >>
> > >> On Sun, Jul 23, 2023 at 9:52 AM Matthias J. Sax <mj...@apache.org>
> > wrote:
> > >> >
> > >> > Thanks for all the input. My intention was not to block the KIP, but
> > >> > just to take a step back and try get a holistic picture and
> > discussion,
> > >> > to explore if there are good/viable alternative designs. As said
> > >> > originally, I really like to close this gap, and was always aware
> that
> > >> > the current config is not flexible enough.
> > >> >
> > >> >
> > >> > I guess, my "concern" is that the KIP does increase the API surface
> > area
> > >> > significantly, as we need all kind of `StoreTypeSpec`
> implementations,
> > >> > and it might also imply that we need follow up KIPs for new feature
> > >> > (like in-memory versioned store) that might not need a KIP
> otherwise.
> > >> >
> > >> > The second question is if it might make the already "patchy"
> situation
> > >> > with regard to customization worse.
> > >> >
> > >> > We did de-scope the original KIP-591 for this reason, and given the
> > new
> > >> > situation of the DSL, it seems that it actually got worse compared
> to
> > >> > back in the days.
> > >> >
> > >> > Lastly, I hope to make the new versioned stores the default in the
> DSL
> > >> > and we did not do it in the previous KIP due to backward
> compatibility
> > >> > issues. Thus, from a DSL point of view, I believe there should be
> only
> > >> > "RocksDB", "InMemory", and "Custom" in an ideal world. Introducing
> (I
> > am
> > >> > exaggerating to highlight my point) "KvRocksDbSpec",
> > >> > "TimestampeKvRocksDbSpec", "VersionedRocksDbSpec", plus the
> > >> > corresponding in-memory specs seems to head into the opposite
> > direction.
> > >> > -- My goal is to give users a handle of the _physical_ store
> (RocksDB
> > vs
> > >> > InMemory vs Custom) but not the _logical_ stores (plain kv, ts-kv,
> > >> > versioned) which is "dictated" by the DSL itself and should not be
> > >> > customizable (we are just in a weird intermediate situation that we
> > need
> > >> > to clean up, but not "lean into" IMHO).
> > >> >
> > >> > Thus, I am also not sure if adding "VersionedRocksDbSpec" would be
> > ideal
> > >> > (also, given that it only changes a single store, but not the two
> > >> > windowed stores)?
> > >> >
> > >> > Furthermore, I actually hope that we could use the new versioned
> store
> > >> > to replace the window- and sessions- stores, and thus to decrease
> the
> > >> > number of required store types.
> > >> >
> > >> >
> > >> > Admittedly, I am talking a lot about a potential future, but the
> goal
> > is
> > >> > only to explore opportunities to not get into "worse" intermediate
> > >> > state, that will require a huge deprecation surface area later on.
> Of
> > >> > course, if there is no better way, and my concerns are not shared, I
> > am
> > >> > ok to move forward with the KIP.
> > >> >
> > >> >
> > >> > Bottom line: I would personally prefer to keep the current config
> and
> > >> > add a `Custom` option to it, plus adding one new config that allows
> > >> > people to set their custom `StoreTypeSpec` class. -- I would not
> add a
> > >> > built-in spec for versioned stores at this point (or any other
> > built-in
> > >> > `StorytypeSpec` implementations). I guess, users could write a
> custom
> > >> > spec if they want to enable versioned store across the board for now
> > >> > (until we make them the default anyway)?
> > >> >
> > >> >
> > >> > Hope my train of thoughts is halfway reasonable and not totally off
> > >> track?
> > >> >
> > >> >
> > >> > -Matthias
> > >> >
> > >> > On 7/21/23 15:27, Sophie Blee-Goldman wrote:
> > >> > > I agree with everything Almog said above, and will just add on to
> > two
> > >> > > points:
> > >> > >
> > >> > > 1. Regarding whether to block this KIP on the completion of any or
> > all
> > >> > > future implementations of in-memory version stores (or persist
> > >> suppression
> > >> > > buffers), I feel that would be unfair to this feature which is
> > >> completely
> > >> > > unrelated to the semantic improvements offered by versioned state
> > >> stores.
> > >> > > It seems like the responsibility of those driving the versioned
> > state
> > >> > > stores feature, not Almog/this KIP, to make sure that those bases
> > are
> > >> > > covered. Further, if anything, this KIP will help with the massive
> > >> > > proliferation of StoreSuppliers on the Stores factory class, and
> > >> provide
> > >> > > users with a much easier way to leverage the versioned stores
> > without
> > >> > > having to muck around directly with the StoreSuppliers.
> > >> > >
> > >> > > I also thought about it a bit, and really like Almog's suggestion
> to
> > >> > > introduce an additional StoreSpec for the Versioned state stores.
> > >> Obviously
> > >> > > we can add the RocksDB one to this KIP now, and then as he
> > mentioned,
> > >> > > there's an easy way to get users onto the IMVersionedStateStore
> > types
> > >> once
> > >> > > they are completed.
> > >> > >
> > >> > > Lastly, on this note, I want to point out how smoothly this solved
> > the
> > >> > > issue of timestamped stores, which are intended to be the DSL
> > default
> > >> but
> > >> > > are only a special case for RocksDB. Right now it can be confusing
> > >> for a
> > >> > > user scrolling through the endless Stores class and seeing a
> > >> timestamped
> > >> > > version of the persistent but not in-memory stores. One could
> easily
> > >> assume
> > >> > > there was no timestamped option for IM stores and that this
> feature
> > >> was
> > >> > > incomplete, if they weren't acutely aware of the internal
> > >> implementation
> > >> > > details (namely that it's only required for RocksDB for
> > compatibility
> > >> > > reasons). However, with this KIP, all that is handled completely
> > >> > > transparently to the user, and we the devs, who *are* aware of the
> > >> internal
> > >> > > implementation details, are now appropriately the ones responsible
> > for
> > >> > > handing the correct store type to a particular operator. While
> > >> versioned
> > >> > > state stores may not be completely comparable, depending on
> whether
> > >> we want
> > >> > > users to remain able to easily choose between using them or not
> (vs
> > >> > > timestamped which should be used by all), I still feel this KIP
> is a
> > >> great
> > >> > > step in the right direction that not only should not be blocked on
> > the
> > >> > > completion of the IM implementations, but in fact should
> > specifically
> > >> be
> > >> > > done first as it enables an easier way to utilize those IM
> versioned
> > >> > > stores. Just my 2 cents :)
> > >> > >
> > >> > > 2. The idea to expand the existing the config with a CUSTOM enum
> > >> without
> > >> > > introducing another config to specify the CUSTOM store spec does
> not
> > >> seem
> > >> > > appropriate, or  even possible (for the reasons Almog mentioned
> > above
> > >> about
> > >> > > config types, though perhaps there is a way I'm not seeing). I do
> > not
> > >> buy
> > >> > > the argument that we should optimize the API to make it easy for
> > >> users who
> > >> > > just want to switch to all in-memory stores, as I truly believe
> this
> > >> is a
> > >> > > very small fraction of the potential userbase of this feature
> > (anyone
> > >> who's
> > >> > > actually using this should please chime in!). It seems very likely
> > >> that the
> > >> > > majority of users of this feature are actually those with custom
> > state
> > >> > > stores, as to my knowledge, this has been the case any/every time
> > this
> > >> > > feature was requested by users.
> > >> > >
> > >> > > That said, while I don't see any way to get around introducing a
> new
> > >> > > config, I don't personally have a preference w.r.t whether to keep
> > >> around
> > >> > > the old config or deprecate it. I simply don't get the impression
> it
> > >> is
> > >> > > very heavily used as it stands today, while it only works for
> those
> > >> who
> > >> > > want all in-memory stores. Again, input from actual users would be
> > >> very
> > >> > > valuable here. In the absence of that data, I will just point to
> the
> > >> fact
> > >> > > that this KIP was proposed by a Streams dev (you :P), abandoned,
> > >> picked up
> > >> > > by another Streams dev, and finally implemented without ever
> hearing
> > >> from a
> > >> > > user that they would find this useful. This is not to disparage
> the
> > >> > > original KIP, but just to say again, as I stated back then, it
> > seemed
> > >> like
> > >> > > a major opportunity loss to leave out custom state stores
> > >> > >
> > >> > > Cheers,
> > >> > > Sophie
> > >> > >
> > >> > > On Fri, Jul 21, 2023 at 1:52 PM Almog Gavra <
> almog.ga...@gmail.com>
> > >> wrote:
> > >> > >
> > >> > >> Thanks for all the feedback folk! Responses inline.
> > >> > >>
> > >> > >>> Basically, I'm suggesting two things: first, call out in some
> way
> > >> > >> (perhaps the StoreTypeSpec javadocs) that each StoreTypeSpec is
> > >> considered
> > >> > >> a public contract in itself and should outline any semantic
> > >> guarantees it
> > >> > >> does, or does not, make. Second, we should add a note on ordering
> > >> > >> guarantees in the two OOTB specs: for RocksDB we assert that
> range
> > >> queries
> > >> > >> will honor serialized byte ordering, whereas the InMemory flavor
> > >> gives no
> > >> > >> ordering guarantee whatsoever at this time.
> > >> > >>
> > >> > >> That makes sense to me Sophie! I'll make the changes to the KIP.
> > And
> > >> @Colt,
> > >> > >> yes I believe that would be the new javadoc for the generic
> > >> > >> ReadOnlyKeyValueStore.
> > >> > >>
> > >> > >>> However, I am wondering if we should close others gaps first?
> > >> > >>
> > >> > >> @Matthias, thanks for the review and thoughts! I think we should
> > >> separate
> > >> > >> closing other gaps in the product from providing this as useful
> > >> > >> functionality to avoid feature creep so long as the API proposed
> > >> here will
> > >> > >> be suitable for when we want to close those implementation gaps!
> My
> > >> general
> > >> > >> proposal is that for things that are not customizable today by
> > >> > >> default.dsl.store they remain not customizable after this KIP.
> The
> > >> good
> > >> > >> news is, however, that there's no reason why this cannot be
> > extended
> > >> to
> > >> > >> cover those in the future if we want to - see specifics below.
> > >> > >>
> > >> > >> Comments on the specifics below
> > >> > >>
> > >> > >>> In particular, this holds for the new versioned-store ... Should
> > >> > >> versioned stores also be covered by the KIP
> > >> > >>
> > >> > >> Is there a reason why we can't introduce a
> > >> VersionedRocksDBStoreTypeSpec
> > >> > >> and if we ever support an in-memory an equivalent
> > >> > >> VersionedInMemoryRocksDBStoreTypeSpec? If so, then there would
> not
> > >> need to
> > >> > >> be any additional changes to the API proposed in this KIP.
> > >> > >>
> > >> > >>> For `suppress()` it's actually other way around we only have an
> > >> in-memory
> > >> > >> implementation. Do you aim to allow custom stores for
> `suppress()`,
> > >> too?
> > >> > >>
> > >> > >> We have three options here:
> > >> > >> 1) we can decide to maintain existing behavior and use the
> > in-memory
> > >> > >> implementation for all stores (not even going through the API at
> > all)
> > >> > >> 2a) we can introduce a new parameter to the KeyValueParams class
> > >> (boolean
> > >> > >> isTimeOrderedBuffer or something like that) and return an
> in-memory
> > >> store
> > >> > >> in the implementation of RocksDBStoreTypeSpec (this maintains the
> > >> existing
> > >> > >> behavior, and would allow us in the future to make the change to
> > >> return a
> > >> > >> RocksDB store if we ever provide one)
> > >> > >> 2b) same as 2a but we throw an exception if the requested store
> > type
> > >> does
> > >> > >> not support that (this is backwards incompatible, and since
> > ROCKS_DB
> > >> is the
> > >> > >> default we probably shouldn't do this)
> > >> > >>
> > >> > >> My proposal for now is 1) because as of KIP-825
> > >> > >> EmitStrategy#ON_WINDOW_CLOSE is the preferred way of suppressing
> > and
> > >> that
> > >> > >> is accounted for in this API already.
> > >> > >>
> > >> > >>> Last, I am not sure if the new parameter replacing the existing
> > one
> > >> is
> > >> > >> the
> > >> > >> best way to go?
> > >> > >>
> > >> > >> I'm happy either way, just let me know which you prefer - the
> > >> discussion
> > >> > >> around CUSTOM is in the rejected alternatives but I'm happy to
> > >> differ to
> > >> > >> whatever the project conventions are :)
> > >> > >>
> > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just
> process
> > >> it as
> > >> > >> we
> > >> > >> do know, and if know we assume it's a fully qualified class name
> > and
> > >> try to
> > >> > >> instantiate it?
> > >> > >>
> > >> > >> Note that there is no functionality for this kind of thing in
> > >> > >> AbstractConfig (it's either a String validated enum or a class)
> so
> > >> this
> > >> > >> would be a departure from convention. Again, I'm happy to
> implement
> > >> that if
> > >> > >> it's preferred.
> > >> > >>
> > >> > >>> Also wondering how it would related to the existing `Stores`
> > >> factory?
> > >> > >>
> > >> > >> StoreTypeSpec will depend on Stores factory - they're one layer
> > >> removed.
> > >> > >> You can imagine that StoreTypeSpec is just a grouping of methods
> > >> from the
> > >> > >> Stores factory into a convenient package for default
> configuration
> > >> > >> purposes.
> > >> > >>
> > >> > >> Thanks again for all the detailed thoughts Matthias!
> > >> > >>
> > >> > >> On Fri, Jul 21, 2023 at 11:50 AM Matthias J. Sax <
> mj...@apache.org
> > >
> > >> wrote:
> > >> > >>
> > >> > >>> Thanks for the KIP. Overall I like the idea to close this gap.
> > >> > >>>
> > >> > >>> However, I am wondering if we should close others gaps first? In
> > >> > >>> particular, IIRC, we have a few cases for which we only have a
> > >> RocksDB
> > >> > >>> implementation for a store, and thus, adding an in-memory
> version
> > >> for
> > >> > >>> these stores first, to make the current `IN_MEMORY` parameter
> > work,
> > >> > >>> might be the first step?
> > >> > >>>
> > >> > >>> In particular, this holds for the new versioned-store (but I
> > >> actually
> > >> > >>> believe the is some other internal store with no in-memory
> > >> > >>> implementation). -- For `suppress()` it's actually other way
> > around
> > >> we
> > >> > >>> we only have an in-memory implementation. Do you aim to allow
> > custom
> > >> > >>> stores for `suppress()`, too?
> > >> > >>>
> > >> > >>> Btw: Should versioned stores also be covered by the KIP (ie,
> > >> > >>> `StoreTypeSpec`)? We did consider to add a new option
> `VERSIONED`
> > >> to the
> > >> > >>> existing `default.dsl.store` config, but opted out for various
> > >> reasons.
> > >> > >>>
> > >> > >>> Last, I am not sure if the new parameter replacing the existing
> > one
> > >> is
> > >> > >>> the best way to go? Did you put the idea to add `CUSTOM` to the
> > >> existing
> > >> > >>> config into rejected alternative. Personally, I would prefer to
> > add
> > >> > >>> `CUSTOM` as I would like to optimize to easy of use for the
> > >> majority of
> > >> > >>> users (which don't implement a custom store), but only switch to
> > >> > >>> in-memory sometimes. -- As an alternative, you would also just
> > >> extend
> > >> > >>> `dsl.default.store` (it's just a String) and allow to pass in
> > >> anything.
> > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just
> process
> > >> it as
> > >> > >>> we do know, and if know we assume it's a fully qualified class
> > name
> > >> and
> > >> > >>> try to instantiate it? -- Given that we plan to keep the
> > >> store-enum, is
> > >> > >>> seems cleaner to keep the existing config and keep both the
> config
> > >> and
> > >> > >>> enum aligned to each other?
> > >> > >>>
> > >> > >>>
> > >> > >>> It's just preliminary thought. I will need to go back an take a
> > more
> > >> > >>> detailed look into the code to grok how the propose
> > `StoreTypeSpec`
> > >> > >>> falls into place. Also wondering how it would related to the
> > >> existing
> > >> > >>> `Stores` factory?
> > >> > >>>
> > >> > >>> -Matthias
> > >> > >>>
> > >> > >>>
> > >> > >>> On 7/21/23 6:45 AM, Colt McNealy wrote:
> > >> > >>>> Sophie—
> > >> > >>>>
> > >> > >>>> Thanks for chiming in here. +1 to the idea of specifying the
> > >> ordering
> > >> > >>>> guarantees that we make in the StorageTypeSpec javadocs.
> > >> > >>>>
> > >> > >>>> Quick question then. Is the javadoc that says:
> > >> > >>>>
> > >> > >>>>> Order is not guaranteed as bytes lexicographical ordering
> might
> > >> not
> > >> > >>>> represent key order.
> > >> > >>>>
> > >> > >>>> no longer correct, and should say:
> > >> > >>>>
> > >> > >>>>> Order guarantees depend on the underlying implementation of
> the
> > >> > >>>> ReadOnlyKeyValueStore. For more information, please consult the
> > >> > >>>> [StorageTypeSpec javadocs](....)
> > >> > >>>>
> > >> > >>>> Thanks,
> > >> > >>>> Colt McNealy
> > >> > >>>>
> > >> > >>>> *Founder, LittleHorse.dev*
> > >> > >>>>
> > >> > >>>>
> > >> > >>>> On Thu, Jul 20, 2023 at 9:28 PM Sophie Blee-Goldman <
> > >> > >>> ableegold...@gmail.com>
> > >> > >>>> wrote:
> > >> > >>>>
> > >> > >>>>> Hey Almog, first off, thanks for the KIP! I (and others)
> raised
> > >> > >> concerns
> > >> > >>>>> over how restrictive the default.dsl.store config would be if
> > not
> > >> > >>>>> extendable to custom store types, especially given that this
> > >> seems to
> > >> > >> be
> > >> > >>>>> the primary userbase of such a feature. At the time we didn't
> > >> really
> > >> > >>> have
> > >> > >>>>> any better ideas for a clean way to achieve that, but what you
> > >> > >> proposed
> > >> > >>>>> makes a lot of sense to me. Happy to see a good solution to
> > this,
> > >> and
> > >> > >>>>> hopefully others will share my satisfaction :P
> > >> > >>>>>
> > >> > >>>>> I did have one quick piece of feedback which arose from an
> > >> unrelated
> > >> > >>>>> question posed to the dev mailing list w/ subject line
> > >> > >>>>> "ReadOnlyKeyValueStore#range()
> > >> > >>>>> Semantics"
> > >> > >>>>> <
> > https://lists.apache.org/thread/jbckmth8d3mcgg0rd670cpvsgwzqlwrm>.
> > >> I
> > >> > >>>>> recommend checking out the full thread for context, but it
> made
> > me
> > >> > >> think
> > >> > >>>>> about how we can leverage the new StoreTypeSpec concept as an
> > >> answer
> > >> > >> to
> > >> > >>> the
> > >> > >>>>> long-standing question in Streams: where can we put guarantees
> > of
> > >> the
> > >> > >>>>> public contract for RocksDB (or other store implementations)
> > when
> > >> all
> > >> > >>> the
> > >> > >>>>> RocksDB stuff is technically internal.
> > >> > >>>>>
> > >> > >>>>> Basically, I'm suggesting two things: first, call out in some
> > way
> > >> > >>> (perhaps
> > >> > >>>>> the StoreTypeSpec javadocs) that each StoreTypeSpec is
> > considered
> > >> a
> > >> > >>> public
> > >> > >>>>> contract in itself and should outline any semantic guarantees
> it
> > >> does,
> > >> > >>> or
> > >> > >>>>> does not, make. Second, we should add a note on ordering
> > >> guarantees in
> > >> > >>> the
> > >> > >>>>> two OOTB specs: for RocksDB we assert that range queries will
> > >> honor
> > >> > >>>>> serialized byte ordering, whereas the InMemory flavor gives no
> > >> > >> ordering
> > >> > >>>>> guarantee whatsoever at this time.
> > >> > >>>>>
> > >> > >>>>> Thoughts?
> > >> > >>>>>
> > >> > >>>>> -Sophie
> > >> > >>>>>
> > >> > >>>>> On Thu, Jul 20, 2023 at 4:28 PM Almog Gavra <
> > >> almog.ga...@gmail.com>
> > >> > >>> wrote:
> > >> > >>>>>
> > >> > >>>>>> Hi All,
> > >> > >>>>>>
> > >> > >>>>>> I would like to propose a KIP to expand support for default
> > store
> > >> > >> types
> > >> > >>>>>> (KIP-591) to encompass custom store implementations:
> > >> > >>>>>>
> > >> > >>>>>>
> > >> > >>>>>
> > >> > >>>
> > >> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types
> > >> > >>>>>>
> > >> > >>>>>> Looking forward to your feedback!
> > >> > >>>>>>
> > >> > >>>>>> Cheers,
> > >> > >>>>>> Almog
> > >> > >>>>>>
> > >> > >>>>>
> > >> > >>>>
> > >> > >>>
> > >> > >>
> > >> > >
> > >>
> > >
> >
>

Reply via email to