Glad you like my KIP-secretary skills ;) A2. I'm definitely happy to take your suggestion here and not do anything special w.r.t. Versioned stores, I think it makes sense especially if we consider them implementation details of a specific store type.
At EOD I'll update the KIP with all of these changes and if the discussion is silent I'll open a vote tomorrow morning. Cheers, Almog On Mon, Jul 24, 2023 at 2:02 PM Sophie Blee-Goldman <ableegold...@gmail.com> wrote: > Awesome summary (seriously) -- would you kindly offer your organizational > skills to every ongoing KIP from henceforth? We need you :P > > A few answers/comments: > > A2: I think there is a 3rd sub-option here, which is to leave > versioned-ness out of this KIP entirely, return only the non-versioned > stores for now, and then switch over to the versioned stores (only) when > the time comes to flip the switch on making them the default across the > DSL. This has the advantage of retaining the current behavior/semantics and > provides a clear way to transition smoothly in the future, since it seems > we will want to cut to all versioned state stores rather than offer users a > choice between versioned or non-versioned stores going forward (similar to > how we only offer timestamped stores presently, and have completely > replaced non-timestamped stores in the DSL.) . In both the timestamped and > versioned cases, the old stores are/will still be available or accessible > to users via the bare StoreSuppliers, should they somehow desire or require > the old store type. Ultimately, I think either this or option (1) would be > preferable, though I think it should be up to Matthias or anyone else > involved in the versioned stores feature to decide which approach makes > sense in the context of that feature's future plans. > > A3: sounds reasonable to me > > A5: Also sounds fine to me, though I'll let others chime in with/if they > have an alternative suggestion/preference. I guess the other contender > would be something like DSLStoreImpl or something like that? > > > > On Mon, Jul 24, 2023 at 9:36 AM Almog Gavra <almog.ga...@gmail.com> wrote: > > > Lots of thoughts! Happy to see the thriving discussion on this post - > lots > > going on so I'm trying to enumerate them to keep things organized (prefix > > "A" for "Almog" so we can use numbers in responses for other things ;P). > > > > A1. Question around closing implementation gaps (e.g. no rocks based > > suppression store) > > A2. Specifically how to handle Versioned stores > > A3. Configuration (new config/reuse old one + new one and ordering of > > config resolution) > > A4. Drawing a line between what is implementation detail (not exposed in > > API) and what is customizable (exposed in API) > > A5. Naming of StoreTypeSpec > > A6. Param classes in StoreBuilders > > > > ------------------------------ > > > > Here are summaries for where it seems each of these stands (trying not to > > add any additional opinion yet): > > > > A1. Sophie/Guozhang/Me (if I count hah!) seem to agree that it is worth > > pushing this KIP through independently of the implementation gaps as it > > doesn't seem to move the intermediate state further from the end state. > > Matthias originally had some concerns. > > > > A2. There's questions around whether versioned stores should be their own > > configurable option or whether they are an implementation detail that the > > StoreSpec should decide. It seems like the discussion is converging here, > > this should be an implementation detail. > > > > A3. Matthias/Guozhang prefer adding CUSTOM and then having an additional > > config to determine the implementation. Sophie prefers deprecating the > old > > config. Guozhang additional suggested flipping the resolution order such > > that the old config is only respected in a DefaultStoreSpec > implementation. > > > > A4. This KIP (or rather, the discussion on the KIP) blurs the lines > between > > top level store types (KV, windowed, session) and the underlying > > implementation of them (timestamped, versioned, kv-list). It seems > everyone > > is in alignment to ensure that we keep these two things separate and that > > the line is clearly delineated in the text of the KIP. > > > > A5. Guozhang and Sophie agree that the current name StoreTypeSpec is > > misleading, as it's really an implementation spec, not a type > > specification. > > > > A6. Agreement that this is an improvement, Sophie believes this can be > done > > in a follow up but we should ensure our naming is good here so there's no > > conflicts down the line. > > > > --------------------------------- > > > > Ok, phew! Hopefully that covers it all! Now for my thoughts, hopefully > > wrapping up some of these discussions: > > > > A1. @Matthias - are you still hesitant here? What would you need to be > > convinced here? > > > > A2. Since we are all in agreement that versioned stores should be an > > implementation detail, we have two options: > > > > (1) we can extend the KVParams to include a parameter that indicates > > whether or not the store should be versioned > > (2) we can introduce a configuration for whether or not to use a > versioned > > store, and each implementation can choose to read/ignore that config > > > > Any preferences? (1) would align more closely with what we are doing > today > > (they are a top-level concept in the Stores API). > > > > A3. I like Guozhang's suggestion of making the "primary" configuration to > > be the new one, and then having a "DefaultStoreTypeSpec" (using the old > > naming) which respects the old configuration. That seems to solve nearly > > all the concerns (e.g. it'd be easy to see where the enum is used because > > it would only be used within that one class instead of littered > throughout > > the code base). > > > > @Sophie, unless you have objections here I will update the KIP to do > that. > > > > A4. I will make these changes to the KIP to make it clear. > > > > A5. I will rename it `DslStorePlugin` - any objections to this name? > > > > A6. Let's punt this ;) I agree with everyone that this would be a welcome > > improvement and that this KIP is aligned with moving in that direction. > > Given how much discussion there was on this KIP, which is minor relative > to > > making the changes to StoreBuilder API, I'd rather not tie the two > > together. > > > > Cheers & Thanks everyone for the thoughts! > > - Almog > > > > On Sun, Jul 23, 2023 at 5:15 PM Sophie Blee-Goldman < > > ableegold...@gmail.com> > > wrote: > > > > > Guozhang: > > > > > > On your 2nd point: > > > > > > > "impl types" (in hindsight it may not be a good name) for rocksdb / > > > memory / custom, and we used "store types" for kv / windowed / > sessioned > > > etc, > > > First off, thanks so much for this clarification -- using "store type" > > here > > > was definitely making me uncomfortable as this usually refers to KV vs > > > window, etc -- but I just couldn't for the life of me think of the > right > > > term for rocks vs IM. We should 100% change to something like > > StoreImplSpec > > > for this kind of interface. > > > > > > > As for list-value store (for stream-stream Join) > > > Again, glad you mentioned this -- I forgot how the extra stream-stream > > join > > > store is not a "regular" KV Store but rather this special list-value > > store. > > > If we proceed with something like the current approach, perhaps that > > should > > > be a boolean (or enum) parameter in the KVConfig, similar to the > > > EmitStrategy? After all, the high-level goal of this KIP is to be able > to > > > fully customize all DSL state stores, and this is currently not > possible > > > due to KAFKA-14976 <https://issues.apache.org/jira/browse/KAFKA-14976 > >. > > > > > > If we expect there to be further customizations like this going > forward, > > > perhaps we could instead have each of the three StoreConfig classes > > accept > > > a single enum parameter for the "sub-type" (or whatever you want to > call > > > it), which would encompass (and replace) things like the EmitStrategy > as > > > well as the list-value type (we could define one enum for each Config > > class > > > so there is no accidentally requesting a LIST_VALUE subtype on a > > > WindowStore). Thoughts? > > > > > > Lastly, regarding 3.b: > > > > > > I love that you brought this up because that is actually what I first > > > proposed to Almog, ie introducing a param class to clean up the > > > StoreBuilder API, during our chat that led to this KIP. He pushed back, > > > claiming (rightly so) that this change would be huge in scope for a > > purely > > > aesthetic/API change that doesn't add any functionality, and that it > > makes > > > more sense to start with the DSL config since there is a clear gap in > > > functionality there, particularly when it comes to custom state stores > > > (reasons 1 & 3 in the Motivation section). He did agree that the param > > > classes were a better API, which is why you see them in this KIP. In > > other > > > words: I fully agree that we should apply this improvement to the > > > PAPI/StoreBuilder interface as well, but I think that's a distinct > > concept > > > for the time-being and should not block the DSL improvement. Rather, I > > > consider this KIP as setting the stage for a followup KIP down the line > > to > > > clean up the StoreBuilders and bring them in line with the new > > param/config > > > class approach. > > > > > > That said, A) we should definitely make sure whatever we introduce > here > > > can be extended to the PAPI StoreBuilders in a natural way, and B) I > > should > > > clarify that I personally would be happy to see this included in the > > > current KIP, but as Almog's KIP it's up to him to decide whether he's > > > comfortable expanding the scope like this. If you can convince him > where > > I > > > could not, more power to you! :P > > > > > > On Sun, Jul 23, 2023 at 4:48 PM Sophie Blee-Goldman < > > > ableegold...@gmail.com> > > > wrote: > > > > > > > Matthias: > > > > > > > > I'm not sure I agree with (or maybe don't follow) this take: > > > >> > > > >> we need all kind of `StoreTypeSpec` implementations, > > > >> and it might also imply that we need follow up KIPs for new feature > > > >> (like in-memory versioned store) that might not need a KIP > otherwise. > > > >> > > > > I see this feature as being a nice add-on/convenience API for any > store > > > > types which have a full DSL implementation. I don't think it's > > > unreasonable > > > > to just say that this feature is only going to be available for store > > > types > > > > that have KV, Window, and Session implementations. I can't think of > any > > > > case besides versioned stores where this would force a KIP for a new > > > > feature that would not otherwise have to go through a KIP, and even > for > > > > versioned state stores, the only issue is that the KIP for that was > > > already > > > > accepted. > > > > > > > > However, I think I agree on your main point -- that things like > > "regular" > > > > vs timestamped vs versioned are/should be an implementation detail > > that's > > > > hidden from the user. As I noted previously, the current KIP actually > > > > greatly improves the situation for timestamped stores, as this would > be > > > > handled completely transparently by the OOTB RocksDBStoreSpec. To me, > > > this > > > > provides a very natural way to let the DSL operators using the > default > > > > store type/spec to specify which kind of store (eg > > > > versioned/timestamped/etc) it wants, and choose the correct default. > If > > > the > > > > eventual intention is to have versioned state stores replace > > timestamped > > > > stores as the default in the DSL, then we can simply swap out the > > > versioned > > > > stores for the timestamped stores in the RocksDBStoreTypeSpec, when > > that > > > > time comes. Until then, users who want to use the versioned store > will > > > have > > > > to do what they do today, which is individually override operators > via > > > > Materialized/StoreSuppliers. > > > > > > > > All in all, it sounds like we should not offer a versioned store type > > > > spec, as "versioned" is more akin to "timestamped" than to a true > > > > difference in underlying store implementation type (eg rocks vs > > > in-memory). > > > > W.r.t whether to deprecate the old config or introduce a new CUSTOM > > enum > > > > type, either seems fine to me, and we can go with that alternative > > > instead. > > > > The only other con to this approach that I can think of, and I'm > > honestly > > > > not sure if this is something users would care about or only devs, is > > > that > > > > the advantage to moving rocks and IM to the store type spec interface > > is > > > > that it helps to keep the relevant logic encapsulated in one easy > place > > > you > > > > can quickly check to tell what kind of state store is used where. In > > the > > > > current code, I found it extremely annoying and difficult to track > down > > > all > > > > usages of the StoreType enum to see which actual rocksdb store was > > being > > > > used where (for example some stores using the TimeOrderedBuffer > > variants > > > in > > > > some special cases, or to understand whether the DSL was defaulting > to > > > > plain, timestamped, or versioned stores for RocksDB vs InMemory -- > both > > > of > > > > which seem like they could be of interest to a user). This would be > > much > > > > easier if everything was handled in one place, and you can just go to > > the > > > > (eg) RocksDBStoreTypeSpec and see what it's doing, or find usages of > > the > > > > methods to understand what stores are being handed to which DSL > > > operators. > > > > > > > > I suppose we could still clean up the API and solve this problem by > > > having > > > > the old (and new) config delegate to a StoreTypeSpec no matter what, > > but > > > > make RocksDBStoreTypeSpec and InMemoryStoreTypeSpec internal classes > > that > > > > are simply implementation details of the ROCKSDB vs IN_MEMORY enums. > > > WDYT? > > > > > > > > > > > > On Sun, Jul 23, 2023 at 11:14 AM Guozhang Wang < > > > guozhang.wang...@gmail.com> > > > > wrote: > > > > > > > >> Thanks everyone for the great discussions so far! I first saw the > JIRA > > > >> and left some quick thoughts without being aware of the > > > >> already-written KIP (kudos to Almog, very great one) and the DISCUSS > > > >> thread here. And I happily find some of my initial thoughts align > with > > > >> the KIP already :) > > > >> > > > >> Would like to add a bit more of my 2c after reading through the KIP > > > >> and the thread here: > > > >> > > > >> 1. On the high level, I'm in favor of pushing this KIP through > without > > > >> waiting on the other gaps to be closed. In my back pocket's > > > >> "dependency graph" of Kafka Streams roadmap of large changes or > > > >> feature gaps, the edges of dependencies are defined based on my > > > >> understanding of whether doing one first would largely complicate / > > > >> negate the effort of the other but not vice versa, in which case we > > > >> should consider getting the other done first. In this case, I feel > > > >> such a dependency is not strong enough, so encouraging the KIP > > > >> contributor to finish what he/she would love to do to close some > gaps > > > >> early would be higher priorities. I did not see by just doing this > we > > > >> could end up in a worse intermediate stage yet, but I could be > > > >> corrected. > > > >> > > > >> 2. Regarding the store types --- gain here I'd like to just clarify > > > >> the terms a bit since in the past it has some confusions: we used > > > >> "impl types" (in hindsight it may not be a good name) for rocksdb / > > > >> memory / custom, and we used "store types" for kv / windowed / > > > >> sessioned etc, as I said in the JIRA I think the current proposal > also > > > >> have a good side effect as quality bar to really enforce us think > > > >> twice when trying to add more store types in the future as it will > > > >> impact API instantiations. In the ideal world, I would consider: > > > >> > > > >> * We have (timestamped) kv store, versioned kv store, window store, > > > >> session store as first-class DSL store types. Some DSL operators > could > > > >> accept multiple store types (e.g. versioned and non versioned > > > >> kv-store) for semantics / efficiency trade-offs. But I think we > would > > > >> remove un-timestamped kv stores eventually since that efficiency > > > >> trade-off is so minimal compared to its usage limitations. > > > >> * As for list-value store (for stream-stream Join), memory-lru-cache > > > >> (for PAPI use only), memory-time-ordered-buffer (for suppression), > > > >> they would not be exposed as DSL first-class store types in the > > > >> future. Instead, they would be treated as internal used stores (e.g. > > > >> list-value store is built on key-value store with specialized serde > > > >> and putInternal), or continue to be just convenient OOTB PAPI used > > > >> stores only. > > > >> * As we move on, we will continue to be very, very strict on what > > > >> would be added as DSL store types (and hence requires changes to the > > > >> proposed APIs), what to be added as convenient OOTB PAPI store impls > > > >> only, what to be added as internal used store types that should not > be > > > >> exposed to users nor customizable at all. > > > >> > > > >> 3. Some more detailed thoughts below: > > > >> > > > >> 3.a) I originally also think that we can extend the existing config, > > > >> rather than replacing it. The difference was that I was thinking > that > > > >> order-wise, the runtime would look at the API first, and then the > > > >> config, whereas in your rejected alternative it was looking at the > > > >> config first, and then the API --- that I think is a minor thing and > > > >> either is fine. I'm in agreement that having two configs would be > more > > > >> confusing to users to learn about their precedence rather than > > > >> helpful, but if we keep both a config and a public API, then the > > > >> precedence ordering would not be so confusing as long as we state > them > > > >> clearly. For example: > > > >> > > > >> * We have DefaultStoreTypeSpec OOTB, in that impl we look at the > > > >> config only, and would only expect either ROCKS or MEMORY, and > return > > > >> corresponding OOTB store impls; if any other values configured, we > > > >> error out. > > > >> * Users extend that by having MyStoreTypeSpec, in which they could > do > > > >> arbituray things without respecting the config at all, but our > > > >> recommended pattern in docs would still say "look into the config, > if > > > >> it is ROCKS or MEMORY just return fall back to DefaultStoreTypeSepc; > > > >> otherwise if it's some String you recognize, then return your > > > >> customized store based on the string value, otherwise error out". > > > >> > > > >> 3.b) About the struct-like Params classes, I like the idea a lot and > > > >> wished we would pursue this in the first place, but if we only do > this > > > >> in Spec it would leave some inconsistencies with the StoreBuilders > > > >> though arguably the latter is only for PAPI. I'm wondering if we > > > >> should consider including the changes in StoreBuilders (e.g. > > > >> WindowStoreBuilder(WindowSupplierParams)), and if yes, maybe we > should > > > >> also consider renaming that e.g. `WindowSupplierParams` to > > > >> `WindowStoreSpecParams` too? For this one I only have a "weak > feeling" > > > >> so I can be convinced otherwise :P > > > >> > > > >> Thanks, > > > >> Guozhang > > > >> > > > >> > > > >> > > > >> On Sun, Jul 23, 2023 at 9:52 AM Matthias J. Sax <mj...@apache.org> > > > wrote: > > > >> > > > > >> > Thanks for all the input. My intention was not to block the KIP, > but > > > >> > just to take a step back and try get a holistic picture and > > > discussion, > > > >> > to explore if there are good/viable alternative designs. As said > > > >> > originally, I really like to close this gap, and was always aware > > that > > > >> > the current config is not flexible enough. > > > >> > > > > >> > > > > >> > I guess, my "concern" is that the KIP does increase the API > surface > > > area > > > >> > significantly, as we need all kind of `StoreTypeSpec` > > implementations, > > > >> > and it might also imply that we need follow up KIPs for new > feature > > > >> > (like in-memory versioned store) that might not need a KIP > > otherwise. > > > >> > > > > >> > The second question is if it might make the already "patchy" > > situation > > > >> > with regard to customization worse. > > > >> > > > > >> > We did de-scope the original KIP-591 for this reason, and given > the > > > new > > > >> > situation of the DSL, it seems that it actually got worse compared > > to > > > >> > back in the days. > > > >> > > > > >> > Lastly, I hope to make the new versioned stores the default in the > > DSL > > > >> > and we did not do it in the previous KIP due to backward > > compatibility > > > >> > issues. Thus, from a DSL point of view, I believe there should be > > only > > > >> > "RocksDB", "InMemory", and "Custom" in an ideal world. Introducing > > (I > > > am > > > >> > exaggerating to highlight my point) "KvRocksDbSpec", > > > >> > "TimestampeKvRocksDbSpec", "VersionedRocksDbSpec", plus the > > > >> > corresponding in-memory specs seems to head into the opposite > > > direction. > > > >> > -- My goal is to give users a handle of the _physical_ store > > (RocksDB > > > vs > > > >> > InMemory vs Custom) but not the _logical_ stores (plain kv, ts-kv, > > > >> > versioned) which is "dictated" by the DSL itself and should not be > > > >> > customizable (we are just in a weird intermediate situation that > we > > > need > > > >> > to clean up, but not "lean into" IMHO). > > > >> > > > > >> > Thus, I am also not sure if adding "VersionedRocksDbSpec" would be > > > ideal > > > >> > (also, given that it only changes a single store, but not the two > > > >> > windowed stores)? > > > >> > > > > >> > Furthermore, I actually hope that we could use the new versioned > > store > > > >> > to replace the window- and sessions- stores, and thus to decrease > > the > > > >> > number of required store types. > > > >> > > > > >> > > > > >> > Admittedly, I am talking a lot about a potential future, but the > > goal > > > is > > > >> > only to explore opportunities to not get into "worse" intermediate > > > >> > state, that will require a huge deprecation surface area later on. > > Of > > > >> > course, if there is no better way, and my concerns are not > shared, I > > > am > > > >> > ok to move forward with the KIP. > > > >> > > > > >> > > > > >> > Bottom line: I would personally prefer to keep the current config > > and > > > >> > add a `Custom` option to it, plus adding one new config that > allows > > > >> > people to set their custom `StoreTypeSpec` class. -- I would not > > add a > > > >> > built-in spec for versioned stores at this point (or any other > > > built-in > > > >> > `StorytypeSpec` implementations). I guess, users could write a > > custom > > > >> > spec if they want to enable versioned store across the board for > now > > > >> > (until we make them the default anyway)? > > > >> > > > > >> > > > > >> > Hope my train of thoughts is halfway reasonable and not totally > off > > > >> track? > > > >> > > > > >> > > > > >> > -Matthias > > > >> > > > > >> > On 7/21/23 15:27, Sophie Blee-Goldman wrote: > > > >> > > I agree with everything Almog said above, and will just add on > to > > > two > > > >> > > points: > > > >> > > > > > >> > > 1. Regarding whether to block this KIP on the completion of any > or > > > all > > > >> > > future implementations of in-memory version stores (or persist > > > >> suppression > > > >> > > buffers), I feel that would be unfair to this feature which is > > > >> completely > > > >> > > unrelated to the semantic improvements offered by versioned > state > > > >> stores. > > > >> > > It seems like the responsibility of those driving the versioned > > > state > > > >> > > stores feature, not Almog/this KIP, to make sure that those > bases > > > are > > > >> > > covered. Further, if anything, this KIP will help with the > massive > > > >> > > proliferation of StoreSuppliers on the Stores factory class, and > > > >> provide > > > >> > > users with a much easier way to leverage the versioned stores > > > without > > > >> > > having to muck around directly with the StoreSuppliers. > > > >> > > > > > >> > > I also thought about it a bit, and really like Almog's > suggestion > > to > > > >> > > introduce an additional StoreSpec for the Versioned state > stores. > > > >> Obviously > > > >> > > we can add the RocksDB one to this KIP now, and then as he > > > mentioned, > > > >> > > there's an easy way to get users onto the IMVersionedStateStore > > > types > > > >> once > > > >> > > they are completed. > > > >> > > > > > >> > > Lastly, on this note, I want to point out how smoothly this > solved > > > the > > > >> > > issue of timestamped stores, which are intended to be the DSL > > > default > > > >> but > > > >> > > are only a special case for RocksDB. Right now it can be > confusing > > > >> for a > > > >> > > user scrolling through the endless Stores class and seeing a > > > >> timestamped > > > >> > > version of the persistent but not in-memory stores. One could > > easily > > > >> assume > > > >> > > there was no timestamped option for IM stores and that this > > feature > > > >> was > > > >> > > incomplete, if they weren't acutely aware of the internal > > > >> implementation > > > >> > > details (namely that it's only required for RocksDB for > > > compatibility > > > >> > > reasons). However, with this KIP, all that is handled completely > > > >> > > transparently to the user, and we the devs, who *are* aware of > the > > > >> internal > > > >> > > implementation details, are now appropriately the ones > responsible > > > for > > > >> > > handing the correct store type to a particular operator. While > > > >> versioned > > > >> > > state stores may not be completely comparable, depending on > > whether > > > >> we want > > > >> > > users to remain able to easily choose between using them or not > > (vs > > > >> > > timestamped which should be used by all), I still feel this KIP > > is a > > > >> great > > > >> > > step in the right direction that not only should not be blocked > on > > > the > > > >> > > completion of the IM implementations, but in fact should > > > specifically > > > >> be > > > >> > > done first as it enables an easier way to utilize those IM > > versioned > > > >> > > stores. Just my 2 cents :) > > > >> > > > > > >> > > 2. The idea to expand the existing the config with a CUSTOM enum > > > >> without > > > >> > > introducing another config to specify the CUSTOM store spec does > > not > > > >> seem > > > >> > > appropriate, or even possible (for the reasons Almog mentioned > > > above > > > >> about > > > >> > > config types, though perhaps there is a way I'm not seeing). I > do > > > not > > > >> buy > > > >> > > the argument that we should optimize the API to make it easy for > > > >> users who > > > >> > > just want to switch to all in-memory stores, as I truly believe > > this > > > >> is a > > > >> > > very small fraction of the potential userbase of this feature > > > (anyone > > > >> who's > > > >> > > actually using this should please chime in!). It seems very > likely > > > >> that the > > > >> > > majority of users of this feature are actually those with custom > > > state > > > >> > > stores, as to my knowledge, this has been the case any/every > time > > > this > > > >> > > feature was requested by users. > > > >> > > > > > >> > > That said, while I don't see any way to get around introducing a > > new > > > >> > > config, I don't personally have a preference w.r.t whether to > keep > > > >> around > > > >> > > the old config or deprecate it. I simply don't get the > impression > > it > > > >> is > > > >> > > very heavily used as it stands today, while it only works for > > those > > > >> who > > > >> > > want all in-memory stores. Again, input from actual users would > be > > > >> very > > > >> > > valuable here. In the absence of that data, I will just point to > > the > > > >> fact > > > >> > > that this KIP was proposed by a Streams dev (you :P), abandoned, > > > >> picked up > > > >> > > by another Streams dev, and finally implemented without ever > > hearing > > > >> from a > > > >> > > user that they would find this useful. This is not to disparage > > the > > > >> > > original KIP, but just to say again, as I stated back then, it > > > seemed > > > >> like > > > >> > > a major opportunity loss to leave out custom state stores > > > >> > > > > > >> > > Cheers, > > > >> > > Sophie > > > >> > > > > > >> > > On Fri, Jul 21, 2023 at 1:52 PM Almog Gavra < > > almog.ga...@gmail.com> > > > >> wrote: > > > >> > > > > > >> > >> Thanks for all the feedback folk! Responses inline. > > > >> > >> > > > >> > >>> Basically, I'm suggesting two things: first, call out in some > > way > > > >> > >> (perhaps the StoreTypeSpec javadocs) that each StoreTypeSpec is > > > >> considered > > > >> > >> a public contract in itself and should outline any semantic > > > >> guarantees it > > > >> > >> does, or does not, make. Second, we should add a note on > ordering > > > >> > >> guarantees in the two OOTB specs: for RocksDB we assert that > > range > > > >> queries > > > >> > >> will honor serialized byte ordering, whereas the InMemory > flavor > > > >> gives no > > > >> > >> ordering guarantee whatsoever at this time. > > > >> > >> > > > >> > >> That makes sense to me Sophie! I'll make the changes to the > KIP. > > > And > > > >> @Colt, > > > >> > >> yes I believe that would be the new javadoc for the generic > > > >> > >> ReadOnlyKeyValueStore. > > > >> > >> > > > >> > >>> However, I am wondering if we should close others gaps first? > > > >> > >> > > > >> > >> @Matthias, thanks for the review and thoughts! I think we > should > > > >> separate > > > >> > >> closing other gaps in the product from providing this as useful > > > >> > >> functionality to avoid feature creep so long as the API > proposed > > > >> here will > > > >> > >> be suitable for when we want to close those implementation > gaps! > > My > > > >> general > > > >> > >> proposal is that for things that are not customizable today by > > > >> > >> default.dsl.store they remain not customizable after this KIP. > > The > > > >> good > > > >> > >> news is, however, that there's no reason why this cannot be > > > extended > > > >> to > > > >> > >> cover those in the future if we want to - see specifics below. > > > >> > >> > > > >> > >> Comments on the specifics below > > > >> > >> > > > >> > >>> In particular, this holds for the new versioned-store ... > Should > > > >> > >> versioned stores also be covered by the KIP > > > >> > >> > > > >> > >> Is there a reason why we can't introduce a > > > >> VersionedRocksDBStoreTypeSpec > > > >> > >> and if we ever support an in-memory an equivalent > > > >> > >> VersionedInMemoryRocksDBStoreTypeSpec? If so, then there would > > not > > > >> need to > > > >> > >> be any additional changes to the API proposed in this KIP. > > > >> > >> > > > >> > >>> For `suppress()` it's actually other way around we only have > an > > > >> in-memory > > > >> > >> implementation. Do you aim to allow custom stores for > > `suppress()`, > > > >> too? > > > >> > >> > > > >> > >> We have three options here: > > > >> > >> 1) we can decide to maintain existing behavior and use the > > > in-memory > > > >> > >> implementation for all stores (not even going through the API > at > > > all) > > > >> > >> 2a) we can introduce a new parameter to the KeyValueParams > class > > > >> (boolean > > > >> > >> isTimeOrderedBuffer or something like that) and return an > > in-memory > > > >> store > > > >> > >> in the implementation of RocksDBStoreTypeSpec (this maintains > the > > > >> existing > > > >> > >> behavior, and would allow us in the future to make the change > to > > > >> return a > > > >> > >> RocksDB store if we ever provide one) > > > >> > >> 2b) same as 2a but we throw an exception if the requested store > > > type > > > >> does > > > >> > >> not support that (this is backwards incompatible, and since > > > ROCKS_DB > > > >> is the > > > >> > >> default we probably shouldn't do this) > > > >> > >> > > > >> > >> My proposal for now is 1) because as of KIP-825 > > > >> > >> EmitStrategy#ON_WINDOW_CLOSE is the preferred way of > suppressing > > > and > > > >> that > > > >> > >> is accounted for in this API already. > > > >> > >> > > > >> > >>> Last, I am not sure if the new parameter replacing the > existing > > > one > > > >> is > > > >> > >> the > > > >> > >> best way to go? > > > >> > >> > > > >> > >> I'm happy either way, just let me know which you prefer - the > > > >> discussion > > > >> > >> around CUSTOM is in the rejected alternatives but I'm happy to > > > >> differ to > > > >> > >> whatever the project conventions are :) > > > >> > >> > > > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just > > process > > > >> it as > > > >> > >> we > > > >> > >> do know, and if know we assume it's a fully qualified class > name > > > and > > > >> try to > > > >> > >> instantiate it? > > > >> > >> > > > >> > >> Note that there is no functionality for this kind of thing in > > > >> > >> AbstractConfig (it's either a String validated enum or a class) > > so > > > >> this > > > >> > >> would be a departure from convention. Again, I'm happy to > > implement > > > >> that if > > > >> > >> it's preferred. > > > >> > >> > > > >> > >>> Also wondering how it would related to the existing `Stores` > > > >> factory? > > > >> > >> > > > >> > >> StoreTypeSpec will depend on Stores factory - they're one layer > > > >> removed. > > > >> > >> You can imagine that StoreTypeSpec is just a grouping of > methods > > > >> from the > > > >> > >> Stores factory into a convenient package for default > > configuration > > > >> > >> purposes. > > > >> > >> > > > >> > >> Thanks again for all the detailed thoughts Matthias! > > > >> > >> > > > >> > >> On Fri, Jul 21, 2023 at 11:50 AM Matthias J. Sax < > > mj...@apache.org > > > > > > > >> wrote: > > > >> > >> > > > >> > >>> Thanks for the KIP. Overall I like the idea to close this gap. > > > >> > >>> > > > >> > >>> However, I am wondering if we should close others gaps first? > In > > > >> > >>> particular, IIRC, we have a few cases for which we only have a > > > >> RocksDB > > > >> > >>> implementation for a store, and thus, adding an in-memory > > version > > > >> for > > > >> > >>> these stores first, to make the current `IN_MEMORY` parameter > > > work, > > > >> > >>> might be the first step? > > > >> > >>> > > > >> > >>> In particular, this holds for the new versioned-store (but I > > > >> actually > > > >> > >>> believe the is some other internal store with no in-memory > > > >> > >>> implementation). -- For `suppress()` it's actually other way > > > around > > > >> we > > > >> > >>> we only have an in-memory implementation. Do you aim to allow > > > custom > > > >> > >>> stores for `suppress()`, too? > > > >> > >>> > > > >> > >>> Btw: Should versioned stores also be covered by the KIP (ie, > > > >> > >>> `StoreTypeSpec`)? We did consider to add a new option > > `VERSIONED` > > > >> to the > > > >> > >>> existing `default.dsl.store` config, but opted out for various > > > >> reasons. > > > >> > >>> > > > >> > >>> Last, I am not sure if the new parameter replacing the > existing > > > one > > > >> is > > > >> > >>> the best way to go? Did you put the idea to add `CUSTOM` to > the > > > >> existing > > > >> > >>> config into rejected alternative. Personally, I would prefer > to > > > add > > > >> > >>> `CUSTOM` as I would like to optimize to easy of use for the > > > >> majority of > > > >> > >>> users (which don't implement a custom store), but only switch > to > > > >> > >>> in-memory sometimes. -- As an alternative, you would also just > > > >> extend > > > >> > >>> `dsl.default.store` (it's just a String) and allow to pass in > > > >> anything. > > > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just > > process > > > >> it as > > > >> > >>> we do know, and if know we assume it's a fully qualified class > > > name > > > >> and > > > >> > >>> try to instantiate it? -- Given that we plan to keep the > > > >> store-enum, is > > > >> > >>> seems cleaner to keep the existing config and keep both the > > config > > > >> and > > > >> > >>> enum aligned to each other? > > > >> > >>> > > > >> > >>> > > > >> > >>> It's just preliminary thought. I will need to go back an take > a > > > more > > > >> > >>> detailed look into the code to grok how the propose > > > `StoreTypeSpec` > > > >> > >>> falls into place. Also wondering how it would related to the > > > >> existing > > > >> > >>> `Stores` factory? > > > >> > >>> > > > >> > >>> -Matthias > > > >> > >>> > > > >> > >>> > > > >> > >>> On 7/21/23 6:45 AM, Colt McNealy wrote: > > > >> > >>>> Sophie— > > > >> > >>>> > > > >> > >>>> Thanks for chiming in here. +1 to the idea of specifying the > > > >> ordering > > > >> > >>>> guarantees that we make in the StorageTypeSpec javadocs. > > > >> > >>>> > > > >> > >>>> Quick question then. Is the javadoc that says: > > > >> > >>>> > > > >> > >>>>> Order is not guaranteed as bytes lexicographical ordering > > might > > > >> not > > > >> > >>>> represent key order. > > > >> > >>>> > > > >> > >>>> no longer correct, and should say: > > > >> > >>>> > > > >> > >>>>> Order guarantees depend on the underlying implementation of > > the > > > >> > >>>> ReadOnlyKeyValueStore. For more information, please consult > the > > > >> > >>>> [StorageTypeSpec javadocs](....) > > > >> > >>>> > > > >> > >>>> Thanks, > > > >> > >>>> Colt McNealy > > > >> > >>>> > > > >> > >>>> *Founder, LittleHorse.dev* > > > >> > >>>> > > > >> > >>>> > > > >> > >>>> On Thu, Jul 20, 2023 at 9:28 PM Sophie Blee-Goldman < > > > >> > >>> ableegold...@gmail.com> > > > >> > >>>> wrote: > > > >> > >>>> > > > >> > >>>>> Hey Almog, first off, thanks for the KIP! I (and others) > > raised > > > >> > >> concerns > > > >> > >>>>> over how restrictive the default.dsl.store config would be > if > > > not > > > >> > >>>>> extendable to custom store types, especially given that this > > > >> seems to > > > >> > >> be > > > >> > >>>>> the primary userbase of such a feature. At the time we > didn't > > > >> really > > > >> > >>> have > > > >> > >>>>> any better ideas for a clean way to achieve that, but what > you > > > >> > >> proposed > > > >> > >>>>> makes a lot of sense to me. Happy to see a good solution to > > > this, > > > >> and > > > >> > >>>>> hopefully others will share my satisfaction :P > > > >> > >>>>> > > > >> > >>>>> I did have one quick piece of feedback which arose from an > > > >> unrelated > > > >> > >>>>> question posed to the dev mailing list w/ subject line > > > >> > >>>>> "ReadOnlyKeyValueStore#range() > > > >> > >>>>> Semantics" > > > >> > >>>>> < > > > https://lists.apache.org/thread/jbckmth8d3mcgg0rd670cpvsgwzqlwrm>. > > > >> I > > > >> > >>>>> recommend checking out the full thread for context, but it > > made > > > me > > > >> > >> think > > > >> > >>>>> about how we can leverage the new StoreTypeSpec concept as > an > > > >> answer > > > >> > >> to > > > >> > >>> the > > > >> > >>>>> long-standing question in Streams: where can we put > guarantees > > > of > > > >> the > > > >> > >>>>> public contract for RocksDB (or other store implementations) > > > when > > > >> all > > > >> > >>> the > > > >> > >>>>> RocksDB stuff is technically internal. > > > >> > >>>>> > > > >> > >>>>> Basically, I'm suggesting two things: first, call out in > some > > > way > > > >> > >>> (perhaps > > > >> > >>>>> the StoreTypeSpec javadocs) that each StoreTypeSpec is > > > considered > > > >> a > > > >> > >>> public > > > >> > >>>>> contract in itself and should outline any semantic > guarantees > > it > > > >> does, > > > >> > >>> or > > > >> > >>>>> does not, make. Second, we should add a note on ordering > > > >> guarantees in > > > >> > >>> the > > > >> > >>>>> two OOTB specs: for RocksDB we assert that range queries > will > > > >> honor > > > >> > >>>>> serialized byte ordering, whereas the InMemory flavor gives > no > > > >> > >> ordering > > > >> > >>>>> guarantee whatsoever at this time. > > > >> > >>>>> > > > >> > >>>>> Thoughts? > > > >> > >>>>> > > > >> > >>>>> -Sophie > > > >> > >>>>> > > > >> > >>>>> On Thu, Jul 20, 2023 at 4:28 PM Almog Gavra < > > > >> almog.ga...@gmail.com> > > > >> > >>> wrote: > > > >> > >>>>> > > > >> > >>>>>> Hi All, > > > >> > >>>>>> > > > >> > >>>>>> I would like to propose a KIP to expand support for default > > > store > > > >> > >> types > > > >> > >>>>>> (KIP-591) to encompass custom store implementations: > > > >> > >>>>>> > > > >> > >>>>>> > > > >> > >>>>> > > > >> > >>> > > > >> > >> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types > > > >> > >>>>>> > > > >> > >>>>>> Looking forward to your feedback! > > > >> > >>>>>> > > > >> > >>>>>> Cheers, > > > >> > >>>>>> Almog > > > >> > >>>>>> > > > >> > >>>>> > > > >> > >>>> > > > >> > >>> > > > >> > >> > > > >> > > > > > >> > > > > > > > > > >