I have updated the KIP with the points as discussed above. @Guozhang, the suggested configuration makes it a little more awkward around the Materialized.as and Materialized.withStoreType APIs than it was when we were totally deprecating the old configuration. Let me know what you think.
I will open the voting tomorrow! Thanks again everyone for the discussion. Cheers, Almog On Tue, Jul 25, 2023 at 9:20 AM Almog Gavra <almog.ga...@gmail.com> wrote: > Glad you like my KIP-secretary skills ;) > > A2. I'm definitely happy to take your suggestion here and not do anything > special w.r.t. Versioned stores, I think it makes sense especially if we > consider them implementation details of a specific store type. > > At EOD I'll update the KIP with all of these changes and if the > discussion is silent I'll open a vote tomorrow morning. > > Cheers, > Almog > > On Mon, Jul 24, 2023 at 2:02 PM Sophie Blee-Goldman < > ableegold...@gmail.com> wrote: > >> Awesome summary (seriously) -- would you kindly offer your organizational >> skills to every ongoing KIP from henceforth? We need you :P >> >> A few answers/comments: >> >> A2: I think there is a 3rd sub-option here, which is to leave >> versioned-ness out of this KIP entirely, return only the non-versioned >> stores for now, and then switch over to the versioned stores (only) when >> the time comes to flip the switch on making them the default across the >> DSL. This has the advantage of retaining the current behavior/semantics >> and >> provides a clear way to transition smoothly in the future, since it seems >> we will want to cut to all versioned state stores rather than offer users >> a >> choice between versioned or non-versioned stores going forward (similar to >> how we only offer timestamped stores presently, and have completely >> replaced non-timestamped stores in the DSL.) . In both the timestamped and >> versioned cases, the old stores are/will still be available or accessible >> to users via the bare StoreSuppliers, should they somehow desire or >> require >> the old store type. Ultimately, I think either this or option (1) would be >> preferable, though I think it should be up to Matthias or anyone else >> involved in the versioned stores feature to decide which approach makes >> sense in the context of that feature's future plans. >> >> A3: sounds reasonable to me >> >> A5: Also sounds fine to me, though I'll let others chime in with/if they >> have an alternative suggestion/preference. I guess the other contender >> would be something like DSLStoreImpl or something like that? >> >> >> >> On Mon, Jul 24, 2023 at 9:36 AM Almog Gavra <almog.ga...@gmail.com> >> wrote: >> >> > Lots of thoughts! Happy to see the thriving discussion on this post - >> lots >> > going on so I'm trying to enumerate them to keep things organized >> (prefix >> > "A" for "Almog" so we can use numbers in responses for other things ;P). >> > >> > A1. Question around closing implementation gaps (e.g. no rocks based >> > suppression store) >> > A2. Specifically how to handle Versioned stores >> > A3. Configuration (new config/reuse old one + new one and ordering of >> > config resolution) >> > A4. Drawing a line between what is implementation detail (not exposed in >> > API) and what is customizable (exposed in API) >> > A5. Naming of StoreTypeSpec >> > A6. Param classes in StoreBuilders >> > >> > ------------------------------ >> > >> > Here are summaries for where it seems each of these stands (trying not >> to >> > add any additional opinion yet): >> > >> > A1. Sophie/Guozhang/Me (if I count hah!) seem to agree that it is worth >> > pushing this KIP through independently of the implementation gaps as it >> > doesn't seem to move the intermediate state further from the end state. >> > Matthias originally had some concerns. >> > >> > A2. There's questions around whether versioned stores should be their >> own >> > configurable option or whether they are an implementation detail that >> the >> > StoreSpec should decide. It seems like the discussion is converging >> here, >> > this should be an implementation detail. >> > >> > A3. Matthias/Guozhang prefer adding CUSTOM and then having an additional >> > config to determine the implementation. Sophie prefers deprecating the >> old >> > config. Guozhang additional suggested flipping the resolution order such >> > that the old config is only respected in a DefaultStoreSpec >> implementation. >> > >> > A4. This KIP (or rather, the discussion on the KIP) blurs the lines >> between >> > top level store types (KV, windowed, session) and the underlying >> > implementation of them (timestamped, versioned, kv-list). It seems >> everyone >> > is in alignment to ensure that we keep these two things separate and >> that >> > the line is clearly delineated in the text of the KIP. >> > >> > A5. Guozhang and Sophie agree that the current name StoreTypeSpec is >> > misleading, as it's really an implementation spec, not a type >> > specification. >> > >> > A6. Agreement that this is an improvement, Sophie believes this can be >> done >> > in a follow up but we should ensure our naming is good here so there's >> no >> > conflicts down the line. >> > >> > --------------------------------- >> > >> > Ok, phew! Hopefully that covers it all! Now for my thoughts, hopefully >> > wrapping up some of these discussions: >> > >> > A1. @Matthias - are you still hesitant here? What would you need to be >> > convinced here? >> > >> > A2. Since we are all in agreement that versioned stores should be an >> > implementation detail, we have two options: >> > >> > (1) we can extend the KVParams to include a parameter that indicates >> > whether or not the store should be versioned >> > (2) we can introduce a configuration for whether or not to use a >> versioned >> > store, and each implementation can choose to read/ignore that config >> > >> > Any preferences? (1) would align more closely with what we are doing >> today >> > (they are a top-level concept in the Stores API). >> > >> > A3. I like Guozhang's suggestion of making the "primary" configuration >> to >> > be the new one, and then having a "DefaultStoreTypeSpec" (using the old >> > naming) which respects the old configuration. That seems to solve nearly >> > all the concerns (e.g. it'd be easy to see where the enum is used >> because >> > it would only be used within that one class instead of littered >> throughout >> > the code base). >> > >> > @Sophie, unless you have objections here I will update the KIP to do >> that. >> > >> > A4. I will make these changes to the KIP to make it clear. >> > >> > A5. I will rename it `DslStorePlugin` - any objections to this name? >> > >> > A6. Let's punt this ;) I agree with everyone that this would be a >> welcome >> > improvement and that this KIP is aligned with moving in that direction. >> > Given how much discussion there was on this KIP, which is minor >> relative to >> > making the changes to StoreBuilder API, I'd rather not tie the two >> > together. >> > >> > Cheers & Thanks everyone for the thoughts! >> > - Almog >> > >> > On Sun, Jul 23, 2023 at 5:15 PM Sophie Blee-Goldman < >> > ableegold...@gmail.com> >> > wrote: >> > >> > > Guozhang: >> > > >> > > On your 2nd point: >> > > >> > > > "impl types" (in hindsight it may not be a good name) for rocksdb / >> > > memory / custom, and we used "store types" for kv / windowed / >> sessioned >> > > etc, >> > > First off, thanks so much for this clarification -- using "store type" >> > here >> > > was definitely making me uncomfortable as this usually refers to KV vs >> > > window, etc -- but I just couldn't for the life of me think of the >> right >> > > term for rocks vs IM. We should 100% change to something like >> > StoreImplSpec >> > > for this kind of interface. >> > > >> > > > As for list-value store (for stream-stream Join) >> > > Again, glad you mentioned this -- I forgot how the extra stream-stream >> > join >> > > store is not a "regular" KV Store but rather this special list-value >> > store. >> > > If we proceed with something like the current approach, perhaps that >> > should >> > > be a boolean (or enum) parameter in the KVConfig, similar to the >> > > EmitStrategy? After all, the high-level goal of this KIP is to be >> able to >> > > fully customize all DSL state stores, and this is currently not >> possible >> > > due to KAFKA-14976 <https://issues.apache.org/jira/browse/KAFKA-14976 >> >. >> > > >> > > If we expect there to be further customizations like this going >> forward, >> > > perhaps we could instead have each of the three StoreConfig classes >> > accept >> > > a single enum parameter for the "sub-type" (or whatever you want to >> call >> > > it), which would encompass (and replace) things like the EmitStrategy >> as >> > > well as the list-value type (we could define one enum for each Config >> > class >> > > so there is no accidentally requesting a LIST_VALUE subtype on a >> > > WindowStore). Thoughts? >> > > >> > > Lastly, regarding 3.b: >> > > >> > > I love that you brought this up because that is actually what I first >> > > proposed to Almog, ie introducing a param class to clean up the >> > > StoreBuilder API, during our chat that led to this KIP. He pushed >> back, >> > > claiming (rightly so) that this change would be huge in scope for a >> > purely >> > > aesthetic/API change that doesn't add any functionality, and that it >> > makes >> > > more sense to start with the DSL config since there is a clear gap in >> > > functionality there, particularly when it comes to custom state stores >> > > (reasons 1 & 3 in the Motivation section). He did agree that the param >> > > classes were a better API, which is why you see them in this KIP. In >> > other >> > > words: I fully agree that we should apply this improvement to the >> > > PAPI/StoreBuilder interface as well, but I think that's a distinct >> > concept >> > > for the time-being and should not block the DSL improvement. Rather, I >> > > consider this KIP as setting the stage for a followup KIP down the >> line >> > to >> > > clean up the StoreBuilders and bring them in line with the new >> > param/config >> > > class approach. >> > > >> > > That said, A) we should definitely make sure whatever we introduce >> here >> > > can be extended to the PAPI StoreBuilders in a natural way, and B) I >> > should >> > > clarify that I personally would be happy to see this included in the >> > > current KIP, but as Almog's KIP it's up to him to decide whether he's >> > > comfortable expanding the scope like this. If you can convince him >> where >> > I >> > > could not, more power to you! :P >> > > >> > > On Sun, Jul 23, 2023 at 4:48 PM Sophie Blee-Goldman < >> > > ableegold...@gmail.com> >> > > wrote: >> > > >> > > > Matthias: >> > > > >> > > > I'm not sure I agree with (or maybe don't follow) this take: >> > > >> >> > > >> we need all kind of `StoreTypeSpec` implementations, >> > > >> and it might also imply that we need follow up KIPs for new feature >> > > >> (like in-memory versioned store) that might not need a KIP >> otherwise. >> > > >> >> > > > I see this feature as being a nice add-on/convenience API for any >> store >> > > > types which have a full DSL implementation. I don't think it's >> > > unreasonable >> > > > to just say that this feature is only going to be available for >> store >> > > types >> > > > that have KV, Window, and Session implementations. I can't think of >> any >> > > > case besides versioned stores where this would force a KIP for a new >> > > > feature that would not otherwise have to go through a KIP, and even >> for >> > > > versioned state stores, the only issue is that the KIP for that was >> > > already >> > > > accepted. >> > > > >> > > > However, I think I agree on your main point -- that things like >> > "regular" >> > > > vs timestamped vs versioned are/should be an implementation detail >> > that's >> > > > hidden from the user. As I noted previously, the current KIP >> actually >> > > > greatly improves the situation for timestamped stores, as this >> would be >> > > > handled completely transparently by the OOTB RocksDBStoreSpec. To >> me, >> > > this >> > > > provides a very natural way to let the DSL operators using the >> default >> > > > store type/spec to specify which kind of store (eg >> > > > versioned/timestamped/etc) it wants, and choose the correct >> default. If >> > > the >> > > > eventual intention is to have versioned state stores replace >> > timestamped >> > > > stores as the default in the DSL, then we can simply swap out the >> > > versioned >> > > > stores for the timestamped stores in the RocksDBStoreTypeSpec, when >> > that >> > > > time comes. Until then, users who want to use the versioned store >> will >> > > have >> > > > to do what they do today, which is individually override operators >> via >> > > > Materialized/StoreSuppliers. >> > > > >> > > > All in all, it sounds like we should not offer a versioned store >> type >> > > > spec, as "versioned" is more akin to "timestamped" than to a true >> > > > difference in underlying store implementation type (eg rocks vs >> > > in-memory). >> > > > W.r.t whether to deprecate the old config or introduce a new CUSTOM >> > enum >> > > > type, either seems fine to me, and we can go with that alternative >> > > instead. >> > > > The only other con to this approach that I can think of, and I'm >> > honestly >> > > > not sure if this is something users would care about or only devs, >> is >> > > that >> > > > the advantage to moving rocks and IM to the store type spec >> interface >> > is >> > > > that it helps to keep the relevant logic encapsulated in one easy >> place >> > > you >> > > > can quickly check to tell what kind of state store is used where. In >> > the >> > > > current code, I found it extremely annoying and difficult to track >> down >> > > all >> > > > usages of the StoreType enum to see which actual rocksdb store was >> > being >> > > > used where (for example some stores using the TimeOrderedBuffer >> > variants >> > > in >> > > > some special cases, or to understand whether the DSL was defaulting >> to >> > > > plain, timestamped, or versioned stores for RocksDB vs InMemory -- >> both >> > > of >> > > > which seem like they could be of interest to a user). This would be >> > much >> > > > easier if everything was handled in one place, and you can just go >> to >> > the >> > > > (eg) RocksDBStoreTypeSpec and see what it's doing, or find usages of >> > the >> > > > methods to understand what stores are being handed to which DSL >> > > operators. >> > > > >> > > > I suppose we could still clean up the API and solve this problem by >> > > having >> > > > the old (and new) config delegate to a StoreTypeSpec no matter what, >> > but >> > > > make RocksDBStoreTypeSpec and InMemoryStoreTypeSpec internal classes >> > that >> > > > are simply implementation details of the ROCKSDB vs IN_MEMORY enums. >> > > WDYT? >> > > > >> > > > >> > > > On Sun, Jul 23, 2023 at 11:14 AM Guozhang Wang < >> > > guozhang.wang...@gmail.com> >> > > > wrote: >> > > > >> > > >> Thanks everyone for the great discussions so far! I first saw the >> JIRA >> > > >> and left some quick thoughts without being aware of the >> > > >> already-written KIP (kudos to Almog, very great one) and the >> DISCUSS >> > > >> thread here. And I happily find some of my initial thoughts align >> with >> > > >> the KIP already :) >> > > >> >> > > >> Would like to add a bit more of my 2c after reading through the KIP >> > > >> and the thread here: >> > > >> >> > > >> 1. On the high level, I'm in favor of pushing this KIP through >> without >> > > >> waiting on the other gaps to be closed. In my back pocket's >> > > >> "dependency graph" of Kafka Streams roadmap of large changes or >> > > >> feature gaps, the edges of dependencies are defined based on my >> > > >> understanding of whether doing one first would largely complicate / >> > > >> negate the effort of the other but not vice versa, in which case we >> > > >> should consider getting the other done first. In this case, I feel >> > > >> such a dependency is not strong enough, so encouraging the KIP >> > > >> contributor to finish what he/she would love to do to close some >> gaps >> > > >> early would be higher priorities. I did not see by just doing this >> we >> > > >> could end up in a worse intermediate stage yet, but I could be >> > > >> corrected. >> > > >> >> > > >> 2. Regarding the store types --- gain here I'd like to just clarify >> > > >> the terms a bit since in the past it has some confusions: we used >> > > >> "impl types" (in hindsight it may not be a good name) for rocksdb / >> > > >> memory / custom, and we used "store types" for kv / windowed / >> > > >> sessioned etc, as I said in the JIRA I think the current proposal >> also >> > > >> have a good side effect as quality bar to really enforce us think >> > > >> twice when trying to add more store types in the future as it will >> > > >> impact API instantiations. In the ideal world, I would consider: >> > > >> >> > > >> * We have (timestamped) kv store, versioned kv store, window store, >> > > >> session store as first-class DSL store types. Some DSL operators >> could >> > > >> accept multiple store types (e.g. versioned and non versioned >> > > >> kv-store) for semantics / efficiency trade-offs. But I think we >> would >> > > >> remove un-timestamped kv stores eventually since that efficiency >> > > >> trade-off is so minimal compared to its usage limitations. >> > > >> * As for list-value store (for stream-stream Join), >> memory-lru-cache >> > > >> (for PAPI use only), memory-time-ordered-buffer (for suppression), >> > > >> they would not be exposed as DSL first-class store types in the >> > > >> future. Instead, they would be treated as internal used stores >> (e.g. >> > > >> list-value store is built on key-value store with specialized serde >> > > >> and putInternal), or continue to be just convenient OOTB PAPI used >> > > >> stores only. >> > > >> * As we move on, we will continue to be very, very strict on what >> > > >> would be added as DSL store types (and hence requires changes to >> the >> > > >> proposed APIs), what to be added as convenient OOTB PAPI store >> impls >> > > >> only, what to be added as internal used store types that should >> not be >> > > >> exposed to users nor customizable at all. >> > > >> >> > > >> 3. Some more detailed thoughts below: >> > > >> >> > > >> 3.a) I originally also think that we can extend the existing >> config, >> > > >> rather than replacing it. The difference was that I was thinking >> that >> > > >> order-wise, the runtime would look at the API first, and then the >> > > >> config, whereas in your rejected alternative it was looking at the >> > > >> config first, and then the API --- that I think is a minor thing >> and >> > > >> either is fine. I'm in agreement that having two configs would be >> more >> > > >> confusing to users to learn about their precedence rather than >> > > >> helpful, but if we keep both a config and a public API, then the >> > > >> precedence ordering would not be so confusing as long as we state >> them >> > > >> clearly. For example: >> > > >> >> > > >> * We have DefaultStoreTypeSpec OOTB, in that impl we look at the >> > > >> config only, and would only expect either ROCKS or MEMORY, and >> return >> > > >> corresponding OOTB store impls; if any other values configured, we >> > > >> error out. >> > > >> * Users extend that by having MyStoreTypeSpec, in which they could >> do >> > > >> arbituray things without respecting the config at all, but our >> > > >> recommended pattern in docs would still say "look into the config, >> if >> > > >> it is ROCKS or MEMORY just return fall back to >> DefaultStoreTypeSepc; >> > > >> otherwise if it's some String you recognize, then return your >> > > >> customized store based on the string value, otherwise error out". >> > > >> >> > > >> 3.b) About the struct-like Params classes, I like the idea a lot >> and >> > > >> wished we would pursue this in the first place, but if we only do >> this >> > > >> in Spec it would leave some inconsistencies with the StoreBuilders >> > > >> though arguably the latter is only for PAPI. I'm wondering if we >> > > >> should consider including the changes in StoreBuilders (e.g. >> > > >> WindowStoreBuilder(WindowSupplierParams)), and if yes, maybe we >> should >> > > >> also consider renaming that e.g. `WindowSupplierParams` to >> > > >> `WindowStoreSpecParams` too? For this one I only have a "weak >> feeling" >> > > >> so I can be convinced otherwise :P >> > > >> >> > > >> Thanks, >> > > >> Guozhang >> > > >> >> > > >> >> > > >> >> > > >> On Sun, Jul 23, 2023 at 9:52 AM Matthias J. Sax <mj...@apache.org> >> > > wrote: >> > > >> > >> > > >> > Thanks for all the input. My intention was not to block the KIP, >> but >> > > >> > just to take a step back and try get a holistic picture and >> > > discussion, >> > > >> > to explore if there are good/viable alternative designs. As said >> > > >> > originally, I really like to close this gap, and was always aware >> > that >> > > >> > the current config is not flexible enough. >> > > >> > >> > > >> > >> > > >> > I guess, my "concern" is that the KIP does increase the API >> surface >> > > area >> > > >> > significantly, as we need all kind of `StoreTypeSpec` >> > implementations, >> > > >> > and it might also imply that we need follow up KIPs for new >> feature >> > > >> > (like in-memory versioned store) that might not need a KIP >> > otherwise. >> > > >> > >> > > >> > The second question is if it might make the already "patchy" >> > situation >> > > >> > with regard to customization worse. >> > > >> > >> > > >> > We did de-scope the original KIP-591 for this reason, and given >> the >> > > new >> > > >> > situation of the DSL, it seems that it actually got worse >> compared >> > to >> > > >> > back in the days. >> > > >> > >> > > >> > Lastly, I hope to make the new versioned stores the default in >> the >> > DSL >> > > >> > and we did not do it in the previous KIP due to backward >> > compatibility >> > > >> > issues. Thus, from a DSL point of view, I believe there should be >> > only >> > > >> > "RocksDB", "InMemory", and "Custom" in an ideal world. >> Introducing >> > (I >> > > am >> > > >> > exaggerating to highlight my point) "KvRocksDbSpec", >> > > >> > "TimestampeKvRocksDbSpec", "VersionedRocksDbSpec", plus the >> > > >> > corresponding in-memory specs seems to head into the opposite >> > > direction. >> > > >> > -- My goal is to give users a handle of the _physical_ store >> > (RocksDB >> > > vs >> > > >> > InMemory vs Custom) but not the _logical_ stores (plain kv, >> ts-kv, >> > > >> > versioned) which is "dictated" by the DSL itself and should not >> be >> > > >> > customizable (we are just in a weird intermediate situation that >> we >> > > need >> > > >> > to clean up, but not "lean into" IMHO). >> > > >> > >> > > >> > Thus, I am also not sure if adding "VersionedRocksDbSpec" would >> be >> > > ideal >> > > >> > (also, given that it only changes a single store, but not the two >> > > >> > windowed stores)? >> > > >> > >> > > >> > Furthermore, I actually hope that we could use the new versioned >> > store >> > > >> > to replace the window- and sessions- stores, and thus to decrease >> > the >> > > >> > number of required store types. >> > > >> > >> > > >> > >> > > >> > Admittedly, I am talking a lot about a potential future, but the >> > goal >> > > is >> > > >> > only to explore opportunities to not get into "worse" >> intermediate >> > > >> > state, that will require a huge deprecation surface area later >> on. >> > Of >> > > >> > course, if there is no better way, and my concerns are not >> shared, I >> > > am >> > > >> > ok to move forward with the KIP. >> > > >> > >> > > >> > >> > > >> > Bottom line: I would personally prefer to keep the current config >> > and >> > > >> > add a `Custom` option to it, plus adding one new config that >> allows >> > > >> > people to set their custom `StoreTypeSpec` class. -- I would not >> > add a >> > > >> > built-in spec for versioned stores at this point (or any other >> > > built-in >> > > >> > `StorytypeSpec` implementations). I guess, users could write a >> > custom >> > > >> > spec if they want to enable versioned store across the board for >> now >> > > >> > (until we make them the default anyway)? >> > > >> > >> > > >> > >> > > >> > Hope my train of thoughts is halfway reasonable and not totally >> off >> > > >> track? >> > > >> > >> > > >> > >> > > >> > -Matthias >> > > >> > >> > > >> > On 7/21/23 15:27, Sophie Blee-Goldman wrote: >> > > >> > > I agree with everything Almog said above, and will just add on >> to >> > > two >> > > >> > > points: >> > > >> > > >> > > >> > > 1. Regarding whether to block this KIP on the completion of >> any or >> > > all >> > > >> > > future implementations of in-memory version stores (or persist >> > > >> suppression >> > > >> > > buffers), I feel that would be unfair to this feature which is >> > > >> completely >> > > >> > > unrelated to the semantic improvements offered by versioned >> state >> > > >> stores. >> > > >> > > It seems like the responsibility of those driving the versioned >> > > state >> > > >> > > stores feature, not Almog/this KIP, to make sure that those >> bases >> > > are >> > > >> > > covered. Further, if anything, this KIP will help with the >> massive >> > > >> > > proliferation of StoreSuppliers on the Stores factory class, >> and >> > > >> provide >> > > >> > > users with a much easier way to leverage the versioned stores >> > > without >> > > >> > > having to muck around directly with the StoreSuppliers. >> > > >> > > >> > > >> > > I also thought about it a bit, and really like Almog's >> suggestion >> > to >> > > >> > > introduce an additional StoreSpec for the Versioned state >> stores. >> > > >> Obviously >> > > >> > > we can add the RocksDB one to this KIP now, and then as he >> > > mentioned, >> > > >> > > there's an easy way to get users onto the IMVersionedStateStore >> > > types >> > > >> once >> > > >> > > they are completed. >> > > >> > > >> > > >> > > Lastly, on this note, I want to point out how smoothly this >> solved >> > > the >> > > >> > > issue of timestamped stores, which are intended to be the DSL >> > > default >> > > >> but >> > > >> > > are only a special case for RocksDB. Right now it can be >> confusing >> > > >> for a >> > > >> > > user scrolling through the endless Stores class and seeing a >> > > >> timestamped >> > > >> > > version of the persistent but not in-memory stores. One could >> > easily >> > > >> assume >> > > >> > > there was no timestamped option for IM stores and that this >> > feature >> > > >> was >> > > >> > > incomplete, if they weren't acutely aware of the internal >> > > >> implementation >> > > >> > > details (namely that it's only required for RocksDB for >> > > compatibility >> > > >> > > reasons). However, with this KIP, all that is handled >> completely >> > > >> > > transparently to the user, and we the devs, who *are* aware of >> the >> > > >> internal >> > > >> > > implementation details, are now appropriately the ones >> responsible >> > > for >> > > >> > > handing the correct store type to a particular operator. While >> > > >> versioned >> > > >> > > state stores may not be completely comparable, depending on >> > whether >> > > >> we want >> > > >> > > users to remain able to easily choose between using them or not >> > (vs >> > > >> > > timestamped which should be used by all), I still feel this KIP >> > is a >> > > >> great >> > > >> > > step in the right direction that not only should not be >> blocked on >> > > the >> > > >> > > completion of the IM implementations, but in fact should >> > > specifically >> > > >> be >> > > >> > > done first as it enables an easier way to utilize those IM >> > versioned >> > > >> > > stores. Just my 2 cents :) >> > > >> > > >> > > >> > > 2. The idea to expand the existing the config with a CUSTOM >> enum >> > > >> without >> > > >> > > introducing another config to specify the CUSTOM store spec >> does >> > not >> > > >> seem >> > > >> > > appropriate, or even possible (for the reasons Almog mentioned >> > > above >> > > >> about >> > > >> > > config types, though perhaps there is a way I'm not seeing). I >> do >> > > not >> > > >> buy >> > > >> > > the argument that we should optimize the API to make it easy >> for >> > > >> users who >> > > >> > > just want to switch to all in-memory stores, as I truly believe >> > this >> > > >> is a >> > > >> > > very small fraction of the potential userbase of this feature >> > > (anyone >> > > >> who's >> > > >> > > actually using this should please chime in!). It seems very >> likely >> > > >> that the >> > > >> > > majority of users of this feature are actually those with >> custom >> > > state >> > > >> > > stores, as to my knowledge, this has been the case any/every >> time >> > > this >> > > >> > > feature was requested by users. >> > > >> > > >> > > >> > > That said, while I don't see any way to get around introducing >> a >> > new >> > > >> > > config, I don't personally have a preference w.r.t whether to >> keep >> > > >> around >> > > >> > > the old config or deprecate it. I simply don't get the >> impression >> > it >> > > >> is >> > > >> > > very heavily used as it stands today, while it only works for >> > those >> > > >> who >> > > >> > > want all in-memory stores. Again, input from actual users >> would be >> > > >> very >> > > >> > > valuable here. In the absence of that data, I will just point >> to >> > the >> > > >> fact >> > > >> > > that this KIP was proposed by a Streams dev (you :P), >> abandoned, >> > > >> picked up >> > > >> > > by another Streams dev, and finally implemented without ever >> > hearing >> > > >> from a >> > > >> > > user that they would find this useful. This is not to disparage >> > the >> > > >> > > original KIP, but just to say again, as I stated back then, it >> > > seemed >> > > >> like >> > > >> > > a major opportunity loss to leave out custom state stores >> > > >> > > >> > > >> > > Cheers, >> > > >> > > Sophie >> > > >> > > >> > > >> > > On Fri, Jul 21, 2023 at 1:52 PM Almog Gavra < >> > almog.ga...@gmail.com> >> > > >> wrote: >> > > >> > > >> > > >> > >> Thanks for all the feedback folk! Responses inline. >> > > >> > >> >> > > >> > >>> Basically, I'm suggesting two things: first, call out in some >> > way >> > > >> > >> (perhaps the StoreTypeSpec javadocs) that each StoreTypeSpec >> is >> > > >> considered >> > > >> > >> a public contract in itself and should outline any semantic >> > > >> guarantees it >> > > >> > >> does, or does not, make. Second, we should add a note on >> ordering >> > > >> > >> guarantees in the two OOTB specs: for RocksDB we assert that >> > range >> > > >> queries >> > > >> > >> will honor serialized byte ordering, whereas the InMemory >> flavor >> > > >> gives no >> > > >> > >> ordering guarantee whatsoever at this time. >> > > >> > >> >> > > >> > >> That makes sense to me Sophie! I'll make the changes to the >> KIP. >> > > And >> > > >> @Colt, >> > > >> > >> yes I believe that would be the new javadoc for the generic >> > > >> > >> ReadOnlyKeyValueStore. >> > > >> > >> >> > > >> > >>> However, I am wondering if we should close others gaps first? >> > > >> > >> >> > > >> > >> @Matthias, thanks for the review and thoughts! I think we >> should >> > > >> separate >> > > >> > >> closing other gaps in the product from providing this as >> useful >> > > >> > >> functionality to avoid feature creep so long as the API >> proposed >> > > >> here will >> > > >> > >> be suitable for when we want to close those implementation >> gaps! >> > My >> > > >> general >> > > >> > >> proposal is that for things that are not customizable today by >> > > >> > >> default.dsl.store they remain not customizable after this KIP. >> > The >> > > >> good >> > > >> > >> news is, however, that there's no reason why this cannot be >> > > extended >> > > >> to >> > > >> > >> cover those in the future if we want to - see specifics below. >> > > >> > >> >> > > >> > >> Comments on the specifics below >> > > >> > >> >> > > >> > >>> In particular, this holds for the new versioned-store ... >> Should >> > > >> > >> versioned stores also be covered by the KIP >> > > >> > >> >> > > >> > >> Is there a reason why we can't introduce a >> > > >> VersionedRocksDBStoreTypeSpec >> > > >> > >> and if we ever support an in-memory an equivalent >> > > >> > >> VersionedInMemoryRocksDBStoreTypeSpec? If so, then there would >> > not >> > > >> need to >> > > >> > >> be any additional changes to the API proposed in this KIP. >> > > >> > >> >> > > >> > >>> For `suppress()` it's actually other way around we only have >> an >> > > >> in-memory >> > > >> > >> implementation. Do you aim to allow custom stores for >> > `suppress()`, >> > > >> too? >> > > >> > >> >> > > >> > >> We have three options here: >> > > >> > >> 1) we can decide to maintain existing behavior and use the >> > > in-memory >> > > >> > >> implementation for all stores (not even going through the API >> at >> > > all) >> > > >> > >> 2a) we can introduce a new parameter to the KeyValueParams >> class >> > > >> (boolean >> > > >> > >> isTimeOrderedBuffer or something like that) and return an >> > in-memory >> > > >> store >> > > >> > >> in the implementation of RocksDBStoreTypeSpec (this maintains >> the >> > > >> existing >> > > >> > >> behavior, and would allow us in the future to make the change >> to >> > > >> return a >> > > >> > >> RocksDB store if we ever provide one) >> > > >> > >> 2b) same as 2a but we throw an exception if the requested >> store >> > > type >> > > >> does >> > > >> > >> not support that (this is backwards incompatible, and since >> > > ROCKS_DB >> > > >> is the >> > > >> > >> default we probably shouldn't do this) >> > > >> > >> >> > > >> > >> My proposal for now is 1) because as of KIP-825 >> > > >> > >> EmitStrategy#ON_WINDOW_CLOSE is the preferred way of >> suppressing >> > > and >> > > >> that >> > > >> > >> is accounted for in this API already. >> > > >> > >> >> > > >> > >>> Last, I am not sure if the new parameter replacing the >> existing >> > > one >> > > >> is >> > > >> > >> the >> > > >> > >> best way to go? >> > > >> > >> >> > > >> > >> I'm happy either way, just let me know which you prefer - the >> > > >> discussion >> > > >> > >> around CUSTOM is in the rejected alternatives but I'm happy to >> > > >> differ to >> > > >> > >> whatever the project conventions are :) >> > > >> > >> >> > > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just >> > process >> > > >> it as >> > > >> > >> we >> > > >> > >> do know, and if know we assume it's a fully qualified class >> name >> > > and >> > > >> try to >> > > >> > >> instantiate it? >> > > >> > >> >> > > >> > >> Note that there is no functionality for this kind of thing in >> > > >> > >> AbstractConfig (it's either a String validated enum or a >> class) >> > so >> > > >> this >> > > >> > >> would be a departure from convention. Again, I'm happy to >> > implement >> > > >> that if >> > > >> > >> it's preferred. >> > > >> > >> >> > > >> > >>> Also wondering how it would related to the existing `Stores` >> > > >> factory? >> > > >> > >> >> > > >> > >> StoreTypeSpec will depend on Stores factory - they're one >> layer >> > > >> removed. >> > > >> > >> You can imagine that StoreTypeSpec is just a grouping of >> methods >> > > >> from the >> > > >> > >> Stores factory into a convenient package for default >> > configuration >> > > >> > >> purposes. >> > > >> > >> >> > > >> > >> Thanks again for all the detailed thoughts Matthias! >> > > >> > >> >> > > >> > >> On Fri, Jul 21, 2023 at 11:50 AM Matthias J. Sax < >> > mj...@apache.org >> > > > >> > > >> wrote: >> > > >> > >> >> > > >> > >>> Thanks for the KIP. Overall I like the idea to close this >> gap. >> > > >> > >>> >> > > >> > >>> However, I am wondering if we should close others gaps >> first? In >> > > >> > >>> particular, IIRC, we have a few cases for which we only have >> a >> > > >> RocksDB >> > > >> > >>> implementation for a store, and thus, adding an in-memory >> > version >> > > >> for >> > > >> > >>> these stores first, to make the current `IN_MEMORY` parameter >> > > work, >> > > >> > >>> might be the first step? >> > > >> > >>> >> > > >> > >>> In particular, this holds for the new versioned-store (but I >> > > >> actually >> > > >> > >>> believe the is some other internal store with no in-memory >> > > >> > >>> implementation). -- For `suppress()` it's actually other way >> > > around >> > > >> we >> > > >> > >>> we only have an in-memory implementation. Do you aim to allow >> > > custom >> > > >> > >>> stores for `suppress()`, too? >> > > >> > >>> >> > > >> > >>> Btw: Should versioned stores also be covered by the KIP (ie, >> > > >> > >>> `StoreTypeSpec`)? We did consider to add a new option >> > `VERSIONED` >> > > >> to the >> > > >> > >>> existing `default.dsl.store` config, but opted out for >> various >> > > >> reasons. >> > > >> > >>> >> > > >> > >>> Last, I am not sure if the new parameter replacing the >> existing >> > > one >> > > >> is >> > > >> > >>> the best way to go? Did you put the idea to add `CUSTOM` to >> the >> > > >> existing >> > > >> > >>> config into rejected alternative. Personally, I would prefer >> to >> > > add >> > > >> > >>> `CUSTOM` as I would like to optimize to easy of use for the >> > > >> majority of >> > > >> > >>> users (which don't implement a custom store), but only >> switch to >> > > >> > >>> in-memory sometimes. -- As an alternative, you would also >> just >> > > >> extend >> > > >> > >>> `dsl.default.store` (it's just a String) and allow to pass in >> > > >> anything. >> > > >> > >>> If it's matches existing `ROCKS_DB` or `IN_MEMORY` we just >> > process >> > > >> it as >> > > >> > >>> we do know, and if know we assume it's a fully qualified >> class >> > > name >> > > >> and >> > > >> > >>> try to instantiate it? -- Given that we plan to keep the >> > > >> store-enum, is >> > > >> > >>> seems cleaner to keep the existing config and keep both the >> > config >> > > >> and >> > > >> > >>> enum aligned to each other? >> > > >> > >>> >> > > >> > >>> >> > > >> > >>> It's just preliminary thought. I will need to go back an >> take a >> > > more >> > > >> > >>> detailed look into the code to grok how the propose >> > > `StoreTypeSpec` >> > > >> > >>> falls into place. Also wondering how it would related to the >> > > >> existing >> > > >> > >>> `Stores` factory? >> > > >> > >>> >> > > >> > >>> -Matthias >> > > >> > >>> >> > > >> > >>> >> > > >> > >>> On 7/21/23 6:45 AM, Colt McNealy wrote: >> > > >> > >>>> Sophie— >> > > >> > >>>> >> > > >> > >>>> Thanks for chiming in here. +1 to the idea of specifying the >> > > >> ordering >> > > >> > >>>> guarantees that we make in the StorageTypeSpec javadocs. >> > > >> > >>>> >> > > >> > >>>> Quick question then. Is the javadoc that says: >> > > >> > >>>> >> > > >> > >>>>> Order is not guaranteed as bytes lexicographical ordering >> > might >> > > >> not >> > > >> > >>>> represent key order. >> > > >> > >>>> >> > > >> > >>>> no longer correct, and should say: >> > > >> > >>>> >> > > >> > >>>>> Order guarantees depend on the underlying implementation of >> > the >> > > >> > >>>> ReadOnlyKeyValueStore. For more information, please consult >> the >> > > >> > >>>> [StorageTypeSpec javadocs](....) >> > > >> > >>>> >> > > >> > >>>> Thanks, >> > > >> > >>>> Colt McNealy >> > > >> > >>>> >> > > >> > >>>> *Founder, LittleHorse.dev* >> > > >> > >>>> >> > > >> > >>>> >> > > >> > >>>> On Thu, Jul 20, 2023 at 9:28 PM Sophie Blee-Goldman < >> > > >> > >>> ableegold...@gmail.com> >> > > >> > >>>> wrote: >> > > >> > >>>> >> > > >> > >>>>> Hey Almog, first off, thanks for the KIP! I (and others) >> > raised >> > > >> > >> concerns >> > > >> > >>>>> over how restrictive the default.dsl.store config would be >> if >> > > not >> > > >> > >>>>> extendable to custom store types, especially given that >> this >> > > >> seems to >> > > >> > >> be >> > > >> > >>>>> the primary userbase of such a feature. At the time we >> didn't >> > > >> really >> > > >> > >>> have >> > > >> > >>>>> any better ideas for a clean way to achieve that, but what >> you >> > > >> > >> proposed >> > > >> > >>>>> makes a lot of sense to me. Happy to see a good solution to >> > > this, >> > > >> and >> > > >> > >>>>> hopefully others will share my satisfaction :P >> > > >> > >>>>> >> > > >> > >>>>> I did have one quick piece of feedback which arose from an >> > > >> unrelated >> > > >> > >>>>> question posed to the dev mailing list w/ subject line >> > > >> > >>>>> "ReadOnlyKeyValueStore#range() >> > > >> > >>>>> Semantics" >> > > >> > >>>>> < >> > > https://lists.apache.org/thread/jbckmth8d3mcgg0rd670cpvsgwzqlwrm>. >> > > >> I >> > > >> > >>>>> recommend checking out the full thread for context, but it >> > made >> > > me >> > > >> > >> think >> > > >> > >>>>> about how we can leverage the new StoreTypeSpec concept as >> an >> > > >> answer >> > > >> > >> to >> > > >> > >>> the >> > > >> > >>>>> long-standing question in Streams: where can we put >> guarantees >> > > of >> > > >> the >> > > >> > >>>>> public contract for RocksDB (or other store >> implementations) >> > > when >> > > >> all >> > > >> > >>> the >> > > >> > >>>>> RocksDB stuff is technically internal. >> > > >> > >>>>> >> > > >> > >>>>> Basically, I'm suggesting two things: first, call out in >> some >> > > way >> > > >> > >>> (perhaps >> > > >> > >>>>> the StoreTypeSpec javadocs) that each StoreTypeSpec is >> > > considered >> > > >> a >> > > >> > >>> public >> > > >> > >>>>> contract in itself and should outline any semantic >> guarantees >> > it >> > > >> does, >> > > >> > >>> or >> > > >> > >>>>> does not, make. Second, we should add a note on ordering >> > > >> guarantees in >> > > >> > >>> the >> > > >> > >>>>> two OOTB specs: for RocksDB we assert that range queries >> will >> > > >> honor >> > > >> > >>>>> serialized byte ordering, whereas the InMemory flavor >> gives no >> > > >> > >> ordering >> > > >> > >>>>> guarantee whatsoever at this time. >> > > >> > >>>>> >> > > >> > >>>>> Thoughts? >> > > >> > >>>>> >> > > >> > >>>>> -Sophie >> > > >> > >>>>> >> > > >> > >>>>> On Thu, Jul 20, 2023 at 4:28 PM Almog Gavra < >> > > >> almog.ga...@gmail.com> >> > > >> > >>> wrote: >> > > >> > >>>>> >> > > >> > >>>>>> Hi All, >> > > >> > >>>>>> >> > > >> > >>>>>> I would like to propose a KIP to expand support for >> default >> > > store >> > > >> > >> types >> > > >> > >>>>>> (KIP-591) to encompass custom store implementations: >> > > >> > >>>>>> >> > > >> > >>>>>> >> > > >> > >>>>> >> > > >> > >>> >> > > >> > >> >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-954%3A+expand+default+DSL+store+configuration+to+custom+types >> > > >> > >>>>>> >> > > >> > >>>>>> Looking forward to your feedback! >> > > >> > >>>>>> >> > > >> > >>>>>> Cheers, >> > > >> > >>>>>> Almog >> > > >> > >>>>>> >> > > >> > >>>>> >> > > >> > >>>> >> > > >> > >>> >> > > >> > >> >> > > >> > > >> > > >> >> > > > >> > > >> > >> >