Thanks everyone, with that I'll go ahead and close this KIP with 4 binding votes (Sophie, Guozhang, John, Matthias). Thanks for the lively discussion!
Leah On Wed, Jan 20, 2021 at 6:00 PM Matthias J. Sax <mj...@apache.org> wrote: > Thanks for reviving this KIP, Leah. > > I agree that we should not extend the scope of this KIP to potentially > deprecate/rename the `default.windowed.[key|value].serde.inner` configs. > > @Sophie: if you feel strong about it, let's do a separate KIP. > > > +1 (binding) > > > -Matthias > > On 1/19/21 3:00 PM, John Roesler wrote: > > Hi all, > > > > I've just caught up on the thread, and FWIW, I'm still +1. > > > > Thanks, > > -John > > > > On Mon, 2021-01-18 at 21:53 -0800, Guozhang Wang wrote: > >> Read the above updates and the KIP's scope. Makes sense to me. +1 still > >> counts :) > >> > >> On Wed, Jan 13, 2021 at 2:04 PM Sophie Blee-Goldman < > sop...@confluent.io> > >> wrote: > >> > >>> That sounds good to me. Thanks for reviving this > >>> > >>> Sophie > >>> > >>> On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <ltho...@confluent.io> > wrote: > >>> > >>>> Hey all, > >>>> > >>>> Bringing this back up for discussion. > >>>> > >>>> It seems like the next steps are to: > >>>> 1. rename the config "window.size.ms" > >>>> 2. ensure that users set window size EITHER through the config OR > through > >>>> the constructor. On this note, it may make sense to remove the default > >>> for > >>>> the `window.size.ms` config, so that there won't be a fall back if > the > >>>> window size isn't set in either spot. WDYT? This could also address > the > >>>> issue of multiple window sizes within a streams app. > >>>> > >>>> I see what Sophie is saying about the > `default.windowed.key.serde.inner` > >>>> config, but I do think deprecating and moving those configs would > >>> require a > >>>> larger discussion. I'm open to looping them into this KIP if we feel > like > >>>> it's vital (or incredibly convenient with low cost to users), but my > >>>> initial reaction is to leave that out for now and work within the > current > >>>> set-up for window size. > >>>> > >>>> Thanks for all the comments so far, > >>>> Leah > >>>> > >>>> On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman < > >>> sop...@confluent.io> > >>>> wrote: > >>>> > >>>>> There are two cases where you need to specify the window size -- > >>> directly > >>>>> using a > >>>>> Consumer (eg the console consumer) or reading as an input topic > within > >>>>> Streams. > >>>>> We need a config for the first case, since you can't pass a > >>> Deserializer > >>>>> object to the > >>>>> console consumer. In the Streams case, the reverse is true, and you > >>> have > >>>> to > >>>>> pass in > >>>>> an actual Serde object. > >>>>> > >>>>> Imo we should keep these two cases separate and not use the config > for > >>>> the > >>>>> Streams > >>>>> case at all. But that's hard to enforce (we'd have to strip the > config > >>>> out > >>>>> of the user's > >>>>> StreamsConfig if they tried to use it within Streams, for example) > and > >>> it > >>>>> also puts us > >>>>> in an awkward position due to the default.windowed.inner.serde.class > >>>>> configs. If > >>>>> they can specify the inner serde class through their Streams app > >>> config, > >>>>> they > >>>>> should be able to specify the window size through config as well. > >>>> Otherwise > >>>>> we > >>>>> either force a mix-and-match as Matthias described, or you just > always > >>>> have > >>>>> to > >>>>> specify both the inner class and the window size in the constructor, > at > >>>>> which > >>>>> point, why even have the default.windowed.inner.serde.class config at > >>>> all? > >>>>> > >>>>> ... > >>>>> that's not a rhetorical question, actually. Personally I do think we > >>>> should > >>>>> deprecate the default.windowed.serde.inner.class and replace it with > >>>>> separate > >>>>> windowed.serializer.inner.class/windowed.deserializer.inner.class > >>>> configs. > >>>>> This > >>>>> way it's immediately obvious that the configs are only for the > >>>>> Consumer/Producer, > >>>>> and you should construct your own TimeWindowedSerde with all the > >>>> necessary > >>>>> parameters for use in your Streams app. > >>>>> > >>>>> That might be too radical, and maybe the problem isn't worth the > burden > >>>> of > >>>>> forcing users to change their code and replace the config with actual > >>>> Serde > >>>>> objects. But it should be an easy change to make, and if it isn't, > >>> that's > >>>>> probably a good sign that you're using the serde incorrectly > somewhere. > >>>>> > >>>>> If we don't deprecate the default.windowed.serde.inner.class, then > it's > >>>>> less clear > >>>>> to me how to proceed. The only really consistent thing to do seems to > >>> be > >>>> to > >>>>> name and position the new window size config as a default config and > >>>> allow > >>>>> it to be used similar to the default inner class configs. Which, as > >>>>> established > >>>>> throughout this discussion, seems very very wrong > >>>>> > >>>>> So yes, I think we should just stick with the original name > >>>> window.size.ms > >>>>> . > >>>>> Or > >>>>> better yet, call it windowed.deserializer.window.size.ms and name > the > >>>>> default.windowed.serde.inner.class replacements > >>>>> windowed.deserializer.inner.class and windowed.serializer.inner.class > >>>>> > >>>>> On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> > >>> wrote: > >>>>> > >>>>>> From my understanding, the KIP aims for the case when a user does > not > >>>>>> control the code, eg, when using the command line consumer (or > >>> similar > >>>>>> tools). > >>>>>> > >>>>>> If the user writes code, we should always encourage her to > >>> instantiate > >>>>>> the deserializer explicitly and not relying on reflection+config. > >>>>>> > >>>>>> I also don't think that the `default` prefix does make sense, as it > >>>>>> indicates that there might be a non-default. However, IMHO, we > should > >>>>>> not allow "overwrite semantics" but rather throw an exception if the > >>>>>> config is set and a window size is provided via the constructor. We > >>>>>> should not allow to mix-and-match both and should stick to a strict > >>>>>> either-or pattern. > >>>>>> > >>>>>> > >>>>>> -Matthias > >>>>>> > >>>>>> On 9/8/20 11:52 AM, Guozhang Wang wrote: > >>>>>>> Hi Sophie, > >>>>>>> > >>>>>>> Seems I do have some mis-understanding of the KIP's motivation here > >>>> :) > >>>>>> Just > >>>>>>> for clarification my reasoning is that: > >>>>>>> > >>>>>>> 1) today Streams itself never uses a windowed deserializer itself > >>>> since > >>>>>> its > >>>>>>> built-in operators only need the serializer and users do not need > >>> to > >>>>>>> override it, plus standby / restore active tasks would just copy > >>> the > >>>>>> bytes > >>>>>>> directly. So this KIP's motivation is not for Stream's own code > >>>>> anyways. > >>>>>>> > >>>>>>> 2) It is only when user specified serde is missing the window size, > >>>>> which > >>>>>>> is either when a) one is trying to read a source topic as windowed > >>>>>> records > >>>>>>> in Streams, this is a big blocker for KIP-300, and when b) one is > >>>>> trying > >>>>>> to > >>>>>>> read a topic as windowed records in Consumer, we would have issues > >>> if > >>>>>> users > >>>>>>> fail to use the appropriate serde constructs. > >>>>>>> > >>>>>>> I thought the main motivation of this KIP is for 2.a), in which we > >>>>> would > >>>>>>> just encourage the users to use the right constructor with the > >>> window > >>>>>> size > >>>>>>> by deprecating the other constructs. But I'm not sure how this > >>> would > >>>>> help > >>>>>>> with 2.b) since the proposal is on adding to StreamsConfig. If it > >>> is > >>>>> the > >>>>>>> case, then I agree that probably we can just not add an extra > >>> config > >>>>> but > >>>>>>> just deprecating the constructs. > >>>>>>> > >>>>>>> > >>>>>>> Guozhang > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> > >>>>>>> On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman < > >>>>> sop...@confluent.io > >>>>>>> > >>>>>>> wrote: > >>>>>>> > >>>>>>>> Hey Guozhang & Leah, > >>>>>>>> > >>>>>>>> I want to push back a bit on the assumption that we would fall > >>> back > >>>> on > >>>>>> this > >>>>>>>> config > >>>>>>>> in the case of an unspecified window size in a Streams serde. I > >>>> don't > >>>>>> think > >>>>>>>> it should > >>>>>>>> be a default at all, either in name or in effect. To borrow the > >>>>>> rhetorical > >>>>>>>> question that > >>>>>>>> John raised earlier: what is the default window size of an > >>>>> application? > >>>>>>>> > >>>>>>>> Personally, I agree that that doesn't make much sense. Sure, if > >>> you > >>>>> only > >>>>>>>> have a single > >>>>>>>> windowed operation in your app then you could just specify the > >>>> window > >>>>>> size > >>>>>>>> by config, > >>>>>>>> but how is that any more ergonomic than specifying the window size > >>>> in > >>>>>> the > >>>>>>>> Serde's > >>>>>>>> constructor? If anything, it seems worse to put physical and > >>> mental > >>>>>>>> distance between > >>>>>>>> the specification and the actual usage of such parameters. What if > >>>> you > >>>>>> add > >>>>>>>> another > >>>>>>>> windowed operation later, with a different size, and forget to > >>>> specify > >>>>>> the > >>>>>>>> new size in > >>>>>>>> the new Serde? Or what if you never specify a default window size > >>>>>> config at > >>>>>>>> all and > >>>>>>>> accidentally end up using the default config value of > >>>> Long.MAX_VALUE? > >>>>>>>> Avoiding this > >>>>>>>> possibility was/is one of the main motivations of this KIP, and > >>> the > >>>>>> whole > >>>>>>>> point of > >>>>>>>> deprecating the TimeWindowedSerde(innerClass) constructor. > >>>>>>>> > >>>>>>>> I actually would have advocated to remove this config entirely, > >>> but > >>>> as > >>>>>> John > >>>>>>>> pointed > >>>>>>>> out, we still need it to configure things like the console > >>> consumer > >>>>>>>> > >>>>>>>> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <ltho...@confluent.io > >>>> > >>>>>> wrote: > >>>>>>>> > >>>>>>>>> Hi Guozhang, > >>>>>>>>> > >>>>>>>>> Yes, the config would read them as a single window size. I think > >>>> this > >>>>>>>>> relates to John's comments about having variably sized windows, > >>>> which > >>>>>>>> this > >>>>>>>>> config doesn't handle. I like the name change and updated the > >>> wiki > >>>> to > >>>>>>>>> reflect that, and to clarify that the default value will still be > >>>>>>>>> Long.MAX_VALUE. > >>>>>>>>> > >>>>>>>>> Thanks for your feedback! > >>>>>>>>> Leah > >>>>>>>>> > >>>>>>>>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang < > >>> wangg...@gmail.com> > >>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> Hello Leah, > >>>>>>>>>> > >>>>>>>>>> Thanks for initiating this. I just have one minor clarification > >>>>>>>> question > >>>>>>>>>> here: the config "window.size.ms" seems to be used as the > >>> default > >>>>>>>> window > >>>>>>>>>> size when reading from a topic that represents windowed records > >>>>> right? > >>>>>>>>> I.e. > >>>>>>>>>> if there are multiple topics that represent windowed records but > >>>>> their > >>>>>>>>>> window sizes are different, with this config we can only read > >>> them > >>>>>>>> with a > >>>>>>>>>> single window size? If yes, could we rename the config as " > >>>>>>>>>> default.window.size.ms" and make that clear in the description > >>> as > >>>>>>>> well? > >>>>>>>>>> Also we'd better also include its default value which I think > >>>> would > >>>>>>>> still > >>>>>>>>>> be MAX_VALUE for compatibility. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Guozhang > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas < > >>> ltho...@confluent.io> > >>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hey all, > >>>>>>>>>>> > >>>>>>>>>>> We should be good to wrap up voting now that the discussion has > >>>>> been > >>>>>>>>>>> resolved. > >>>>>>>>>>> > >>>>>>>>>>> Cheers, > >>>>>>>>>>> Leah > >>>>>>>>>>> > >>>>>>>>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax < > >>> mj...@apache.org > >>>>> > >>>>>>>>> wrote: > >>>>>>>>>>> > >>>>>>>>>>>> +1 (binding) > >>>>>>>>>>>> > >>>>>>>>>>>> On 8/26/20 8:02 AM, John Roesler wrote: > >>>>>>>>>>>>> Hi all, > >>>>>>>>>>>>> > >>>>>>>>>>>>> I've just sent a new message to the DISCUSS thread. We > >>>>>>>>>>>>> forgot to include the Scala API in the proposal. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> -John > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> Thanks for the KIP! +1 (non-binding) > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Sophie > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler < > >>>>>>>> vvcep...@apache.org > >>>>>>>>>> > >>>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks Leah, > >>>>>>>>>>>>>>> I’m +1 (binding) > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> -John > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote: > >>>>>>>>>>>>>>>> Hi everyone, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve > >>>>>>>>>>>>>>>> TimeWindowedDeserializer > >>>>>>>>>>>>>>>> and TimeWindowedSerde to handle window size. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>> > >>>>> > >>>> > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size > >>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>> Leah > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> -- Guozhang > >>>>>>>>>> > >>>>>>>>> > >>>>>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>> > >>>> > >>> > >> > >> > > > > >