That sounds good to me. Thanks for reviving this Sophie
On Wed, Jan 13, 2021 at 7:47 AM Leah Thomas <ltho...@confluent.io> wrote: > Hey all, > > Bringing this back up for discussion. > > It seems like the next steps are to: > 1. rename the config "window.size.ms" > 2. ensure that users set window size EITHER through the config OR through > the constructor. On this note, it may make sense to remove the default for > the `window.size.ms` config, so that there won't be a fall back if the > window size isn't set in either spot. WDYT? This could also address the > issue of multiple window sizes within a streams app. > > I see what Sophie is saying about the `default.windowed.key.serde.inner` > config, but I do think deprecating and moving those configs would require a > larger discussion. I'm open to looping them into this KIP if we feel like > it's vital (or incredibly convenient with low cost to users), but my > initial reaction is to leave that out for now and work within the current > set-up for window size. > > Thanks for all the comments so far, > Leah > > On Tue, Sep 29, 2020 at 10:44 PM Sophie Blee-Goldman <sop...@confluent.io> > wrote: > > > There are two cases where you need to specify the window size -- directly > > using a > > Consumer (eg the console consumer) or reading as an input topic within > > Streams. > > We need a config for the first case, since you can't pass a Deserializer > > object to the > > console consumer. In the Streams case, the reverse is true, and you have > to > > pass in > > an actual Serde object. > > > > Imo we should keep these two cases separate and not use the config for > the > > Streams > > case at all. But that's hard to enforce (we'd have to strip the config > out > > of the user's > > StreamsConfig if they tried to use it within Streams, for example) and it > > also puts us > > in an awkward position due to the default.windowed.inner.serde.class > > configs. If > > they can specify the inner serde class through their Streams app config, > > they > > should be able to specify the window size through config as well. > Otherwise > > we > > either force a mix-and-match as Matthias described, or you just always > have > > to > > specify both the inner class and the window size in the constructor, at > > which > > point, why even have the default.windowed.inner.serde.class config at > all? > > > > ... > > that's not a rhetorical question, actually. Personally I do think we > should > > deprecate the default.windowed.serde.inner.class and replace it with > > separate > > windowed.serializer.inner.class/windowed.deserializer.inner.class > configs. > > This > > way it's immediately obvious that the configs are only for the > > Consumer/Producer, > > and you should construct your own TimeWindowedSerde with all the > necessary > > parameters for use in your Streams app. > > > > That might be too radical, and maybe the problem isn't worth the burden > of > > forcing users to change their code and replace the config with actual > Serde > > objects. But it should be an easy change to make, and if it isn't, that's > > probably a good sign that you're using the serde incorrectly somewhere. > > > > If we don't deprecate the default.windowed.serde.inner.class, then it's > > less clear > > to me how to proceed. The only really consistent thing to do seems to be > to > > name and position the new window size config as a default config and > allow > > it to be used similar to the default inner class configs. Which, as > > established > > throughout this discussion, seems very very wrong > > > > So yes, I think we should just stick with the original name > window.size.ms > > . > > Or > > better yet, call it windowed.deserializer.window.size.ms and name the > > default.windowed.serde.inner.class replacements > > windowed.deserializer.inner.class and windowed.serializer.inner.class > > > > On Tue, Sep 8, 2020 at 2:07 PM Matthias J. Sax <mj...@apache.org> wrote: > > > > > From my understanding, the KIP aims for the case when a user does not > > > control the code, eg, when using the command line consumer (or similar > > > tools). > > > > > > If the user writes code, we should always encourage her to instantiate > > > the deserializer explicitly and not relying on reflection+config. > > > > > > I also don't think that the `default` prefix does make sense, as it > > > indicates that there might be a non-default. However, IMHO, we should > > > not allow "overwrite semantics" but rather throw an exception if the > > > config is set and a window size is provided via the constructor. We > > > should not allow to mix-and-match both and should stick to a strict > > > either-or pattern. > > > > > > > > > -Matthias > > > > > > On 9/8/20 11:52 AM, Guozhang Wang wrote: > > > > Hi Sophie, > > > > > > > > Seems I do have some mis-understanding of the KIP's motivation here > :) > > > Just > > > > for clarification my reasoning is that: > > > > > > > > 1) today Streams itself never uses a windowed deserializer itself > since > > > its > > > > built-in operators only need the serializer and users do not need to > > > > override it, plus standby / restore active tasks would just copy the > > > bytes > > > > directly. So this KIP's motivation is not for Stream's own code > > anyways. > > > > > > > > 2) It is only when user specified serde is missing the window size, > > which > > > > is either when a) one is trying to read a source topic as windowed > > > records > > > > in Streams, this is a big blocker for KIP-300, and when b) one is > > trying > > > to > > > > read a topic as windowed records in Consumer, we would have issues if > > > users > > > > fail to use the appropriate serde constructs. > > > > > > > > I thought the main motivation of this KIP is for 2.a), in which we > > would > > > > just encourage the users to use the right constructor with the window > > > size > > > > by deprecating the other constructs. But I'm not sure how this would > > help > > > > with 2.b) since the proposal is on adding to StreamsConfig. If it is > > the > > > > case, then I agree that probably we can just not add an extra config > > but > > > > just deprecating the constructs. > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Sep 8, 2020 at 10:50 AM Sophie Blee-Goldman < > > sop...@confluent.io > > > > > > > > wrote: > > > > > > > >> Hey Guozhang & Leah, > > > >> > > > >> I want to push back a bit on the assumption that we would fall back > on > > > this > > > >> config > > > >> in the case of an unspecified window size in a Streams serde. I > don't > > > think > > > >> it should > > > >> be a default at all, either in name or in effect. To borrow the > > > rhetorical > > > >> question that > > > >> John raised earlier: what is the default window size of an > > application? > > > >> > > > >> Personally, I agree that that doesn't make much sense. Sure, if you > > only > > > >> have a single > > > >> windowed operation in your app then you could just specify the > window > > > size > > > >> by config, > > > >> but how is that any more ergonomic than specifying the window size > in > > > the > > > >> Serde's > > > >> constructor? If anything, it seems worse to put physical and mental > > > >> distance between > > > >> the specification and the actual usage of such parameters. What if > you > > > add > > > >> another > > > >> windowed operation later, with a different size, and forget to > specify > > > the > > > >> new size in > > > >> the new Serde? Or what if you never specify a default window size > > > config at > > > >> all and > > > >> accidentally end up using the default config value of > Long.MAX_VALUE? > > > >> Avoiding this > > > >> possibility was/is one of the main motivations of this KIP, and the > > > whole > > > >> point of > > > >> deprecating the TimeWindowedSerde(innerClass) constructor. > > > >> > > > >> I actually would have advocated to remove this config entirely, but > as > > > John > > > >> pointed > > > >> out, we still need it to configure things like the console consumer > > > >> > > > >> On Tue, Sep 8, 2020 at 10:40 AM Leah Thomas <ltho...@confluent.io> > > > wrote: > > > >> > > > >>> Hi Guozhang, > > > >>> > > > >>> Yes, the config would read them as a single window size. I think > this > > > >>> relates to John's comments about having variably sized windows, > which > > > >> this > > > >>> config doesn't handle. I like the name change and updated the wiki > to > > > >>> reflect that, and to clarify that the default value will still be > > > >>> Long.MAX_VALUE. > > > >>> > > > >>> Thanks for your feedback! > > > >>> Leah > > > >>> > > > >>> On Tue, Sep 8, 2020 at 11:54 AM Guozhang Wang <wangg...@gmail.com> > > > >> wrote: > > > >>> > > > >>>> Hello Leah, > > > >>>> > > > >>>> Thanks for initiating this. I just have one minor clarification > > > >> question > > > >>>> here: the config "window.size.ms" seems to be used as the default > > > >> window > > > >>>> size when reading from a topic that represents windowed records > > right? > > > >>> I.e. > > > >>>> if there are multiple topics that represent windowed records but > > their > > > >>>> window sizes are different, with this config we can only read them > > > >> with a > > > >>>> single window size? If yes, could we rename the config as " > > > >>>> default.window.size.ms" and make that clear in the description as > > > >> well? > > > >>>> Also we'd better also include its default value which I think > would > > > >> still > > > >>>> be MAX_VALUE for compatibility. > > > >>>> > > > >>>> > > > >>>> Guozhang > > > >>>> > > > >>>> > > > >>>> On Tue, Sep 8, 2020 at 9:38 AM Leah Thomas <ltho...@confluent.io> > > > >> wrote: > > > >>>> > > > >>>>> Hey all, > > > >>>>> > > > >>>>> We should be good to wrap up voting now that the discussion has > > been > > > >>>>> resolved. > > > >>>>> > > > >>>>> Cheers, > > > >>>>> Leah > > > >>>>> > > > >>>>> On Wed, Sep 2, 2020 at 7:23 PM Matthias J. Sax <mj...@apache.org > > > > > >>> wrote: > > > >>>>> > > > >>>>>> +1 (binding) > > > >>>>>> > > > >>>>>> On 8/26/20 8:02 AM, John Roesler wrote: > > > >>>>>>> Hi all, > > > >>>>>>> > > > >>>>>>> I've just sent a new message to the DISCUSS thread. We > > > >>>>>>> forgot to include the Scala API in the proposal. > > > >>>>>>> > > > >>>>>>> Thanks, > > > >>>>>>> -John > > > >>>>>>> > > > >>>>>>> On Mon, 2020-08-24 at 18:00 -0700, Sophie Blee-Goldman > > > >>>>>>> wrote: > > > >>>>>>>> Thanks for the KIP! +1 (non-binding) > > > >>>>>>>> > > > >>>>>>>> Sophie > > > >>>>>>>> > > > >>>>>>>> On Mon, Aug 24, 2020 at 5:06 PM John Roesler < > > > >> vvcep...@apache.org > > > >>>> > > > >>>>>> wrote: > > > >>>>>>>> > > > >>>>>>>>> Thanks Leah, > > > >>>>>>>>> I’m +1 (binding) > > > >>>>>>>>> > > > >>>>>>>>> -John > > > >>>>>>>>> > > > >>>>>>>>> On Mon, Aug 24, 2020, at 16:54, Leah Thomas wrote: > > > >>>>>>>>>> Hi everyone, > > > >>>>>>>>>> > > > >>>>>>>>>> I'd like to kick-off the vote for KIP-659: Improve > > > >>>>>>>>>> TimeWindowedDeserializer > > > >>>>>>>>>> and TimeWindowedSerde to handle window size. > > > >>>>>>>>>> > > > >>>>>>>>> > > > >>>>>> > > > >>>>> > > > >>>> > > > >>> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-659%3A+Improve+TimeWindowedDeserializer+and+TimeWindowedSerde+to+handle+window+size > > > >>>>>>>>>> Thanks, > > > >>>>>>>>>> Leah > > > >>>>>>>>>> > > > >>>>>>> > > > >>>>>> > > > >>>>>> > > > >>>>> > > > >>>> > > > >>>> > > > >>>> -- > > > >>>> -- Guozhang > > > >>>> > > > >>> > > > >> > > > > > > > > > > > > > > > > >