Thanks Matthias and Guozhang, About deprecating the "segments" field instead of making it private. Yes, I just took another look at the code, and that is correct. I'll update the KIP.
I do agree that in the long run, it makes more sense as a parameter to the store somehow than as a parameter to the window. I think this isn't a super high priority, though, because it's not exposed in the DSL (or it wasn't intended to be). I felt Guozhang's point is valid, and that we should probably revisit it later, possibly in the scope of https://issues.apache.org/jira/browse/KAFKA-4730 I'll wait an hour or so for more feedback before moving on to a vote. Thanks again, -John On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> wrote: > Re `segmentInterval` parameter in Windows: currently it is used in two > places, the windowed stream aggregation, and the stream-stream joins. For > the former, we can potentially move the parameter from windowedBy() to > Materialized, but for the latter we currently do not expose a Materialized > object yet, only the Windows spec. So I think in this KIP we probably > cannot move it immediately. > > But in future KIPs if we decide to expose the stream-stream join's store / > changelog / repartition topic names, we may well adding the Materialized > object into the operator, and we can then move the parameter to > Materialized by then. > > > Guozhang > > On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <matth...@confluent.io> > wrote: > > > Thanks for the KIP. Overall, I think it makes sense to clean up the API. > > > > Couple of comments: > > > > > Sadly there's no way to "deprecate" this > > > exposure > > > > I disagree. We can just mark the variable as deprecated and I advocate > > to do this. When the deprecation period passed, we can make it private > > (or actually remove it; cf. my next comment). > > > > > > Parameter, `segmentInterval` is semantically not a "window" > > specification parameter but an implementation detail and thus a store > > parameter. Would it be better to add it to `Materialized`? > > > > > > -Matthias > > > > On 6/22/18 5:13 PM, Guozhang Wang wrote: > > > Thanks John. > > > > > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io> > wrote: > > > > > >> Thanks for the feedback, Bill and Guozhang, > > >> > > >> I've updated the KIP accordingly. > > >> > > >> Thanks, > > >> -John > > >> > > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com> > > wrote: > > >> > > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the > > >> wiki: > > >>> the `In Windows, we will:` section code snippet is empty. > > >>> > > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com> > > wrote: > > >>> > > >>>> Hi John, > > >>>> > > >>>> Thanks for the KIP, and overall it's a +1 for me. > > >>>> > > >>>> In the JavaDoc for the segmentInterval method, there's no mention of > > >> the > > >>>> number of segments there can be at any one time. While it's implied > > >> that > > >>>> the number of segments is potentially unbounded, would be better to > > >>>> explicitly state that the previous limit on the number of segments > is > > >>> going > > >>>> to be removed as well? > > >>>> > > >>>> I have a couple of nit comments. The method name is still > > segmentSize > > >>> in > > >>>> the code block vs segmentInterval and the order of the parameters > for > > >> the > > >>>> third persistentWindowStore don't match the order in the JavaDoc. > > >>>> > > >>>> Thanks, > > >>>> Bill > > >>>> > > >>>> > > >>>> > > >>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io> > > >> wrote: > > >>>> > > >>>>> I've updated the KIP and draft PR accordingly. > > >>>>> > > >>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io> > > >>> wrote: > > >>>>> > > >>>>>> Interesting... I did not initially consider it because I didn't > > >> want > > >>> to > > >>>>>> have an impact on anyone's Streams apps, but now I see that unless > > >>>>>> developers have subclassed `Windows`, the number of segments would > > >>>> always > > >>>>>> be 3! > > >>>>>> > > >>>>>> There's one caveat to this, which I think was a mistake. The field > > >>>>>> `segments` in Windows is public, which means that anyone can > > >> actually > > >>>> set > > >>>>>> it directly on any Window instance like: > > >>>>>> > > >>>>>> TimeWindows tw = TimeWindows.of(100); > > >>>>>> tw.segments = 12345; > > >>>>>> > > >>>>>> Bypassing the bounds check and contradicting the javadoc in > Windows > > >>>> that > > >>>>>> says users can't directly set it. Sadly there's no way to > > >> "deprecate" > > >>>>> this > > >>>>>> exposure, so I propose just to make it private. > > >>>>>> > > >>>>>> With this new knowledge, I agree, I think we can switch to > > >>>>>> "segmentInterval" throughout the interface. > > >>>>>> > > >>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wangg...@gmail.com > > > > >>>>> wrote: > > >>>>>> > > >>>>>>> Hello John, > > >>>>>>> > > >>>>>>> Thanks for the KIP. > > >>>>>>> > > >>>>>>> Should we consider making the change on > > >>> `Stores#persistentWindowStore` > > >>>>>>> parameters as well? > > >>>>>>> > > >>>>>>> > > >>>>>>> Guozhang > > >>>>>>> > > >>>>>>> > > >>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <j...@confluent.io > > > > >>>>> wrote: > > >>>>>>> > > >>>>>>>> Hi Ted, > > >>>>>>>> > > >>>>>>>> Ah, when you made that comment to me before, I thought you meant > > >>> as > > >>>>>>> opposed > > >>>>>>>> to "segments". Now it makes sense that you meant as opposed to > > >>>>>>>> "segmentSize". > > >>>>>>>> > > >>>>>>>> I named it that way to match the peer method "windowSize", which > > >>> is > > >>>>>>> also a > > >>>>>>>> quantity of milliseconds. > > >>>>>>>> > > >>>>>>>> I agree that "interval" is more intuitive, but I think I favor > > >>>>>>> consistency > > >>>>>>>> in this case. Does that seem reasonable? > > >>>>>>>> > > >>>>>>>> Thanks, > > >>>>>>>> -John > > >>>>>>>> > > >>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com> > > >>> wrote: > > >>>>>>>> > > >>>>>>>>> Normally size is not measured in time unit, such as > > >>> milliseconds. > > >>>>>>>>> How about naming the new method segmentInterval ? > > >>>>>>>>> Thanks > > >>>>>>>>> -------- Original message --------From: John Roesler < > > >>>>>>> j...@confluent.io> > > >>>>>>>>> Date: 6/20/18 10:45 AM (GMT-08:00) To: dev@kafka.apache.org > > >>>>>>> Subject: > > >>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in > > >>>>>>>>> WindowBytesStoreSupplier > > >>>>>>>>> Hello All, > > >>>>>>>>> > > >>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in > > >>>>>>> KAFKA-7080. > > >>>>>>>>> Specifically, we're creating CachingWindowStore with the > > >> *number > > >>>> of > > >>>>>>>>> segments* instead of the *segment size*. > > >>>>>>>>> > > >>>>>>>>> Here's the jira: > > >>> https://issues.apache.org/jira/browse/KAFKA-7080 > > >>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ > > >>>>>>>>> > > >>>>>>>>> additionally, here's a draft PR for clarity: > > >>>>>>>>> https://github.com/apache/kafka/pull/5257 > > >>>>>>>>> > > >>>>>>>>> Please let me know what you think! > > >>>>>>>>> > > >>>>>>>>> Thanks, > > >>>>>>>>> -John > > >>>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> > > >>>>>>> -- > > >>>>>>> -- Guozhang > > >>>>>>> > > >>>>>> > > >>>>> > > >>>> > > >>> > > >>> > > >>> > > >>> -- > > >>> -- Guozhang > > >>> > > >> > > > > > > > > > > > > > > > > -- > -- Guozhang >