Ah, it turns out I did create a ticket: it's KAFKA-7080: https://issues.apache.org/jira/browse/KAFKA-7080
-John On Mon, Jun 25, 2018 at 4:44 PM John Roesler <j...@confluent.io> wrote: > Matthias, > > That's a good idea. I'm not sure why I didn't... > > Thanks, > -John > > On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <matth...@confluent.io> > wrote: > >> Ok. >> >> @John: can you create a JIRA to track this? I think KAFKA-4730 is >> related, but actually an own ticket (that is blocked by not having >> Materialized for stream-stream joins). >> >> >> -Matthias >> >> On 6/25/18 2:10 PM, Bill Bejeck wrote: >> > I agree that it makes sense to have segmentInterval as a parameter to a >> > store, but I also agree with Guozhang's point about not moving as part >> of >> > this KIP. >> > >> > Thanks, >> > Bill >> > >> > On Mon, Jun 25, 2018 at 4:17 PM John Roesler <j...@confluent.io> wrote: >> > >> >> Thanks Matthias and Guozhang, >> >> >> >> About deprecating the "segments" field instead of making it private. >> Yes, I >> >> just took another look at the code, and that is correct. I'll update >> the >> >> KIP. >> >> >> >> I do agree that in the long run, it makes more sense as a parameter to >> the >> >> store somehow than as a parameter to the window. I think this isn't a >> super >> >> high priority, though, because it's not exposed in the DSL (or it >> wasn't >> >> intended to be). >> >> >> >> I felt Guozhang's point is valid, and that we should probably revisit >> it >> >> later, possibly in the scope of >> >> https://issues.apache.org/jira/browse/KAFKA-4730 >> >> >> >> I'll wait an hour or so for more feedback before moving on to a vote. >> >> >> >> Thanks again, >> >> -John >> >> >> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> >> wrote: >> >> >> >>> Re `segmentInterval` parameter in Windows: currently it is used in two >> >>> places, the windowed stream aggregation, and the stream-stream joins. >> For >> >>> the former, we can potentially move the parameter from windowedBy() to >> >>> Materialized, but for the latter we currently do not expose a >> >> Materialized >> >>> object yet, only the Windows spec. So I think in this KIP we probably >> >>> cannot move it immediately. >> >>> >> >>> But in future KIPs if we decide to expose the stream-stream join's >> store >> >> / >> >>> changelog / repartition topic names, we may well adding the >> Materialized >> >>> object into the operator, and we can then move the parameter to >> >>> Materialized by then. >> >>> >> >>> >> >>> Guozhang >> >>> >> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax < >> matth...@confluent.io> >> >>> wrote: >> >>> >> >>>> Thanks for the KIP. Overall, I think it makes sense to clean up the >> >> API. >> >>>> >> >>>> Couple of comments: >> >>>> >> >>>>> Sadly there's no way to "deprecate" this >> >>>>> exposure >> >>>> >> >>>> I disagree. We can just mark the variable as deprecated and I >> advocate >> >>>> to do this. When the deprecation period passed, we can make it >> private >> >>>> (or actually remove it; cf. my next comment). >> >>>> >> >>>> >> >>>> Parameter, `segmentInterval` is semantically not a "window" >> >>>> specification parameter but an implementation detail and thus a store >> >>>> parameter. Would it be better to add it to `Materialized`? >> >>>> >> >>>> >> >>>> -Matthias >> >>>> >> >>>> On 6/22/18 5:13 PM, Guozhang Wang wrote: >> >>>>> Thanks John. >> >>>>> >> >>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io> >> >>> wrote: >> >>>>> >> >>>>>> Thanks for the feedback, Bill and Guozhang, >> >>>>>> >> >>>>>> I've updated the KIP accordingly. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> -John >> >>>>>> >> >>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com> >> >>>> wrote: >> >>>>>> >> >>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on >> >> the >> >>>>>> wiki: >> >>>>>>> the `In Windows, we will:` section code snippet is empty. >> >>>>>>> >> >>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com> >> >>>> wrote: >> >>>>>>> >> >>>>>>>> Hi John, >> >>>>>>>> >> >>>>>>>> Thanks for the KIP, and overall it's a +1 for me. >> >>>>>>>> >> >>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention >> >> of >> >>>>>> the >> >>>>>>>> number of segments there can be at any one time. While it's >> >> implied >> >>>>>> that >> >>>>>>>> the number of segments is potentially unbounded, would be better >> >> to >> >>>>>>>> explicitly state that the previous limit on the number of >> segments >> >>> is >> >>>>>>> going >> >>>>>>>> to be removed as well? >> >>>>>>>> >> >>>>>>>> I have a couple of nit comments. The method name is still >> >>>> segmentSize >> >>>>>>> in >> >>>>>>>> the code block vs segmentInterval and the order of the parameters >> >>> for >> >>>>>> the >> >>>>>>>> third persistentWindowStore don't match the order in the JavaDoc. >> >>>>>>>> >> >>>>>>>> Thanks, >> >>>>>>>> Bill >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> >> >>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io> >> >>>>>> wrote: >> >>>>>>>> >> >>>>>>>>> I've updated the KIP and draft PR accordingly. >> >>>>>>>>> >> >>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io >> > >> >>>>>>> wrote: >> >>>>>>>>> >> >>>>>>>>>> Interesting... I did not initially consider it because I didn't >> >>>>>> want >> >>>>>>> to >> >>>>>>>>>> have an impact on anyone's Streams apps, but now I see that >> >> unless >> >>>>>>>>>> developers have subclassed `Windows`, the number of segments >> >> would >> >>>>>>>> always >> >>>>>>>>>> be 3! >> >>>>>>>>>> >> >>>>>>>>>> There's one caveat to this, which I think was a mistake. The >> >> field >> >>>>>>>>>> `segments` in Windows is public, which means that anyone can >> >>>>>> actually >> >>>>>>>> set >> >>>>>>>>>> it directly on any Window instance like: >> >>>>>>>>>> >> >>>>>>>>>> TimeWindows tw = TimeWindows.of(100); >> >>>>>>>>>> tw.segments = 12345; >> >>>>>>>>>> >> >>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in >> >>> Windows >> >>>>>>>> that >> >>>>>>>>>> says users can't directly set it. Sadly there's no way to >> >>>>>> "deprecate" >> >>>>>>>>> this >> >>>>>>>>>> exposure, so I propose just to make it private. >> >>>>>>>>>> >> >>>>>>>>>> With this new knowledge, I agree, I think we can switch to >> >>>>>>>>>> "segmentInterval" throughout the interface. >> >>>>>>>>>> >> >>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang < >> >> wangg...@gmail.com >> >>>> >> >>>>>>>>> wrote: >> >>>>>>>>>> >> >>>>>>>>>>> Hello John, >> >>>>>>>>>>> >> >>>>>>>>>>> Thanks for the KIP. >> >>>>>>>>>>> >> >>>>>>>>>>> Should we consider making the change on >> >>>>>>> `Stores#persistentWindowStore` >> >>>>>>>>>>> parameters as well? >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> Guozhang >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler < >> >> j...@confluent.io >> >>>> >> >>>>>>>>> wrote: >> >>>>>>>>>>> >> >>>>>>>>>>>> Hi Ted, >> >>>>>>>>>>>> >> >>>>>>>>>>>> Ah, when you made that comment to me before, I thought you >> >> meant >> >>>>>>> as >> >>>>>>>>>>> opposed >> >>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed >> to >> >>>>>>>>>>>> "segmentSize". >> >>>>>>>>>>>> >> >>>>>>>>>>>> I named it that way to match the peer method "windowSize", >> >> which >> >>>>>>> is >> >>>>>>>>>>> also a >> >>>>>>>>>>>> quantity of milliseconds. >> >>>>>>>>>>>> >> >>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I >> favor >> >>>>>>>>>>> consistency >> >>>>>>>>>>>> in this case. Does that seem reasonable? >> >>>>>>>>>>>> >> >>>>>>>>>>>> Thanks, >> >>>>>>>>>>>> -John >> >>>>>>>>>>>> >> >>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com> >> >>>>>>> wrote: >> >>>>>>>>>>>> >> >>>>>>>>>>>>> Normally size is not measured in time unit, such as >> >>>>>>> milliseconds. >> >>>>>>>>>>>>> How about naming the new method segmentInterval ? >> >>>>>>>>>>>>> Thanks >> >>>>>>>>>>>>> -------- Original message --------From: John Roesler < >> >>>>>>>>>>> j...@confluent.io> >> >>>>>>>>>>>>> Date: 6/20/18 10:45 AM (GMT-08:00) To: >> >> dev@kafka.apache.org >> >>>>>>>>>>> Subject: >> >>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in >> >>>>>>>>>>>>> WindowBytesStoreSupplier >> >>>>>>>>>>>>> Hello All, >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in >> >>>>>>>>>>> KAFKA-7080. >> >>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the >> >>>>>> *number >> >>>>>>>> of >> >>>>>>>>>>>>> segments* instead of the *segment size*. >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Here's the jira: >> >>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080 >> >>>>>>>>>>>>> Here's the KIP: >> https://cwiki.apache.org/confluence/x/mQU0BQ >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> additionally, here's a draft PR for clarity: >> >>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257 >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Please let me know what you think! >> >>>>>>>>>>>>> >> >>>>>>>>>>>>> Thanks, >> >>>>>>>>>>>>> -John >> >>>>>>>>>>>>> >> >>>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> >> >>>>>>>>>>> -- >> >>>>>>>>>>> -- Guozhang >> >>>>>>>>>>> >> >>>>>>>>>> >> >>>>>>>>> >> >>>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> -- Guozhang >> >>>>>>> >> >>>>>> >> >>>>> >> >>>>> >> >>>>> >> >>>> >> >>>> >> >>> >> >>> >> >>> -- >> >>> -- Guozhang >> >>> >> >> >> > >> >>