KAFKA-7080 is for this KIP. I meant to create a JIRA to add `segmentInterval` to `Materialized` and a JIRA to add `Materialized` to `KStream#join(KStream)`.
Thx. -Matthias On 6/25/18 2:46 PM, John Roesler wrote: > Ah, it turns out I did create a ticket: it's KAFKA-7080: > https://issues.apache.org/jira/browse/KAFKA-7080 > > -John > > On Mon, Jun 25, 2018 at 4:44 PM John Roesler <j...@confluent.io> wrote: > >> Matthias, >> >> That's a good idea. I'm not sure why I didn't... >> >> Thanks, >> -John >> >> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <matth...@confluent.io> >> wrote: >> >>> Ok. >>> >>> @John: can you create a JIRA to track this? I think KAFKA-4730 is >>> related, but actually an own ticket (that is blocked by not having >>> Materialized for stream-stream joins). >>> >>> >>> -Matthias >>> >>> On 6/25/18 2:10 PM, Bill Bejeck wrote: >>>> I agree that it makes sense to have segmentInterval as a parameter to a >>>> store, but I also agree with Guozhang's point about not moving as part >>> of >>>> this KIP. >>>> >>>> Thanks, >>>> Bill >>>> >>>> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <j...@confluent.io> wrote: >>>> >>>>> Thanks Matthias and Guozhang, >>>>> >>>>> About deprecating the "segments" field instead of making it private. >>> Yes, I >>>>> just took another look at the code, and that is correct. I'll update >>> the >>>>> KIP. >>>>> >>>>> I do agree that in the long run, it makes more sense as a parameter to >>> the >>>>> store somehow than as a parameter to the window. I think this isn't a >>> super >>>>> high priority, though, because it's not exposed in the DSL (or it >>> wasn't >>>>> intended to be). >>>>> >>>>> I felt Guozhang's point is valid, and that we should probably revisit >>> it >>>>> later, possibly in the scope of >>>>> https://issues.apache.org/jira/browse/KAFKA-4730 >>>>> >>>>> I'll wait an hour or so for more feedback before moving on to a vote. >>>>> >>>>> Thanks again, >>>>> -John >>>>> >>>>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> >>> wrote: >>>>> >>>>>> Re `segmentInterval` parameter in Windows: currently it is used in two >>>>>> places, the windowed stream aggregation, and the stream-stream joins. >>> For >>>>>> the former, we can potentially move the parameter from windowedBy() to >>>>>> Materialized, but for the latter we currently do not expose a >>>>> Materialized >>>>>> object yet, only the Windows spec. So I think in this KIP we probably >>>>>> cannot move it immediately. >>>>>> >>>>>> But in future KIPs if we decide to expose the stream-stream join's >>> store >>>>> / >>>>>> changelog / repartition topic names, we may well adding the >>> Materialized >>>>>> object into the operator, and we can then move the parameter to >>>>>> Materialized by then. >>>>>> >>>>>> >>>>>> Guozhang >>>>>> >>>>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax < >>> matth...@confluent.io> >>>>>> wrote: >>>>>> >>>>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the >>>>> API. >>>>>>> >>>>>>> Couple of comments: >>>>>>> >>>>>>>> Sadly there's no way to "deprecate" this >>>>>>>> exposure >>>>>>> >>>>>>> I disagree. We can just mark the variable as deprecated and I >>> advocate >>>>>>> to do this. When the deprecation period passed, we can make it >>> private >>>>>>> (or actually remove it; cf. my next comment). >>>>>>> >>>>>>> >>>>>>> Parameter, `segmentInterval` is semantically not a "window" >>>>>>> specification parameter but an implementation detail and thus a store >>>>>>> parameter. Would it be better to add it to `Materialized`? >>>>>>> >>>>>>> >>>>>>> -Matthias >>>>>>> >>>>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote: >>>>>>>> Thanks John. >>>>>>>> >>>>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io> >>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks for the feedback, Bill and Guozhang, >>>>>>>>> >>>>>>>>> I've updated the KIP accordingly. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com> >>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on >>>>> the >>>>>>>>> wiki: >>>>>>>>>> the `In Windows, we will:` section code snippet is empty. >>>>>>>>>> >>>>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com> >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi John, >>>>>>>>>>> >>>>>>>>>>> Thanks for the KIP, and overall it's a +1 for me. >>>>>>>>>>> >>>>>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention >>>>> of >>>>>>>>> the >>>>>>>>>>> number of segments there can be at any one time. While it's >>>>> implied >>>>>>>>> that >>>>>>>>>>> the number of segments is potentially unbounded, would be better >>>>> to >>>>>>>>>>> explicitly state that the previous limit on the number of >>> segments >>>>>> is >>>>>>>>>> going >>>>>>>>>>> to be removed as well? >>>>>>>>>>> >>>>>>>>>>> I have a couple of nit comments. The method name is still >>>>>>> segmentSize >>>>>>>>>> in >>>>>>>>>>> the code block vs segmentInterval and the order of the parameters >>>>>> for >>>>>>>>> the >>>>>>>>>>> third persistentWindowStore don't match the order in the JavaDoc. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Bill >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> I've updated the KIP and draft PR accordingly. >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io >>>> >>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Interesting... I did not initially consider it because I didn't >>>>>>>>> want >>>>>>>>>> to >>>>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that >>>>> unless >>>>>>>>>>>>> developers have subclassed `Windows`, the number of segments >>>>> would >>>>>>>>>>> always >>>>>>>>>>>>> be 3! >>>>>>>>>>>>> >>>>>>>>>>>>> There's one caveat to this, which I think was a mistake. The >>>>> field >>>>>>>>>>>>> `segments` in Windows is public, which means that anyone can >>>>>>>>> actually >>>>>>>>>>> set >>>>>>>>>>>>> it directly on any Window instance like: >>>>>>>>>>>>> >>>>>>>>>>>>> TimeWindows tw = TimeWindows.of(100); >>>>>>>>>>>>> tw.segments = 12345; >>>>>>>>>>>>> >>>>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in >>>>>> Windows >>>>>>>>>>> that >>>>>>>>>>>>> says users can't directly set it. Sadly there's no way to >>>>>>>>> "deprecate" >>>>>>>>>>>> this >>>>>>>>>>>>> exposure, so I propose just to make it private. >>>>>>>>>>>>> >>>>>>>>>>>>> With this new knowledge, I agree, I think we can switch to >>>>>>>>>>>>> "segmentInterval" throughout the interface. >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang < >>>>> wangg...@gmail.com >>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello John, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Should we consider making the change on >>>>>>>>>> `Stores#persistentWindowStore` >>>>>>>>>>>>>> parameters as well? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Guozhang >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler < >>>>> j...@confluent.io >>>>>>> >>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hi Ted, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you >>>>> meant >>>>>>>>>> as >>>>>>>>>>>>>> opposed >>>>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed >>> to >>>>>>>>>>>>>>> "segmentSize". >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I named it that way to match the peer method "windowSize", >>>>> which >>>>>>>>>> is >>>>>>>>>>>>>> also a >>>>>>>>>>>>>>> quantity of milliseconds. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I >>> favor >>>>>>>>>>>>>> consistency >>>>>>>>>>>>>>> in this case. Does that seem reasonable? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com> >>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Normally size is not measured in time unit, such as >>>>>>>>>> milliseconds. >>>>>>>>>>>>>>>> How about naming the new method segmentInterval ? >>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>> -------- Original message --------From: John Roesler < >>>>>>>>>>>>>> j...@confluent.io> >>>>>>>>>>>>>>>> Date: 6/20/18 10:45 AM (GMT-08:00) To: >>>>> dev@kafka.apache.org >>>>>>>>>>>>>> Subject: >>>>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in >>>>>>>>>>>>>>>> WindowBytesStoreSupplier >>>>>>>>>>>>>>>> Hello All, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in >>>>>>>>>>>>>> KAFKA-7080. >>>>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the >>>>>>>>> *number >>>>>>>>>>> of >>>>>>>>>>>>>>>> segments* instead of the *segment size*. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Here's the jira: >>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080 >>>>>>>>>>>>>>>> Here's the KIP: >>> https://cwiki.apache.org/confluence/x/mQU0BQ >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> additionally, here's a draft PR for clarity: >>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257 >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Please let me know what you think! >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> -- Guozhang >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> -- Guozhang >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> -- Guozhang >>>>>> >>>>> >>>> >>> >>> >
signature.asc
Description: OpenPGP digital signature