Ok. @John: can you create a JIRA to track this? I think KAFKA-4730 is related, but actually an own ticket (that is blocked by not having Materialized for stream-stream joins).
-Matthias On 6/25/18 2:10 PM, Bill Bejeck wrote: > I agree that it makes sense to have segmentInterval as a parameter to a > store, but I also agree with Guozhang's point about not moving as part of > this KIP. > > Thanks, > Bill > > On Mon, Jun 25, 2018 at 4:17 PM John Roesler <j...@confluent.io> wrote: > >> Thanks Matthias and Guozhang, >> >> About deprecating the "segments" field instead of making it private. Yes, I >> just took another look at the code, and that is correct. I'll update the >> KIP. >> >> I do agree that in the long run, it makes more sense as a parameter to the >> store somehow than as a parameter to the window. I think this isn't a super >> high priority, though, because it's not exposed in the DSL (or it wasn't >> intended to be). >> >> I felt Guozhang's point is valid, and that we should probably revisit it >> later, possibly in the scope of >> https://issues.apache.org/jira/browse/KAFKA-4730 >> >> I'll wait an hour or so for more feedback before moving on to a vote. >> >> Thanks again, >> -John >> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> wrote: >> >>> Re `segmentInterval` parameter in Windows: currently it is used in two >>> places, the windowed stream aggregation, and the stream-stream joins. For >>> the former, we can potentially move the parameter from windowedBy() to >>> Materialized, but for the latter we currently do not expose a >> Materialized >>> object yet, only the Windows spec. So I think in this KIP we probably >>> cannot move it immediately. >>> >>> But in future KIPs if we decide to expose the stream-stream join's store >> / >>> changelog / repartition topic names, we may well adding the Materialized >>> object into the operator, and we can then move the parameter to >>> Materialized by then. >>> >>> >>> Guozhang >>> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <matth...@confluent.io> >>> wrote: >>> >>>> Thanks for the KIP. Overall, I think it makes sense to clean up the >> API. >>>> >>>> Couple of comments: >>>> >>>>> Sadly there's no way to "deprecate" this >>>>> exposure >>>> >>>> I disagree. We can just mark the variable as deprecated and I advocate >>>> to do this. When the deprecation period passed, we can make it private >>>> (or actually remove it; cf. my next comment). >>>> >>>> >>>> Parameter, `segmentInterval` is semantically not a "window" >>>> specification parameter but an implementation detail and thus a store >>>> parameter. Would it be better to add it to `Materialized`? >>>> >>>> >>>> -Matthias >>>> >>>> On 6/22/18 5:13 PM, Guozhang Wang wrote: >>>>> Thanks John. >>>>> >>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io> >>> wrote: >>>>> >>>>>> Thanks for the feedback, Bill and Guozhang, >>>>>> >>>>>> I've updated the KIP accordingly. >>>>>> >>>>>> Thanks, >>>>>> -John >>>>>> >>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com> >>>> wrote: >>>>>> >>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on >> the >>>>>> wiki: >>>>>>> the `In Windows, we will:` section code snippet is empty. >>>>>>> >>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com> >>>> wrote: >>>>>>> >>>>>>>> Hi John, >>>>>>>> >>>>>>>> Thanks for the KIP, and overall it's a +1 for me. >>>>>>>> >>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention >> of >>>>>> the >>>>>>>> number of segments there can be at any one time. While it's >> implied >>>>>> that >>>>>>>> the number of segments is potentially unbounded, would be better >> to >>>>>>>> explicitly state that the previous limit on the number of segments >>> is >>>>>>> going >>>>>>>> to be removed as well? >>>>>>>> >>>>>>>> I have a couple of nit comments. The method name is still >>>> segmentSize >>>>>>> in >>>>>>>> the code block vs segmentInterval and the order of the parameters >>> for >>>>>> the >>>>>>>> third persistentWindowStore don't match the order in the JavaDoc. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Bill >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io> >>>>>> wrote: >>>>>>>> >>>>>>>>> I've updated the KIP and draft PR accordingly. >>>>>>>>> >>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io> >>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Interesting... I did not initially consider it because I didn't >>>>>> want >>>>>>> to >>>>>>>>>> have an impact on anyone's Streams apps, but now I see that >> unless >>>>>>>>>> developers have subclassed `Windows`, the number of segments >> would >>>>>>>> always >>>>>>>>>> be 3! >>>>>>>>>> >>>>>>>>>> There's one caveat to this, which I think was a mistake. The >> field >>>>>>>>>> `segments` in Windows is public, which means that anyone can >>>>>> actually >>>>>>>> set >>>>>>>>>> it directly on any Window instance like: >>>>>>>>>> >>>>>>>>>> TimeWindows tw = TimeWindows.of(100); >>>>>>>>>> tw.segments = 12345; >>>>>>>>>> >>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in >>> Windows >>>>>>>> that >>>>>>>>>> says users can't directly set it. Sadly there's no way to >>>>>> "deprecate" >>>>>>>>> this >>>>>>>>>> exposure, so I propose just to make it private. >>>>>>>>>> >>>>>>>>>> With this new knowledge, I agree, I think we can switch to >>>>>>>>>> "segmentInterval" throughout the interface. >>>>>>>>>> >>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang < >> wangg...@gmail.com >>>> >>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hello John, >>>>>>>>>>> >>>>>>>>>>> Thanks for the KIP. >>>>>>>>>>> >>>>>>>>>>> Should we consider making the change on >>>>>>> `Stores#persistentWindowStore` >>>>>>>>>>> parameters as well? >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Guozhang >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler < >> j...@confluent.io >>>> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Ted, >>>>>>>>>>>> >>>>>>>>>>>> Ah, when you made that comment to me before, I thought you >> meant >>>>>>> as >>>>>>>>>>> opposed >>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed to >>>>>>>>>>>> "segmentSize". >>>>>>>>>>>> >>>>>>>>>>>> I named it that way to match the peer method "windowSize", >> which >>>>>>> is >>>>>>>>>>> also a >>>>>>>>>>>> quantity of milliseconds. >>>>>>>>>>>> >>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I favor >>>>>>>>>>> consistency >>>>>>>>>>>> in this case. Does that seem reasonable? >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> -John >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com> >>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Normally size is not measured in time unit, such as >>>>>>> milliseconds. >>>>>>>>>>>>> How about naming the new method segmentInterval ? >>>>>>>>>>>>> Thanks >>>>>>>>>>>>> -------- Original message --------From: John Roesler < >>>>>>>>>>> j...@confluent.io> >>>>>>>>>>>>> Date: 6/20/18 10:45 AM (GMT-08:00) To: >> dev@kafka.apache.org >>>>>>>>>>> Subject: >>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in >>>>>>>>>>>>> WindowBytesStoreSupplier >>>>>>>>>>>>> Hello All, >>>>>>>>>>>>> >>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in >>>>>>>>>>> KAFKA-7080. >>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the >>>>>> *number >>>>>>>> of >>>>>>>>>>>>> segments* instead of the *segment size*. >>>>>>>>>>>>> >>>>>>>>>>>>> Here's the jira: >>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080 >>>>>>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ >>>>>>>>>>>>> >>>>>>>>>>>>> additionally, here's a draft PR for clarity: >>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257 >>>>>>>>>>>>> >>>>>>>>>>>>> Please let me know what you think! >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> -John >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> -- Guozhang >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> -- Guozhang >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>> >>>> >>> >>> >>> -- >>> -- Guozhang >>> >> >
signature.asc
Description: OpenPGP digital signature