Thanks Matthias and Guozhang,

About deprecating the "segments" field instead of making it private. Yes, I
just took another look at the code, and that is correct. I'll update the
KIP.

I do agree that in the long run, it makes more sense as a parameter to the
store somehow than as a parameter to the window. I think this isn't a super
high priority, though, because it's not exposed in the DSL (or it wasn't
intended to be).

I felt Guozhang's point is valid, and that we should probably revisit it
later, possibly in the scope of
https://issues.apache.org/jira/browse/KAFKA-4730

I'll wait an hour or so for more feedback before moving on to a vote.

Thanks again,
-John

On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com> wrote:

> Re `segmentInterval` parameter in Windows: currently it is used in two
> places, the windowed stream aggregation, and the stream-stream joins. For
> the former, we can potentially move the parameter from windowedBy() to
> Materialized, but for the latter we currently do not expose a Materialized
> object yet, only the Windows spec. So I think in this KIP we probably
> cannot move it immediately.
>
> But in future KIPs if we decide to expose the stream-stream join's store /
> changelog / repartition topic names, we may well adding the Materialized
> object into the operator, and we can then move the parameter to
> Materialized by then.
>
>
> Guozhang
>
> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
>
> > Thanks for the KIP. Overall, I think it makes sense to clean up the API.
> >
> > Couple of comments:
> >
> > > Sadly there's no way to "deprecate" this
> > > exposure
> >
> > I disagree. We can just mark the variable as deprecated and I advocate
> > to do this. When the deprecation period passed, we can make it private
> > (or actually remove it; cf. my next comment).
> >
> >
> > Parameter, `segmentInterval` is semantically not a "window"
> > specification parameter but an implementation detail and thus a store
> > parameter. Would it be better to add it to `Materialized`?
> >
> >
> > -Matthias
> >
> > On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > > Thanks John.
> > >
> > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io>
> wrote:
> > >
> > >> Thanks for the feedback, Bill and Guozhang,
> > >>
> > >> I've updated the KIP accordingly.
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >>
> > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> > >> wiki:
> > >>> the `In Windows, we will:` section code snippet is empty.
> > >>>
> > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com>
> > wrote:
> > >>>
> > >>>> Hi John,
> > >>>>
> > >>>> Thanks for the KIP, and overall it's a +1 for me.
> > >>>>
> > >>>> In the JavaDoc for the segmentInterval method, there's no mention of
> > >> the
> > >>>> number of segments there can be at any one time.  While it's implied
> > >> that
> > >>>> the number of segments is potentially unbounded, would be better to
> > >>>> explicitly state that the previous limit on the number of segments
> is
> > >>> going
> > >>>> to be removed as well?
> > >>>>
> > >>>> I have a couple of nit comments.   The method name is still
> > segmentSize
> > >>> in
> > >>>> the code block vs segmentInterval and the order of the parameters
> for
> > >> the
> > >>>> third persistentWindowStore don't match the order in the JavaDoc.
> > >>>>
> > >>>> Thanks,
> > >>>> Bill
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io>
> > >> wrote:
> > >>>>
> > >>>>> I've updated the KIP and draft PR accordingly.
> > >>>>>
> > >>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io>
> > >>> wrote:
> > >>>>>
> > >>>>>> Interesting... I did not initially consider it because I didn't
> > >> want
> > >>> to
> > >>>>>> have an impact on anyone's Streams apps, but now I see that unless
> > >>>>>> developers have subclassed `Windows`, the number of segments would
> > >>>> always
> > >>>>>> be 3!
> > >>>>>>
> > >>>>>> There's one caveat to this, which I think was a mistake. The field
> > >>>>>> `segments` in Windows is public, which means that anyone can
> > >> actually
> > >>>> set
> > >>>>>> it directly on any Window instance like:
> > >>>>>>
> > >>>>>>         TimeWindows tw = TimeWindows.of(100);
> > >>>>>>         tw.segments = 12345;
> > >>>>>>
> > >>>>>> Bypassing the bounds check and contradicting the javadoc in
> Windows
> > >>>> that
> > >>>>>> says users can't directly set it. Sadly there's no way to
> > >> "deprecate"
> > >>>>> this
> > >>>>>> exposure, so I propose just to make it private.
> > >>>>>>
> > >>>>>> With this new knowledge, I agree, I think we can switch to
> > >>>>>> "segmentInterval" throughout the interface.
> > >>>>>>
> > >>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <wangg...@gmail.com
> >
> > >>>>> wrote:
> > >>>>>>
> > >>>>>>> Hello John,
> > >>>>>>>
> > >>>>>>> Thanks for the KIP.
> > >>>>>>>
> > >>>>>>> Should we consider making the change on
> > >>> `Stores#persistentWindowStore`
> > >>>>>>> parameters as well?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Guozhang
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <j...@confluent.io
> >
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Ted,
> > >>>>>>>>
> > >>>>>>>> Ah, when you made that comment to me before, I thought you meant
> > >>> as
> > >>>>>>> opposed
> > >>>>>>>> to "segments". Now it makes sense that you meant as opposed to
> > >>>>>>>> "segmentSize".
> > >>>>>>>>
> > >>>>>>>> I named it that way to match the peer method "windowSize", which
> > >>> is
> > >>>>>>> also a
> > >>>>>>>> quantity of milliseconds.
> > >>>>>>>>
> > >>>>>>>> I agree that "interval" is more intuitive, but I think I favor
> > >>>>>>> consistency
> > >>>>>>>> in this case. Does that seem reasonable?
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> -John
> > >>>>>>>>
> > >>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com>
> > >>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Normally size is not measured in time unit, such as
> > >>> milliseconds.
> > >>>>>>>>> How about naming the new method segmentInterval ?
> > >>>>>>>>> Thanks
> > >>>>>>>>> -------- Original message --------From: John Roesler <
> > >>>>>>> j...@confluent.io>
> > >>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To: dev@kafka.apache.org
> > >>>>>>> Subject:
> > >>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
> > >>>>>>>>> WindowBytesStoreSupplier
> > >>>>>>>>> Hello All,
> > >>>>>>>>>
> > >>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
> > >>>>>>> KAFKA-7080.
> > >>>>>>>>> Specifically, we're creating CachingWindowStore with the
> > >> *number
> > >>>> of
> > >>>>>>>>> segments* instead of the *segment size*.
> > >>>>>>>>>
> > >>>>>>>>> Here's the jira:
> > >>> https://issues.apache.org/jira/browse/KAFKA-7080
> > >>>>>>>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> > >>>>>>>>>
> > >>>>>>>>> additionally, here's a draft PR for clarity:
> > >>>>>>>>> https://github.com/apache/kafka/pull/5257
> > >>>>>>>>>
> > >>>>>>>>> Please let me know what you think!
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> -John
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> -- Guozhang
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>>
> > >>> --
> > >>> -- Guozhang
> > >>>
> > >>
> > >
> > >
> > >
> >
> >
>
>
> --
> -- Guozhang
>

Reply via email to