KAFKA-7080 is for this KIP.

I meant to create a JIRA to add `segmentInterval` to `Materialized` and
a JIRA to add `Materialized` to `KStream#join(KStream)`.

Thx.


-Matthias

On 6/25/18 2:46 PM, John Roesler wrote:
> Ah, it turns out I did create a ticket: it's KAFKA-7080:
> https://issues.apache.org/jira/browse/KAFKA-7080
> 
> -John
> 
> On Mon, Jun 25, 2018 at 4:44 PM John Roesler <j...@confluent.io> wrote:
> 
>> Matthias,
>>
>> That's a good idea. I'm not sure why I didn't...
>>
>> Thanks,
>> -John
>>
>> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax <matth...@confluent.io>
>> wrote:
>>
>>> Ok.
>>>
>>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>>> related, but actually an own ticket (that is blocked by not having
>>> Materialized for stream-stream joins).
>>>
>>>
>>> -Matthias
>>>
>>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
>>>> I agree that it makes sense to have segmentInterval as a parameter to a
>>>> store, but I also agree with Guozhang's point about not moving as part
>>> of
>>>> this KIP.
>>>>
>>>> Thanks,
>>>> Bill
>>>>
>>>> On Mon, Jun 25, 2018 at 4:17 PM John Roesler <j...@confluent.io> wrote:
>>>>
>>>>> Thanks Matthias and Guozhang,
>>>>>
>>>>> About deprecating the "segments" field instead of making it private.
>>> Yes, I
>>>>> just took another look at the code, and that is correct. I'll update
>>> the
>>>>> KIP.
>>>>>
>>>>> I do agree that in the long run, it makes more sense as a parameter to
>>> the
>>>>> store somehow than as a parameter to the window. I think this isn't a
>>> super
>>>>> high priority, though, because it's not exposed in the DSL (or it
>>> wasn't
>>>>> intended to be).
>>>>>
>>>>> I felt Guozhang's point is valid, and that we should probably revisit
>>> it
>>>>> later, possibly in the scope of
>>>>> https://issues.apache.org/jira/browse/KAFKA-4730
>>>>>
>>>>> I'll wait an hour or so for more feedback before moving on to a vote.
>>>>>
>>>>> Thanks again,
>>>>> -John
>>>>>
>>>>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang <wangg...@gmail.com>
>>> wrote:
>>>>>
>>>>>> Re `segmentInterval` parameter in Windows: currently it is used in two
>>>>>> places, the windowed stream aggregation, and the stream-stream joins.
>>> For
>>>>>> the former, we can potentially move the parameter from windowedBy() to
>>>>>> Materialized, but for the latter we currently do not expose a
>>>>> Materialized
>>>>>> object yet, only the Windows spec. So I think in this KIP we probably
>>>>>> cannot move it immediately.
>>>>>>
>>>>>> But in future KIPs if we decide to expose the stream-stream join's
>>> store
>>>>> /
>>>>>> changelog / repartition topic names, we may well adding the
>>> Materialized
>>>>>> object into the operator, and we can then move the parameter to
>>>>>> Materialized by then.
>>>>>>
>>>>>>
>>>>>> Guozhang
>>>>>>
>>>>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>>> matth...@confluent.io>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
>>>>> API.
>>>>>>>
>>>>>>> Couple of comments:
>>>>>>>
>>>>>>>> Sadly there's no way to "deprecate" this
>>>>>>>> exposure
>>>>>>>
>>>>>>> I disagree. We can just mark the variable as deprecated and I
>>> advocate
>>>>>>> to do this. When the deprecation period passed, we can make it
>>> private
>>>>>>> (or actually remove it; cf. my next comment).
>>>>>>>
>>>>>>>
>>>>>>> Parameter, `segmentInterval` is semantically not a "window"
>>>>>>> specification parameter but an implementation detail and thus a store
>>>>>>> parameter. Would it be better to add it to `Materialized`?
>>>>>>>
>>>>>>>
>>>>>>> -Matthias
>>>>>>>
>>>>>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
>>>>>>>> Thanks John.
>>>>>>>>
>>>>>>>> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler <j...@confluent.io>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks for the feedback, Bill and Guozhang,
>>>>>>>>>
>>>>>>>>> I've updated the KIP accordingly.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> -John
>>>>>>>>>
>>>>>>>>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang <wangg...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>>>>> the
>>>>>>>>> wiki:
>>>>>>>>>> the `In Windows, we will:` section code snippet is empty.
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck <bbej...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi John,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the KIP, and overall it's a +1 for me.
>>>>>>>>>>>
>>>>>>>>>>> In the JavaDoc for the segmentInterval method, there's no mention
>>>>> of
>>>>>>>>> the
>>>>>>>>>>> number of segments there can be at any one time.  While it's
>>>>> implied
>>>>>>>>> that
>>>>>>>>>>> the number of segments is potentially unbounded, would be better
>>>>> to
>>>>>>>>>>> explicitly state that the previous limit on the number of
>>> segments
>>>>>> is
>>>>>>>>>> going
>>>>>>>>>>> to be removed as well?
>>>>>>>>>>>
>>>>>>>>>>> I have a couple of nit comments.   The method name is still
>>>>>>> segmentSize
>>>>>>>>>> in
>>>>>>>>>>> the code block vs segmentInterval and the order of the parameters
>>>>>> for
>>>>>>>>> the
>>>>>>>>>>> third persistentWindowStore don't match the order in the JavaDoc.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Bill
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Jun 21, 2018 at 3:32 PM John Roesler <j...@confluent.io>
>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've updated the KIP and draft PR accordingly.
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Jun 21, 2018 at 2:03 PM John Roesler <j...@confluent.io
>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Interesting... I did not initially consider it because I didn't
>>>>>>>>> want
>>>>>>>>>> to
>>>>>>>>>>>>> have an impact on anyone's Streams apps, but now I see that
>>>>> unless
>>>>>>>>>>>>> developers have subclassed `Windows`, the number of segments
>>>>> would
>>>>>>>>>>> always
>>>>>>>>>>>>> be 3!
>>>>>>>>>>>>>
>>>>>>>>>>>>> There's one caveat to this, which I think was a mistake. The
>>>>> field
>>>>>>>>>>>>> `segments` in Windows is public, which means that anyone can
>>>>>>>>> actually
>>>>>>>>>>> set
>>>>>>>>>>>>> it directly on any Window instance like:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         TimeWindows tw = TimeWindows.of(100);
>>>>>>>>>>>>>         tw.segments = 12345;
>>>>>>>>>>>>>
>>>>>>>>>>>>> Bypassing the bounds check and contradicting the javadoc in
>>>>>> Windows
>>>>>>>>>>> that
>>>>>>>>>>>>> says users can't directly set it. Sadly there's no way to
>>>>>>>>> "deprecate"
>>>>>>>>>>>> this
>>>>>>>>>>>>> exposure, so I propose just to make it private.
>>>>>>>>>>>>>
>>>>>>>>>>>>> With this new knowledge, I agree, I think we can switch to
>>>>>>>>>>>>> "segmentInterval" throughout the interface.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang <
>>>>> wangg...@gmail.com
>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello John,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for the KIP.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Should we consider making the change on
>>>>>>>>>> `Stores#persistentWindowStore`
>>>>>>>>>>>>>> parameters as well?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler <
>>>>> j...@confluent.io
>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ah, when you made that comment to me before, I thought you
>>>>> meant
>>>>>>>>>> as
>>>>>>>>>>>>>> opposed
>>>>>>>>>>>>>>> to "segments". Now it makes sense that you meant as opposed
>>> to
>>>>>>>>>>>>>>> "segmentSize".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I named it that way to match the peer method "windowSize",
>>>>> which
>>>>>>>>>> is
>>>>>>>>>>>>>> also a
>>>>>>>>>>>>>>> quantity of milliseconds.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I agree that "interval" is more intuitive, but I think I
>>> favor
>>>>>>>>>>>>>> consistency
>>>>>>>>>>>>>>> in this case. Does that seem reasonable?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Jun 20, 2018 at 1:06 PM Ted Yu <yuzhih...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Normally size is not measured in time unit, such as
>>>>>>>>>> milliseconds.
>>>>>>>>>>>>>>>> How about naming the new method segmentInterval ?
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> -------- Original message --------From: John Roesler <
>>>>>>>>>>>>>> j...@confluent.io>
>>>>>>>>>>>>>>>> Date: 6/20/18  10:45 AM  (GMT-08:00) To:
>>>>> dev@kafka.apache.org
>>>>>>>>>>>>>> Subject:
>>>>>>>>>>>>>>>> [DISCUSS] KIP-319: Replace segments with segmentSize in
>>>>>>>>>>>>>>>> WindowBytesStoreSupplier
>>>>>>>>>>>>>>>> Hello All,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'd like to propose KIP-319 to fix an issue I identified in
>>>>>>>>>>>>>> KAFKA-7080.
>>>>>>>>>>>>>>>> Specifically, we're creating CachingWindowStore with the
>>>>>>>>> *number
>>>>>>>>>>> of
>>>>>>>>>>>>>>>> segments* instead of the *segment size*.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here's the jira:
>>>>>>>>>> https://issues.apache.org/jira/browse/KAFKA-7080
>>>>>>>>>>>>>>>> Here's the KIP:
>>> https://cwiki.apache.org/confluence/x/mQU0BQ
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> additionally, here's a draft PR for clarity:
>>>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/5257
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please let me know what you think!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> -John
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> -- Guozhang
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -- Guozhang
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> -- Guozhang
>>>>>>
>>>>>
>>>>
>>>
>>>
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to