Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-16 Thread Matthias J. Sax
Thanks for the update Bill. LGTM. On 9/14/18 5:33 PM, Guozhang Wang wrote: > Hello Bill, > > I've made another pass over the wiki page and it lgtm now. Thanks! > > > Guozhang > > > On Fri, Sep 14, 2018 at 3:05 PM, John Roesler wrote: > >> Hey Bill (and Guozhang), >> >> Apologies. I misunder

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-14 Thread Guozhang Wang
Hello Bill, I've made another pass over the wiki page and it lgtm now. Thanks! Guozhang On Fri, Sep 14, 2018 at 3:05 PM, John Roesler wrote: > Hey Bill (and Guozhang), > > Apologies. I misunderstood what Guozhang was getting at in his earlier > remark. > > I'm a +1 on the current proposal. >

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-14 Thread John Roesler
Hey Bill (and Guozhang), Apologies. I misunderstood what Guozhang was getting at in his earlier remark. I'm a +1 on the current proposal. Thanks, -John On Fri, Sep 14, 2018 at 3:10 PM Bill Bejeck wrote: > All, > > Thanks for the comments, I'll respond to all of your comments in the > received

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-14 Thread Bill Bejeck
All, Thanks for the comments. I'll respond to all of your comments in the order received. Guozhang, > And if [join-name] not specified, stay the same, which is: > * [previous-processor-name]-repartition for both Stream-Stream (S-S) join and S-T join I believe the current approach is [appId]

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-14 Thread Guozhang Wang
Hello John, Not sure if I completely understand your email above. Are you suggesting to still use the proposed Joined / Grouped object to indicate the underlying processor names *in addition to* the repartition topic names? My reasoning is that, if we do not want to use the proposed names to indi

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-13 Thread John Roesler
Hey all, I think it's slightly out of scope for this KIP, but I'm not sure it's right to add a name to ValueJoiner or KeyValueMapper. Both of those are "functional interfaces", that is, they are basically named functions. It seems like we should preserve this property both to provide a clean sepa

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-13 Thread Guozhang Wang
Just to clarify on 2): currently KIP-307 does not have proposed APIs for `groupBy/groupByKey` naming schemes, and for joins its current proposal is to extend ValueJoiner with Named, and hence this part is what I meant by "overlaps". Thinking about it a bit more, since Joined is only used for S-S

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-13 Thread Matthias J. Sax
Three more comments: (1) For `Grouped` should we add `with(String name, Serde key, Serde value)` to allow specifying all parameters at once? Produced/Consumed/Serialized etc follow a similar pattern. There is one static method for each config parameter, plus one static method that accepts all par
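
Editor's sketch of the static-factory pattern described in (1): one static method per config parameter, plus one that accepts all of them. This is a minimal illustration, not the KIP's actual class. `GroupedSketch`, its nested placeholder `Serde` interface, and the method names `as`, `keySerde`, and `valueSerde` are all assumptions made so the example compiles on its own; only `with(String name, Serde key, Serde value)` is taken from the message above.

```java
// Hypothetical stand-in for the proposed Grouped config object.
public final class GroupedSketch<K, V> {

    /** Placeholder for a serde type, defined here only so the sketch is self-contained. */
    public interface Serde<T> {}

    public final String name;          // name used for the grouping operation (may be null)
    public final Serde<K> keySerde;    // key serde (may be null, meaning "use the default")
    public final Serde<V> valueSerde;  // value serde (may be null, meaning "use the default")

    private GroupedSketch(String name, Serde<K> keySerde, Serde<V> valueSerde) {
        this.name = name;
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
    }

    // One static method per config parameter (method names are assumptions).
    public static <K, V> GroupedSketch<K, V> as(String name) {
        return new GroupedSketch<>(name, null, null);
    }

    public static <K, V> GroupedSketch<K, V> keySerde(Serde<K> keySerde) {
        return new GroupedSketch<>(null, keySerde, null);
    }

    public static <K, V> GroupedSketch<K, V> valueSerde(Serde<V> valueSerde) {
        return new GroupedSketch<>(null, null, valueSerde);
    }

    // The all-parameters factory suggested in comment (1).
    public static <K, V> GroupedSketch<K, V> with(String name, Serde<K> keySerde, Serde<V> valueSerde) {
        return new GroupedSketch<>(name, keySerde, valueSerde);
    }
}
```

With a factory like this, a caller can set the name and both serdes in a single call instead of chaining three.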

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-13 Thread Matthias J. Sax
I don't know what Samza does, however, Flink requires users to specify names similar to this proposal to be able to re-identify state in case the topology gets altered between deployments. Flink only has state they need to worry about. For Kafka Streams, it's state plus repartition topics. -Matt

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-13 Thread Eno Thereska
Hi folks, I know we don't normally have a "Related work" section in KIPs, but sometimes I find it useful to see what others have done in similar cases. Since this will be important for rolling re-deployments, I wonder what other frameworks like Flink (or Samza) have done in these cases. Perhaps th

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Matthias J. Sax
Follow up comments: 1) We should either use `[app-id]-this|other-[join-name]-repartition` or `[app-id]-[join-name]-left|right-repartition`, but we should not change the pattern depending on whether the user specifies a name or not. I am fine with both patterns; I just want to make sure we stick with one.
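
Editor's sketch of the two candidate patterns from comment 1), to make the difference concrete. The class `RepartitionTopicNames`, its method names, and the sample values `my-app` and `orders-join` are hypothetical; only the two name patterns come from the message. This is plain string formatting, not Kafka Streams code.

```java
// Hypothetical helper that builds repartition topic names under each candidate pattern.
public final class RepartitionTopicNames {

    private RepartitionTopicNames() {}

    // Pattern A: [app-id]-this|other-[join-name]-repartition
    public static String sidePrefixed(String appId, String joinName, boolean thisSide) {
        String side = thisSide ? "this" : "other";
        return appId + "-" + side + "-" + joinName + "-repartition";
    }

    // Pattern B: [app-id]-[join-name]-left|right-repartition
    public static String sideSuffixed(String appId, String joinName, boolean leftSide) {
        String side = leftSide ? "left" : "right";
        return appId + "-" + joinName + "-" + side + "-repartition";
    }

    public static void main(String[] args) {
        // Print both sides of a join under each pattern for comparison.
        System.out.println(sidePrefixed("my-app", "orders-join", true));
        System.out.println(sidePrefixed("my-app", "orders-join", false));
        System.out.println(sideSuffixed("my-app", "orders-join", true));
        System.out.println(sideSuffixed("my-app", "orders-join", false));
    }
}
```

Whichever pattern is chosen, using it both for user-supplied and generated join names keeps the topic names predictable.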

Re: [DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Guozhang Wang
Hello Bill, I made a pass over your proposal and here are some questions: 1. For Joined names, the current proposal is to define the repartition topic names as * [app-id]-this-[join-name]-repartition * [app-id]-other-[join-name]-repartition And if [join-name] not specified, stay the same, whi

[DISCUSS] KIP-372: Naming Joins and Grouping

2018-09-12 Thread Bill Bejeck
All, I'd like to start a discussion on KIP-372 for the naming of joins and grouping operations in Kafka Streams. The KIP page can be found here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-372%3A+Naming+Joins+and+Grouping I look forward to feedback and comments. Thanks, Bill