[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049528#comment-16049528 ]

Matthias J. Sax commented on KAFKA-5245:

I just did this :)

> KStream builder should capture serdes
> -------------------------------------
>
>                 Key: KAFKA-5245
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5245
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.2.0, 0.10.2.1
>            Reporter: Yeva Byzek
>            Assignee: anugrah
>            Priority: Minor
>              Labels: needs-kip
>
> Even if one specifies a serdes in `builder.stream`, later a call to
> `groupByKey` may require the serdes again if it differs from the configured
> streams app serdes. The preferred behavior is that if no serdes is provided
> to `groupByKey`, it should use whatever was provided in `builder.stream` and
> not what was in the app.
> From the current docs:
> “When to set explicit serdes: Variants of groupByKey exist to override the
> configured default serdes of your application, which you must do if the key
> and/or value types of the resulting KGroupedStream do not match the
> configured default serdes.”

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049523#comment-16049523 ]

Evgeny Veretennikov commented on KAFKA-5245:

[~guozhang], thanks for the explanation! I guess we should remove the `beginner` and `newbie` tags from this ticket then...
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049515#comment-16049515 ]

Guozhang Wang commented on KAFKA-5245:

Anyway, [~evis], I have added you to the contributor list; feel free to pick up tickets going forward. Note that there is a list of tickets tagged with `newbie` and `newbie++` whose scope and proposed solutions may be better suited to getting started.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049511#comment-16049511 ]

Guozhang Wang commented on KAFKA-5245:

Here is my take on the serde issues:

We currently follow the pattern that "there is a global default serde specified in config, and whenever users want to override it, they need to do so explicitly". The suggested pattern is "inherit the overridden serde from the source, i.e. builder.stream/table, in the downstream operators". I'm not sure which one is more intuitive, given that whenever you have operators in between, like "map", the serdes cannot be inherited any more and users are still required to specify them. So by changing this pattern we cannot actually remove the overloads that require serdes, and users need a deeper understanding of "when do I need to override the serde, and when do I not?". This seems trickier for normal users than the somewhat-dumb but easy-to-understand pattern we have now.

We also considered another way of handling serdes a long time ago: inferring data types with registered serdes so that users would "NEVER" need to specify the serde along with the topology. Unfortunately, due to Java type erasure this is not 100 percent feasible: https://cwiki.apache.org/confluence/display/KAFKA/Discussion%3A+Serialization+and+Deserialization+Options

That being said, for the specific issue that Yeva raised, I think that for {{groupByKey}} and {{selectKey}} it is indeed counter-intuitive to even require a serde, only because we may need to repartition since the key has changed, and hence need the serde to read from / write to Kafka. That is, we are sort of exposing in the interface the internal implementation of the DSL that we want to hide from users.

I do not have a better solution off the top of my head for this specific issue since, again, we cannot always inherit the serde from the source stream if there are operators in between that change the key / value types, like {{map}}.
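
The type-erasure limitation mentioned in Guozhang Wang's comment can be seen in a few lines of plain Java. This is only a sketch: `ErasureDemo` and its class-keyed "registry" are hypothetical and are not part of Kafka Streams. It shows that two differently parameterized lists share one runtime class, so a lookup keyed by runtime class cannot pick a different serde for each.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration; nothing here is Kafka Streams API.
final class ErasureDemo {

    // Generic type arguments are erased, so both lists report the same runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Long> longs = new ArrayList<>();
        return strings.getClass() == longs.getClass();
    }

    // Hypothetical serde registry keyed by runtime class. It can hold only one
    // entry for ArrayList, so it cannot tell List<String> from List<Long>.
    static String lookupSerdeName(Object value) {
        Map<Class<?>, String> registry = new HashMap<>();
        registry.put(String.class, "StringSerde");
        registry.put(Long.class, "LongSerde");
        registry.put(ArrayList.class, "ListSerde<?>");
        return registry.get(value.getClass());
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                       // true
        System.out.println(lookupSerdeName(new ArrayList<String>())); // ListSerde<?>
        System.out.println(lookupSerdeName(new ArrayList<Long>()));   // ListSerde<?>
    }
}
```

Simple types such as `String` or `Long` can still be resolved this way, which is presumably why the comment calls inference "not 100 percent feasible" rather than impossible.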
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16049353#comment-16049353 ]

Matthias J. Sax commented on KAFKA-5245:

That's a valid point you raise... Not sure about a KIP -- it might be better to do one IMHO. /cc [~guozhang] WDYT?
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048738#comment-16048738 ]

Evgeny Veretennikov commented on KAFKA-5245:

There are actually a lot of KStream methods that require serdes again, not only groupByKey(). For example, there are print(), through(), to(), etc. We should pass the serdes from builder.stream() to all such methods, am I right?

Also, the behaviour of methods like groupByKey() will change once this ticket is resolved. Clients could break if they really need to use the default serdes. Do we need a KIP for this ticket?
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16048119#comment-16048119 ]

Matthias J. Sax commented on KAFKA-5245:

[~guozhang] Can you add [~evis] to the contributor list so he can assign tickets to himself? Thx. If [~anukin] does not respond, we should reassign the ticket.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047933#comment-16047933 ]

Evgeny Veretennikov commented on KAFKA-5245:

Matthias, thanks for the explanation! How can I self-assign this ticket? It seems the current assignee, [~anukin], isn't active here...
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16047608#comment-16047608 ]

Michal Borowiecki commented on KAFKA-5245:

Just wanted to say it's great to see there's a ticket for this :-) Always found it counter-intuitive that the default serdes are taken from config instead of upstream in these cases.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16046796#comment-16046796 ]

Matthias J. Sax commented on KAFKA-5245:

But {{KStream#groupByKey()}} uses the serdes from {{StreamsConfig}} -- this ticket is about changing this behavior and "carrying" serde information downstream. If you set a serde on a source, it's obvious that you can reuse it as long as the type does not change -- however, Streams does not exploit this atm.
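
The difference between the current behavior and the "carry the serde downstream" idea in Matthias J. Sax's comment can be sketched with a toy model. Everything below is an assumption made for illustration: `ToyStream`, its methods, and the serde names are hypothetical stand-ins, not the Kafka Streams API or its implementation. The model only tracks which serde name a parameterless `groupByKey()` would end up with.

```java
// Toy model only: ToyStream and its methods are hypothetical, not Kafka Streams API.
final class ToyStream {
    private final String upstreamSerde; // serde captured at the source; null once unknown
    private final String configDefault; // stands in for the default serde in StreamsConfig

    private ToyStream(String upstreamSerde, String configDefault) {
        this.upstreamSerde = upstreamSerde;
        this.configDefault = configDefault;
    }

    // Models builder.stream(...) called with an explicit serde.
    static ToyStream source(String explicitSerde, String configDefault) {
        return new ToyStream(explicitSerde, configDefault);
    }

    // A step that keeps key and value types (e.g. a filter) can keep the serde.
    ToyStream filter() {
        return this;
    }

    // A step that may change types (e.g. map) has to forget the serde.
    ToyStream map() {
        return new ToyStream(null, configDefault);
    }

    // Behavior described in the thread: without arguments, the config default is used.
    String currentGroupByKeySerde() {
        return configDefault;
    }

    // Behavior the ticket asks for: prefer the upstream serde, else use the default.
    String proposedGroupByKeySerde() {
        return upstreamSerde != null ? upstreamSerde : configDefault;
    }

    public static void main(String[] args) {
        ToyStream stream = source("LongSerde", "StringSerde");
        System.out.println(stream.currentGroupByKeySerde());           // StringSerde
        System.out.println(stream.filter().proposedGroupByKeySerde()); // LongSerde
        System.out.println(stream.map().proposedGroupByKeySerde());    // StringSerde
    }
}
```

The last line of `main` shows the caveat: once a type-changing step sits between the source and `groupByKey()`, the captured serde is gone and the user has to supply one again.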
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16045936#comment-16045936 ]

Evgeny Veretennikov commented on KAFKA-5245:

There is a KStream.groupByKey() method that doesn't require serdes. I guess this ticket should just be closed.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16017431#comment-16017431 ]

anugrah commented on KAFKA-5245:

Sure, I will start on this.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014328#comment-16014328 ]

Matthias J. Sax commented on KAFKA-5245:

You can just assign the JIRA to yourself and get started :) This might help too: https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes

If you have any further questions, just let us know.
[jira] [Commented] (KAFKA-5245) KStream builder should capture serdes
[ https://issues.apache.org/jira/browse/KAFKA-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16014310#comment-16014310 ]

anugrah commented on KAFKA-5245:

Hey, I would like to work on this. Any pointers on how I should proceed?