No worries. It's just good to know. It seems that some other people are
interested in driving this further. So we will just "reassign" it to them.

Thanks for letting us know.


-Matthias

On 6/20/18 2:51 PM, Jeyhun Karimov wrote:
> Hi Matthias, all,
> 
> Currently, I am not able to complete this KIP. Please accept my
> apologies for that. 
> 
> 
> Cheers,
> Jeyhun
> 
> On Mon, Jun 11, 2018 at 2:25 AM Matthias J. Sax <matth...@confluent.io> wrote:
> 
>     What is the status of this KIP?
> 
>     -Matthias
> 
> 
>     On 2/13/18 1:43 PM, Matthias J. Sax wrote:
>     > Is there any update for this KIP?
>     >
>     >
>     > -Matthias
>     >
>     > On 12/4/17 2:08 PM, Matthias J. Sax wrote:
>     >> Jeyhun,
>     >>
>     >> thanks for updating the KIP.
>     >>
>     >> I am wondering if you intend to add a new class `Produced`? There is
>     >> already `org.apache.kafka.streams.kstream.Produced`. So if we want to
>     >> add a new class, it must have a different name -- or we might be able to
>     >> merge both into one?
>     >>
>     >> Also, for the KStream overloads of `through()` and `to()`, can you add
>     >> to the KIP how the behavior differs between the overloads? It's not
>     >> clear from the KIP what the semantics are.
>     >>
>     >>
>     >> -Matthias
>     >>
>     >> On 11/17/17 3:27 PM, Jeyhun Karimov wrote:
>     >>> Hi,
>     >>>
>     >>> Thanks for your comments. I agree with Matthias partially.
>     >>> I think we should relax some requirements related to the to() and
>     >>> through() methods.
>     >>> IMHO, the Produced class can cover the topic information (for existing
>     >>> topics or topics to be created), which will ease our effort:
>     >>>
>     >>> KStream.to(Produced topicInfo)
>     >>> KStream.through(Produced topicInfo)
>     >>>
>     >>> This will decrease the number of overloads, but we will perhaps need to
>     >>> deprecate the existing to() and through() methods.
>     >>> I updated the KIP accordingly.
>     >>>
>     >>>
>     >>> Cheers,
>     >>> Jeyhun
>     >>>
>     >>> On Thu, Nov 16, 2017 at 10:21 PM Matthias J. Sax <matth...@confluent.io> wrote:
>     >>>
>     >>>> @Jan:
>     >>>>
>     >>>> The `Produced` class was introduced in 1.0 to specify key and value
>     >>>> Serdes (and partitioner) if data is written into a topic.
>     >>>>
>     >>>> Old API:
>     >>>>
>     >>>> KStream#to("topic", keySerde, valueSerde);
>     >>>>
>     >>>> New API:
>     >>>>
>     >>>> KStream#to("topic", Produced.with(keySerde, valueSerde));
>     >>>>
>     >>>>
>     >>>> This allows us to reduce the number of overloads for `to()` (and
>     >>>> `through()`, which follows the same pattern) -- the second parameter is
>     >>>> used to cover all the different variations of optional parameters users
>     >>>> can specify, while we only have two overloads for `to()` itself.
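Illustration added in editing, not part of the original message: a minimal, self-contained sketch of the options-object pattern described above, in which one parameter object absorbs every optional setting so that the method itself needs only two overloads. `ProducedSketch` and `StreamSketch` are hypothetical stand-ins, not the real Kafka Streams classes, and Serdes and the partitioner are modeled as plain strings purely to keep the example free of dependencies.

```java
// Sketch only: ProducedSketch and StreamSketch are hypothetical stand-ins,
// not the real org.apache.kafka.streams.kstream classes. Serdes and the
// partitioner are modeled as plain strings to keep the example self-contained.
final class ProducedSketch {
    final String keySerde;     // null means "use the configured default"
    final String valueSerde;   // null means "use the configured default"
    final String partitioner;  // null means "use the default partitioner"

    private ProducedSketch(String keySerde, String valueSerde, String partitioner) {
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
        this.partitioner = partitioner;
    }

    // Entry point mirroring the shape of Produced.with(keySerde, valueSerde).
    static ProducedSketch with(String keySerde, String valueSerde) {
        return new ProducedSketch(keySerde, valueSerde, null);
    }

    // Returns a copy, so each optional setting composes without a new overload.
    ProducedSketch withPartitioner(String partitioner) {
        return new ProducedSketch(keySerde, valueSerde, partitioner);
    }
}

final class StreamSketch {
    private StreamSketch() {}

    // Overload 1: topic only, every option left at its default.
    static String to(String topic) {
        return to(topic, ProducedSketch.with(null, null));
    }

    // Overload 2: topic plus one options object carrying all optional settings.
    // Returns a description of the sink instead of writing anywhere.
    static String to(String topic, ProducedSketch produced) {
        return topic + "[key=" + produced.keySerde
            + ", value=" + produced.valueSerde
            + ", partitioner=" + produced.partitioner + "]";
    }
}
```

Under these assumptions, a call such as `StreamSketch.to("topic", ProducedSketch.with("string", "long").withPartitioner("custom"))` adds a setting without adding a third overload of `to()`.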
>     >>>>
>     >>>> What is still unclear to me is what you mean by this topic prefix
>     >>>> thing. Either a user cares about the topic name and thus must create
>     >>>> and manage it manually, or the user does not care and Streams creates
>     >>>> it. How would this prefix idea fit in here?
>     >>>>
>     >>>>
>     >>>>
>     >>>> @Guozhang:
>     >>>>
>     >>>> My idea was to extend `Produced` with the hint we want to give for
>     >>>> creating the internal topic and pass an optional `Produced` parameter.
>     >>>> There are multiple things we can do here:
>     >>>>
>     >>>> 1) stream.through(null, Produced...).groupBy().aggregate()
>     >>>> -> just allow for `null` topic name indicating that Streams should
>     >>>> create an internal topic
>     >>>>
>     >>>> 2) stream.through(Produced...).groupBy().aggregate()
>     >>>> -> add one overload taking a mandatory `Produced`
>     >>>>
>     >>>> We use `Serialized` to piggyback the information
>     >>>>
>     >>>> 3) stream.groupBy(Serialized...).aggregate()
>     >>>> and stream.groupByKey(Serialized...).aggregate()
>     >>>> -> we don't need new top level overloads
>     >>>>
>     >>>>
>     >>>> There are different trade-offs for those alternatives and maybe
>     there
>     >>>> are other ways to change the API. It's just to push the
>     discussion further.
>     >>>>
>     >>>>
>     >>>> -Matthias
>     >>>>
>     >>>> On 11/12/17 1:22 PM, Jan Filipiak wrote:
>     >>>>> Hi Guozhang,
>     >>>>>
>     >>>>> this felt like these questions are supposed to be answered by me.
>     >>>>> I do not understand the first one. I don't understand why the user
>     >>>>> shouldn't be able to specify a suffix for the topic name.
>     >>>>>
>     >>>>> For the third question, I am not 100% sure whether the Produced class
>     >>>>> came into existence at all. I remember proposing it somewhere in our
>     >>>>> redo DSL discussion, which I dropped out of later. Finally, any call
>     >>>>> that does:
>     >>>>>
>     >>>>> 1. create the internal topic
>     >>>>> 2. register sink
>     >>>>> 3. register source
>     >>>>>
>     >>>>> will always get the work done. If we have a Produced-like class,
>     >>>>> putting all the parameters in there makes sense (Partitioner, serde,
>     >>>>> PartitionHint, internal, name ...).
>     >>>>>
>     >>>>> Hope this helps?
>     >>>>>
>     >>>>>
>     >>>>> On 10.11.2017 07:54, Guozhang Wang wrote:
>     >>>>>> A few clarification questions on the proposal details.
>     >>>>>>
>     >>>>>> 1. API: although the repartition only happens at the final stateful
>     >>>>>> operations like agg / join, the repartition flag info was actually
>     >>>>>> passed from an earlier operator like map / groupBy. So what should
>     >>>>>> the new API look like? For example, if we do
>     >>>>>>
>     >>>>>> stream.groupBy().through("topic-name", Produced..).aggregate
>     >>>>>>
>     >>>>>> This would add a bunch of APIs to GroupedKStream/KTable
>     >>>>>>
>     >>>>>> 2. Semantics: as Matthias mentioned, today any topic defined in a
>     >>>>>> "through()" call is considered a user topic, and hence users are
>     >>>>>> responsible for managing it, including the topic name. For this KIP's
>     >>>>>> purpose, though, users would not care about the topic name. I.e., as a
>     >>>>>> user I still want it to be an internal topic so that I do not need to
>     >>>>>> worry about it at all, but only specify num.partitions.
>     >>>>>>
>     >>>>>> 3. Details: in Produced we do not have specs for specifying the
>     >>>>>> num.partitions or whether we should repartition. So it is still not
>     >>>>>> clear to me how we would make use of it to achieve what's in the old
>     >>>>>> proposal's RepartitionHint class.
>     >>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>> Guozhang
>     >>>>>>
>     >>>>>>
>     >>>>>> On Mon, Nov 6, 2017 at 1:21 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>     >>>>>>
>     >>>>>>> bq. enlarge the score of through()
>     >>>>>>>
>     >>>>>>> I guess you meant scope.
>     >>>>>>>
>     >>>>>>> On Mon, Nov 6, 2017 at 1:15 PM, Jeyhun Karimov <je.kari...@gmail.com> wrote:
>     >>>>>>>
>     >>>>>>>> Hi,
>     >>>>>>>>
>     >>>>>>>> Sorry for the late reply. I am convinced that we should enlarge the
>     >>>>>>>> score of through() (add more overloads) instead of introducing a
>     >>>>>>>> separate set of overloads to other methods.
>     >>>>>>>> I will update the KIP soon based on the discussion and inform.
>     >>>>>>>>
>     >>>>>>>>
>     >>>>>>>> Cheers,
>     >>>>>>>> Jeyhun
>     >>>>>>>>
>     >>>>>>>> On Mon, Nov 6, 2017 at 9:18 PM Jan Filipiak <jan.filip...@trivago.com> wrote:
>     >>>>>>>>
>     >>>>>>>>> Sorry for not being 100% up to date.
>     >>>>>>>>> Back then we had the discussion that when an operation puts a >Sink<
>     >>>>>>>>> into the topology, a >Produced< parameter is added. This Produced
>     >>>>>>>>> parameter could be marked internal or external. If internal, I think
>     >>>>>>>>> the name would still make a great suffix for the topic name.
>     >>>>>>>>>
>     >>>>>>>>> Is this plan still around? Otherwise, having the name as a suffix is
>     >>>>>>>>> probably always good: it can help users more quickly identify hot
>     >>>>>>>>> topics that need more partitions if they have many of these internal
>     >>>>>>>>> repartitions.
>     >>>>>>>>>
>     >>>>>>>>> Best Jan
>     >>>>>>>>>
>     >>>>>>>>>
>     >>>>>>>>> On 06.11.2017 20:13, Matthias J. Sax wrote:
>     >>>>>>>>>> I absolutely agree with what you say. It's not a requirement to
>     >>>>>>>>>> specify a topic name -- and this was the idea -- if the user does
>     >>>>>>>>>> specify a name, we treat it as is -- if the user does not specify a
>     >>>>>>>>>> name, Streams creates an internal topic.
>     >>>>>>>>>>
>     >>>>>>>>>> The goal of the Jira is to allow a simplified way to control
>     >>>>>>>>>> repartitioning (atm, the user needs to manually create a topic and
>     >>>>>>>>>> use it via through()).
>     >>>>>>>>>>
>     >>>>>>>>>> Thus, the idea is to make the topic name parameter of through
>     >>>>>>>>>> optional. It's of course just an idea. Happy to have another API
>     >>>>>>>>>> design. The goal was to avoid too many new overloads.
>     >>>>>>>>>>
>     >>>>>>>>>>>> Could you clarify exactly what you mean by keeping the current
>     >>>>>>>>>>>> distinction?
>     >>>>>>>>>> Current distinction is: user topics are created manually and the user
>     >>>>>>>>>> specifies the name -- internal topics are created by Kafka Streams
>     >>>>>>>>>> and a name is generated automatically.
>     >>>>>>>>>>
>     >>>>>>>>>> -> through("user-topic")
>     >>>>>>>>>> -> through(TopicConfig.withNumberOfPartitions(5)) // Streams creates an internal topic
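Illustration added in editing, not part of the original message: a rough sketch of the two cases distinguished above, where either the user names a topic they manage themselves or the library generates a name and owns the topic. Every name here (`ThroughTargetSketch`, `userTopic`, `internalTopic`, the `operatorIndex` parameter) is hypothetical, and the generated name format is an assumption chosen only to follow the `<application.id>-...-repartition` convention mentioned elsewhere in the thread.

```java
// Sketch only: nothing here is Kafka Streams API. It models the proposed
// distinction between a user-managed topic and a library-managed one.
final class ThroughTargetSketch {
    final String topicName;   // name the topic is addressed by
    final boolean internal;   // true if the library would create and own the topic
    final int numPartitions;  // -1 means "whatever the pre-existing topic has"

    private ThroughTargetSketch(String topicName, boolean internal, int numPartitions) {
        this.topicName = topicName;
        this.internal = internal;
        this.numPartitions = numPartitions;
    }

    // Case 1: the user created the topic manually and supplies its name.
    static ThroughTargetSketch userTopic(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("user topic name must be non-empty");
        }
        return new ThroughTargetSketch(name, false, -1);
    }

    // Case 2: the user supplies only a partition count; the name is generated.
    // The "-through-<index>-repartition" format is an assumption for illustration.
    static ThroughTargetSketch internalTopic(String applicationId, int operatorIndex, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        String generated = applicationId + "-through-" + operatorIndex + "-repartition";
        return new ThroughTargetSketch(generated, true, numPartitions);
    }
}
```

In this sketch, `ThroughTargetSketch.userTopic("user-topic")` corresponds to the first line above and `ThroughTargetSketch.internalTopic("my-app", 7, 5)` to the second; the application id and operator index are made-up values.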
>     >>>>>>>>>>
>     >>>>>>>>>>
>     >>>>>>>>>> -Matthias
>     >>>>>>>>>>
>     >>>>>>>>>>
>     >>>>>>>>>> On 11/6/17 6:56 PM, Thomas Becker wrote:
>     >>>>>>>>>>> Could you clarify exactly what you mean by keeping the current
>     >>>>>>>>>>> distinction?
>     >>>>>>>>>>> Actually, re-reading the KIP and JIRA, it's not clear that being able
>     >>>>>>>>>>> to specify a custom name is actually a requirement. If the goal is to
>     >>>>>>>>>>> control repartitioning and tune parallelism, maybe we can just
>     >>>>>>>>>>> sidestep this issue altogether by removing the ability to set a
>     >>>>>>>>>>> different name.
>     >>>>>>>>>>> On Mon, 2017-11-06 at 16:51 +0100, Matthias J. Sax wrote:
>     >>>>>>>>>>>
>     >>>>>>>>>>> That's a good point. In the current design, we strictly distinguish
>     >>>>>>>>>>> both. For example, the reset tool deletes internal topics (those
>     >>>>>>>>>>> starting with the prefix `<application.id>-` and ending with either
>     >>>>>>>>>>> `-repartition` or `-changelog`).
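Illustration added in editing, not part of the original message: a minimal sketch of the naming convention just described, under which a topic counts as internal when it starts with the application id followed by a dash and ends with `-repartition` or `-changelog`. `InternalTopicNames` and `isInternal` are hypothetical names; this is not the reset tool's actual code, only the rule as stated in the message.

```java
// Sketch only: classifies topic names by the convention described in the
// thread. Not taken from the Kafka Streams reset tool.
final class InternalTopicNames {
    private InternalTopicNames() {}

    // True if the topic starts with "<applicationId>-" and ends with
    // "-repartition" or "-changelog".
    static boolean isInternal(String applicationId, String topic) {
        String prefix = applicationId + "-";
        return topic.startsWith(prefix)
            && (topic.endsWith("-repartition") || topic.endsWith("-changelog"));
    }
}
```

With this rule, a manually created topic such as `user-topic` is never classified as internal for application id `my-app`, which is why keeping user and internal topics distinct matters to any tool that deletes by name.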
>     >>>>>>>>>>>
>     >>>>>>>>>>> Thus, from my point of view, it would make sense to keep the
>     >>>>>>>>>>> current distinction.
>     >>>>>>>>>>>
>     >>>>>>>>>>> -Matthias
>     >>>>>>>>>>>
>     >>>>>>>>>>> On 11/6/17 4:45 PM, Thomas Becker wrote:
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> I think this sounds good as well. It's worth clarifying whether
>     >>>>>>>>>>> topics that are named by the user but created by streams are
>     >>>>>>>>>>> considered "internal" topics also.
>     >>>>>>>>>>> On Sun, 2017-11-05 at 23:02 +0100, Matthias J. Sax wrote:
>     >>>>>>>>>>>
>     >>>>>>>>>>> My idea was to relax the requirement for through() that a topic
>     >>>>>>>>>>> must be created manually before startup.
>     >>>>>>>>>>>
>     >>>>>>>>>>> Thus, if no through() call is made, an (internal) topic is created
>     >>>>>>>>>>> the same way we do it currently.
>     >>>>>>>>>>>
>     >>>>>>>>>>> If one uses `through(String topicName)` we keep the current
>     >>>>>>>>>>> behavior and require users to create the topic manually.
>     >>>>>>>>>>>
>     >>>>>>>>>>> The reasoning is as follows: if a user creates a topic manually, the
>     >>>>>>>>>>> user can just use it for repartitioning. As the topic is already
>     >>>>>>>>>>> there, there is no need to specify any topic configs.
>     >>>>>>>>>>>
>     >>>>>>>>>>> We add a new `through()` overload (details TBD) that allows users to
>     >>>>>>>>>>> specify topic configs, and Streams creates the topic with those
>     >>>>>>>>>>> configs.
>     >>>>>>>>>>>
>     >>>>>>>>>>> Reasoning: users don't want to manage the topic manually; thus, it's
>     >>>>>>>>>>> still an internal topic and Streams creates the topic name
>     >>>>>>>>>>> automatically, as for all other internal topics. However, users get
>     >>>>>>>>>>> some more control over topic parameters like the number of
>     >>>>>>>>>>> partitions (we should discuss what other configs would be useful).
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> Does this make sense?
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> -Matthias
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> On 11/5/17 1:21 AM, Jan Filipiak wrote:
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> Hi.
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> I'm not 100% up to date on what the version 1.0 DSL looks like ATM.
>     >>>>>>>>>>> I would just argue that repartitioning should be its own API call,
>     >>>>>>>>>>> like through or something.
>     >>>>>>>>>>> One can already use through or to to get this. I would argue one
>     >>>>>>>>>>> should look there instead of at overloads.
>     >>>>>>>>>>>
>     >>>>>>>>>>> Best Jan
>     >>>>>>>>>>>
>     >>>>>>>>>>> On 04.11.2017 16:01, Jeyhun Karimov wrote:
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> Dear community,
>     >>>>>>>>>>>
>     >>>>>>>>>>> I would like to initiate discussion on KIP-221 [1] based on issue [2].
>     >>>>>>>>>>> Please feel free to comment.
>     >>>>>>>>>>>
>     >>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Repartition+Topic+Hints+in+Streams
>     >>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-6037
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> Cheers,
>     >>>>>>>>>>> Jeyhun
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>> ________________________________
>     >>>>>>>>>>>
>     >>>>>>>>>>> This email and any attachments may contain confidential and
>     >>>>>>> privileged
>     >>>>>>>>> material for the sole use of the intended recipient. Any
>     review,
>     >>>>>>> copying,
>     >>>>>>>>> or distribution of this email (or any attachments) by
>     others is
>     >>>>>>>> prohibited.
>     >>>>>>>>> If you are not the intended recipient, please contact the
>     sender
>     >>>>>>>>> immediately and permanently delete this email and any
>     attachments. No
>     >>>>>>>>> employee or agent of TiVo Inc. is authorized to conclude
>     any binding
>     >>>>>>>>> agreement on behalf of TiVo Inc. by email. Binding
>     agreements with
>     >>>>>>>>> TiVo
>     >>>>>>>>> Inc. may only be made by a signed written agreement.
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>>>
>     >>>>>>>>>
>     >>>>>>
>     >>>>>>
>     >>>>>
>     >>>>
>     >>>>
>     >>>
>     >>
>     >
> 
