Hi Matthias, all,

Currently, I am not able to complete this KIP. Please accept my apologies
for that.


Cheers,
Jeyhun

On Mon, Jun 11, 2018 at 2:25 AM Matthias J. Sax <matth...@confluent.io>
wrote:

> What is the status of this KIP?
>
> -Matthias
>
>
> On 2/13/18 1:43 PM, Matthias J. Sax wrote:
> > Is there any update for this KIP?
> >
> >
> > -Matthias
> >
> > On 12/4/17 2:08 PM, Matthias J. Sax wrote:
> >> Jeyhun,
> >>
> >> thanks for updating the KIP.
> >>
> >> I am wondering if you intend to add a new class `Produced`? There is
> >> already `org.apache.kafka.streams.kstream.Produced`. So if we want to
> >> add a new class, it must have a different name -- or we might be able to
> >> merge both into one?
> >>
> >> Also, for the KStream overlaods of `through()` and `to()`, can you add
> >> the different behavior using different overloads? It's not clear from
> >> the KIP what the semantics are.
> >>
> >>
> >> -Matthias
> >>
> >> On 11/17/17 3:27 PM, Jeyhun Karimov wrote:
> >>> Hi,
> >>>
> >>> Thanks for your comments. I agree with Matthias partially.
> >>> I think we should relax some requirements related with to() and
> through()
> >>> methods.
> >>> IMHO, Produced class can cover (existing/to be created) topic
> information,
> >>> and which will ease our effort:
> >>>
> >>> KStream.to(Produced topicInfo)
> >>> KStream.through(Produced topicInfo)
> >>>
> >>> This will decrease the number of overloads but we will need to
> deprecate
> >>> the existing to() and through() methods, perhaps.
> >>> I updated the KIP accordingly.
> >>>
> >>>
> >>> Cheers,
> >>> Jeyhun
> >>>
> >>> On Thu, Nov 16, 2017 at 10:21 PM Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
> >>>> @Jan:
> >>>>
> >>>> The `Produced` class was introduced in 1.0 to specify key and valud
> >>>> Serdes (and partitioner) if data is written into a topic.
> >>>>
> >>>> Old API:
> >>>>
> >>>> KStream#to("topic", keySerde, valueSerde);
> >>>>
> >>>> New API:
> >>>>
> >>>> KStream#to("topic", Produced.with(keySerde, valueSerde));
> >>>>
> >>>>
> >>>> This allows to reduce the number of overloads for `to()` (and
> >>>> `through()` that follows the same pattern) -- the second parameter is
> >>>> used to cover all different variations of option parameters users can
> >>>> specify, while we only have 2 overload for `to()` itself.
> >>>>
> >>>> What is still unclear to me it, what you mean by this topic prefix
> >>>> thing? Either a user cares about the topic name and thus, must create
> >>>> and manage it manually. Or the user does not care, and Streams create
> >>>> it. How would this prefix idea fit in here?
> >>>>
> >>>>
> >>>>
> >>>> @Guozhang:
> >>>>
> >>>> My idea was to extend `Produced` with the hint we want to give for
> >>>> creating internal topic and pass a optional `Produced` parameter.
> There
> >>>> are multiple things we can do here:
> >>>>
> >>>> 1) stream.through(null, Produced...).groupBy().aggregate()
> >>>> -> just allow for `null` topic name indicating that Streams should
> >>>> create an internal topic
> >>>>
> >>>> 2) stream.through(Produced...).groupBy().aggregate()
> >>>> -> add one overload taking an mandatory `Produced`
> >>>>
> >>>> We use `Serialized` to picky back the information
> >>>>
> >>>> 3) stream.groupBy(Serialized...).aggregate()
> >>>> and stream.groupByKey(Serialized...).aggregate()
> >>>> -> we don't need new top level overloads
> >>>>
> >>>>
> >>>> There are different trade-offs for those alternatives and maybe there
> >>>> are other ways to change the API. It's just to push the discussion
> further.
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>> On 11/12/17 1:22 PM, Jan Filipiak wrote:
> >>>>> Hi Gouzhang,
> >>>>>
> >>>>> this felt like these questions are supposed to be answered by me.
> >>>>> I do not understand the first one. I don't understand why the user
> >>>>> shouldn't be able to specify a suffix for the topic name.
> >>>>>
> >>>>>  For the third question I am not 100% familiar if the Produced class
> >>>>> came to existence
> >>>>> at all. I remember proposing it somewhere in our redo DSL discussion
> that
> >>>>> I dropped out of later. Finally any call that does:
> >>>>>
> >>>>> 1. create the internal topic
> >>>>> 2. register sink
> >>>>> 3. register source
> >>>>>
> >>>>> will always get the work done. If we have a Produced like class.
> putting
> >>>>> all the parameters
> >>>>> in there make sense. (Partitioner, serde, PartitionHint, internal,
> name
> >>>>> ... )
> >>>>>
> >>>>> Hope this helps?
> >>>>>
> >>>>>
> >>>>> On 10.11.2017 07:54, Guozhang Wang wrote:
> >>>>>> A few clarification questions on the proposal details.
> >>>>>>
> >>>>>> 1. API: although the repartition only happens at the final stateful
> >>>>>> operations like agg / join, the repartition flag info was actually
> >>>> passed
> >>>>>> from an earlier operator like map / groupBy. So what should be the
> new
> >>>>>> API
> >>>>>> look like? For example, if we do
> >>>>>>
> >>>>>> stream.groupBy().through("topic-name", Produced..).aggregate
> >>>>>>
> >>>>>> This would be add a bunch of APIs to GroupedKStream/KTable
> >>>>>>
> >>>>>> 2. Semantics: as Matthias mentioned, today any topics defined in
> >>>>>> "through()" call is considered a user topic, and hence users are
> >>>>>> responsible for managing them, including the topic name. For this
> KIP's
> >>>>>> purpose, though, users would not care about the topic name. I.e. as
> a
> >>>>>> user
> >>>>>> I still want to make it be an internal topic so that I do not need
> to
> >>>>>> worry
> >>>>>> about it at all, but only specify num.partitions.
> >>>>>>
> >>>>>> 3. Details: in Produced we do not have specs for specifying the
> >>>>>> num.partitions or should we repartition or not. So it is still not
> >>>>>> clear to
> >>>>>> me how we would make use of that to achieve what's in the old
> >>>>>> proposal's RepartitionHint class.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Guozhang
> >>>>>>
> >>>>>>
> >>>>>> On Mon, Nov 6, 2017 at 1:21 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>>>>
> >>>>>>> bq. enlarge the score of through()
> >>>>>>>
> >>>>>>> I guess you meant scope.
> >>>>>>>
> >>>>>>> On Mon, Nov 6, 2017 at 1:15 PM, Jeyhun Karimov <
> je.kari...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Sorry for the late reply. I am convinced that we should enlarge
> the
> >>>>>>>> score
> >>>>>>>> of through() (add more overloads) instead of introducing a
> separate
> >>>> set
> >>>>>>> of
> >>>>>>>> overloads to other methods.
> >>>>>>>> I will update the KIP soon based on the discussion and inform.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Jeyhun
> >>>>>>>>
> >>>>>>>> On Mon, Nov 6, 2017 at 9:18 PM Jan Filipiak <
> jan.filip...@trivago.com
> >>>>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Sorry for not beeing 100% up to date.
> >>>>>>>>> Back then we had the discussion that when an operation puts a
> >Sink<
> >>>>>>>>> into the topology, a >Produced<
> >>>>>>>>> parameter is added. This produced parameter could have internal
> or
> >>>>>>>>> external. If internal I think the name would still make
> >>>>>>>>> a great suffix for the topic name
> >>>>>>>>>
> >>>>>>>>> Is this plan still around? Otherwise having the name as suffix is
> >>>>>>>>> probably always good it can help the user quicker to identify hot
> >>>>>>> topics
> >>>>>>>>> that need more
> >>>>>>>>> partitions if he has many of these internal repartitions
> >>>>>>>>>
> >>>>>>>>> Best Jan
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 06.11.2017 20:13, Matthias J. Sax wrote:
> >>>>>>>>>> I absolute agree with what you say. It's not a requirement to
> >>>>>>> specify a
> >>>>>>>>>> topic name -- and this was the idea -- if user does specify a
> name,
> >>>>>>> we
> >>>>>>>>>> treat as is -- if users does not specify a name, Streams create
> an
> >>>>>>>>>> internal topic.
> >>>>>>>>>>
> >>>>>>>>>> The goal of the Jira is to allow a simplified way to control
> >>>>>>>>>> repartitioning (atm, user needs to manually create a topic and
> use
> >>>>>>> via
> >>>>>>>>>> through()).
> >>>>>>>>>>
> >>>>>>>>>> Thus, the idea is to make the topic name parameter of through
> >>>>>>> optional.
> >>>>>>>>>> It's of course just an idea. Happy do have a other API design.
> The
> >>>>>>> goal
> >>>>>>>>>> was, to avoid to many new overloads.
> >>>>>>>>>>
> >>>>>>>>>>>> Could you clarify exactly what you mean by keeping the current
> >>>>>>>>> distinction?
> >>>>>>>>>> Current distinction is: user topics are created manually and
> user
> >>>>>>>>>> specifies the name -- internal topics are created by Kafka
> Streams
> >>>>>>> and
> >>>>>>>>>> an name is generated automatically.
> >>>>>>>>>>
> >>>>>>>>>> -> through("user-topic")
> >>>>>>>>>> -> through(TopicConfig.withNumberOfPartitions(5)) // Streams
> creates
> >>>>>>>> an
> >>>>>>>>>> internal topic
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> -Matthias
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On 11/6/17 6:56 PM, Thomas Becker wrote:
> >>>>>>>>>>> Could you clarify exactly what you mean by keeping the current
> >>>>>>>>> distinction?
> >>>>>>>>>>> Actually, re-reading the KIP and JIRA, it's not clear that
> being
> >>>>>>> able
> >>>>>>>>> to specify a custom name is actually a requirement. If the goal
> is to
> >>>>>>>>> control repartitioning and tune parallelism, maybe we can just
> >>>>>>>>> sidestep
> >>>>>>>>> this issue altogether by removing the ability to set a different
> >>>> name.
> >>>>>>>>>>> On Mon, 2017-11-06 at 16:51 +0100, Matthias J. Sax wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> That's a good point. In current design, we strictly distinguish
> >>>>>>> both.
> >>>>>>>>>>> For example, the reset tools deletes internal topics (starting
> with
> >>>>>>>>>>> prefix `<application.id>-` and ending with either
> `-repartition`
> >>>> or
> >>>>>>>>>>> `-changelog`.
> >>>>>>>>>>>
> >>>>>>>>>>> Thus, from my point of view, it would make sense to keep the
> >>>> current
> >>>>>>>>>>> distinction.
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>> On 11/6/17 4:45 PM, Thomas Becker wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> I think this sounds good as well. It's worth clarifying whether
> >>>>>>> topics
> >>>>>>>>> that are named by the user but created by streams are considered
> >>>>>>>> "internal"
> >>>>>>>>> topics also.
> >>>>>>>>>>> On Sun, 2017-11-05 at 23:02 +0100, Matthias J. Sax wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> My idea was, to relax the requirement for through() that a
> topic
> >>>>>>> must
> >>>>>>>> be
> >>>>>>>>>>> created manually before startup.
> >>>>>>>>>>>
> >>>>>>>>>>> Thus, if no through() call is made, a (internal) topic is
> created
> >>>>>>> the
> >>>>>>>>>>> same way we do it currently.
> >>>>>>>>>>>
> >>>>>>>>>>> If one uses `through(String topicName)` we keep the current
> >>>> behavior
> >>>>>>>> and
> >>>>>>>>>>> require users to create the topic manually.
> >>>>>>>>>>>
> >>>>>>>>>>> The reasoning is as follows: if a user creates a topic
> manually, a
> >>>>>>>> user
> >>>>>>>>>>> can just use it for repartitioning. As the topic is already
> there,
> >>>>>>>> there
> >>>>>>>>>>> is no need to specify any topic configs.
> >>>>>>>>>>>
> >>>>>>>>>>> We add a new `through()` overload (details TBD) that allows to
> >>>>>>> specify
> >>>>>>>>>>> topic configs and Streams create the topic with those configs.
> >>>>>>>>>>>
> >>>>>>>>>>> Reasoning: user don't want to manage topic manually, thus, it's
> >>>>>>> still
> >>>>>>>> an
> >>>>>>>>>>> internal topic and Streams create the topic name automatically
> as
> >>>>>>> for
> >>>>>>>>>>> all other internal topics. However, users gets some more
> control
> >>>>>>> about
> >>>>>>>>>>> topic parameters like number of partitions (we should discuss
> what
> >>>>>>>> other
> >>>>>>>>>>> configs would be useful).
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Does this make sense?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 11/5/17 1:21 AM, Jan Filipiak wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Hi.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Im not 100 % up to date what version 1.0 DSL looks like ATM.
> >>>>>>>>>>> I just would argue that repartitioning should be an own API
> call
> >>>>>>> like
> >>>>>>>>>>> through or something.
> >>>>>>>>>>> One can use through or to already to get this. I would argue
> one
> >>>>>>>> should
> >>>>>>>>>>> look there instead of overloads
> >>>>>>>>>>>
> >>>>>>>>>>> Best Jan
> >>>>>>>>>>>
> >>>>>>>>>>> On 04.11.2017 16:01, Jeyhun Karimov wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Dear community,
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to initiate discussion on KIP-221 [1] based on
> issue
> >>>>>>> [2].
> >>>>>>>>>>> Please feel free to comment.
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>>
> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>> 221%3A+Repartition+Topic+Hints+in+Streams
> >>>>>>>>>>> [2] https://issues.apache.org/jira/browse/KAFKA-6037
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Cheers,
> >>>>>>>>>>> Jeyhun
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> ________________________________
> >>>>>>>>>>>
> >>>>>>>>>>> This email and any attachments may contain confidential and
> >>>>>>> privileged
> >>>>>>>>> material for the sole use of the intended recipient. Any review,
> >>>>>>> copying,
> >>>>>>>>> or distribution of this email (or any attachments) by others is
> >>>>>>>> prohibited.
> >>>>>>>>> If you are not the intended recipient, please contact the sender
> >>>>>>>>> immediately and permanently delete this email and any
> attachments. No
> >>>>>>>>> employee or agent of TiVo Inc. is authorized to conclude any
> binding
> >>>>>>>>> agreement on behalf of TiVo Inc. by email. Binding agreements
> with
> >>>>>>>>> TiVo
> >>>>>>>>> Inc. may only be made by a signed written agreement.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> ________________________________
> >>>>>>>>>>>
> >>>>>>>>>>> This email and any attachments may contain confidential and
> >>>>>>> privileged
> >>>>>>>>> material for the sole use of the intended recipient. Any review,
> >>>>>>> copying,
> >>>>>>>>> or distribution of this email (or any attachments) by others is
> >>>>>>>> prohibited.
> >>>>>>>>> If you are not the intended recipient, please contact the sender
> >>>>>>>>> immediately and permanently delete this email and any
> attachments. No
> >>>>>>>>> employee or agent of TiVo Inc. is authorized to conclude any
> binding
> >>>>>>>>> agreement on behalf of TiVo Inc. by email. Binding agreements
> with
> >>>>>>>>> TiVo
> >>>>>>>>> Inc. may only be made by a signed written agreement.
> >>>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>
>

Reply via email to